June 17, 2014

Flutura's 22 Non-Statistical Questions for a Statistician!

Flutura has always believed that when the world of business collides with the world of math, magic unfolds. The collision also presents a unique challenge: bridging the semantic language gap between business and math. Modelling complex business outcomes using math requires an interdisciplinary team of business folks, data folks and math folks. In such teams the business folks are often at a loss, because a language chasm exists. Math folks love their "geek speak" (Tanimoto coefficient, chi-square, odds ratio), while business folks are focussed on impactful outcomes (mean time between failures, next best action, etc.).
Having been caught in between, we at Flutura have been obsessed with the question: how do we bridge the world of business to the world of math?

Our data scientists have come up with an ultra-specific checklist of 22 questions to reduce the friction which exists, and we would like to share it with the world. So here they come, in no particular order …

1.       Which business outcome are we attempting to model and predict?
·         For example: mean time between failures (MTBF) of an asset, next best action (NBA)
2.       What surgical actions can we drive once we are able to predict the outcome?
·         For example: preventive replacement of an asset, stocking up on spares
3.       What is the impact of these actions?
·         For example: reduced downtime, minimized risk
4.       What is the economic impact of a correct prediction?
·         For example: the cost of avoided downtime, translated into $
5.       What is the economic impact of a wrong prediction (e.g. a false positive)?
·         For example: $ spent replacing a healthy asset
6.       What is the non-economic impact of a wrong prediction?
·         For example: customer experience gets compromised
7.       Is not predicting at all a bearable business option?
·         For example: in some industries gut-feel accuracy is still a workable solution (but these are very rare)
8.       Is the business phenomenon we are trying to predict modellable using the data we have?
·         For example: ambient temperature may not be instrumented, so it cannot be used to model downtime
9.       Is there enough breadth available in the data to explain the predicted behaviour?
·         For example: certain asset attributes, like tenure, may not be available
10.   Is there enough depth available in the data we are using?
·         For example: 2, 3 or 4 years of history
11.   Are there blind spots in causal data?
·         For example: ambient context, human context, machine context?
12.   Is the past representative of the future?
·         For example: a new design may invalidate an asset's historical data
13.   Are we modelling black swan events?
·         For example: black swan events are rare and difficult to predict
14.   Do rhythms and patterns exist in historical data which are correlated with the outcome?
·         For example: are there frequent sequences of events prior to an asset breaking down?
15.   Was the modelling an armchair exercise or did the modeller soak in the business process?
·         For example: modellers who are in intimate contact with the field can model an asset's behaviour better
16.   Are we focussing only on signals which reinforce our world view?
·         For example: people typically have cognitive biases
17.   Which real world behaviours are encoded in vectors?
·         For example: volatility, velocity, dispersion, ranks
18.   Does the statistical model articulate a range of possible business outcomes?
·         For example: best-case MTBF = 768 days, worst-case MTBF = 628 days
19.   Does the statistical model articulate the realistic outcome?
·         For example: realistic MTBF = 680 days
20.   Are there weak signals which, if triangulated, could become a strong signal?
·         For example: combining vibrations + experience of the maintenance engineer + asset age
21.   Are we mistaking correlation for causality?
·         For example: vibration frequency may be merely correlated with failure, whereas the maintenance engineer could be causal
22.   Have we polled multiple models to see if 2 models reinforce the same outcome?
·         For example: does logistic regression reinforce the outcome received from a decision tree?
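
For the math folks in the room, questions 4, 5 and 22 can be turned into a few lines of arithmetic. The snippet below is only a rough sketch in Python, not a Flutura method: the function names, the counts and the dollar figures are all made up for illustration, and a real exercise would plug in numbers from your own business and models.

```python
def net_economic_impact(true_pos, false_pos, false_neg,
                        saving_per_catch, cost_per_false_alarm, cost_per_miss):
    """Questions 4 and 5: translate prediction outcomes into $.

    All arguments are hypothetical inputs supplied by the business:
    true_pos  = failures correctly predicted (downtime avoided)
    false_pos = healthy assets wrongly flagged (needless replacement)
    false_neg = failures the model missed (unplanned downtime)
    """
    gains = true_pos * saving_per_catch
    losses = false_pos * cost_per_false_alarm + false_neg * cost_per_miss
    return gains - losses


def agreement_rate(preds_a, preds_b):
    """Question 22: fraction of cases on which two models agree."""
    if len(preds_a) != len(preds_b):
        raise ValueError("both models must score the same cases")
    if not preds_a:
        raise ValueError("need at least one prediction to compare")
    matches = sum(1 for a, b in zip(preds_a, preds_b) if a == b)
    return matches / len(preds_a)


# Illustrative numbers only (assumed, not real data).
impact = net_economic_impact(
    true_pos=40, false_pos=10, false_neg=5,
    saving_per_catch=1000.0, cost_per_false_alarm=200.0, cost_per_miss=1500.0,
)

# 1 = "asset will fail", 0 = "asset is healthy", as predicted by two
# hypothetical models (say, a logistic regression and a decision tree).
logistic_preds = [1, 0, 1, 1]
tree_preds = [1, 0, 0, 1]
agreement = agreement_rate(logistic_preds, tree_preds)

print(f"Net economic impact: ${impact:,.0f}")
print(f"Model agreement: {agreement:.0%}")
```

A negative net impact, or two models that rarely agree, is a hint that the conversation between business folks and math folks needs to go back to the earlier questions.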

So go ahead, ask these 22 questions of the models you have in place. We hope it minimizes the geek vs business chasm and fosters a shared meaning of the world! Please do share your experiences with us; there is a beer on Flutura for the most insightful feedback :)