April 13, 2012

Insurance Big Data - Harvesting causal predictors using call center text mining

While teasing out predictors of non-renewal of an insurance policy, we look for clues to policy surrender in a subset of call centre transcripts: a subsample of 90 days of conversations in which an outbound CSR records the key points of a policy holder's reasons for not subscribing. Since the volume of conversations runs into the thousands, it is difficult to review them manually. These conversations can instead be fed through an unstructured text mining process which can extract the top 10 themes. For example, we can track the frequency of occurrence of certain “WATCH LIST” keywords, where a sudden increase in the frequency of keywords or themes like “POOR SERVICE” or “PREMIUM AMOUNT” can signal a subscriber's intent not to renew the policy.
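The watch-list idea can be sketched in a few lines of Python. This is only a minimal illustration, not the process used in the exercise: the two phrases come from the text above, while the function names, the thresholds (`ratio`, `min_count`) and the sample CSR notes are all assumptions made up for the example.

```python
from collections import Counter

# Watch-list phrases from the text above, lower-cased for matching.
WATCH_LIST = ["poor service", "premium amount"]

def watch_counts(transcripts):
    """Count how many transcripts mention each watch-list phrase."""
    counts = Counter()
    for text in transcripts:
        lowered = text.lower()
        for phrase in WATCH_LIST:
            if phrase in lowered:
                counts[phrase] += 1
    return counts

def spikes(previous, current, ratio=2.0, min_count=3):
    """Return phrases whose count grew by at least `ratio` between two periods.

    `ratio` and `min_count` are hypothetical thresholds, not values from the text.
    """
    flagged = []
    for phrase in WATCH_LIST:
        before, now = previous.get(phrase, 0), current.get(phrase, 0)
        if now >= min_count and now >= ratio * max(before, 1):
            flagged.append(phrase)
    return flagged

# Made-up CSR notes for two consecutive periods.
last_month = ["Customer moved abroad", "Complained about poor service"]
this_month = [
    "Poor service at branch, will not renew",
    "Premium amount too high",
    "Poor service from agent",
    "Says poor service on claims",
]
flagged = spikes(watch_counts(last_month), watch_counts(this_month))
print(flagged)
```

Plain substring matching is the simplest possible choice here; a real text mining pipeline would normally add stemming, synonyms and theme extraction on top of it.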

Overlaying Text mining over Behavioral Segmentation

While customer segmentation works on structured behavioral data to give us a clue to customer behavior, unstructured text mining can give us a clue as to what the customer actually feels about his or her experience. For example, in a recent exercise, after segmenting customer behavior, the team overlaid on top of the segments the key themes emerging from sentiment analysis by text mining of inbound customer calls to 1-800 numbers. This allowed the business to correlate customer behavior with the underlying themes in conversations. For example, the top 5 keywords used frequently by each behavioral segment can reveal a lot about the service levels / product coverage which influence the behavior reflected in the segment classifications. One can also discern whether the migration of high value customers had anything to do with the increase in certain watch keywords in customer conversations. So consider combining text mining with segmentation.
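As a rough sketch of the overlay, assume each inbound call has already been tagged with the behavioral segment of the caller. The segment names, the stop-word list and the sample calls below are hypothetical, and simple word counts stand in for a proper theme or sentiment model.

```python
import re
from collections import Counter, defaultdict

# A tiny, hypothetical stop-word list; a real one would be much longer.
STOPWORDS = {"the", "a", "is", "my", "to", "was", "and", "for", "on", "too", "not", "i"}

def top_keywords_by_segment(calls, n=5):
    """Return the n most frequent non-stop-words for each segment.

    `calls` is an iterable of (segment, transcript_text) pairs.
    """
    counters = defaultdict(Counter)
    for segment, text in calls:
        words = re.findall(r"[a-z]+", text.lower())
        counters[segment].update(w for w in words if w not in STOPWORDS)
    return {seg: [w for w, _ in c.most_common(n)] for seg, c in counters.items()}

# Made-up calls, already joined to a segment label.
calls = [
    ("high_value", "Claim was delayed and the claim is not settled"),
    ("high_value", "Claim delayed again"),
    ("price_sensitive", "Premium too high"),
    ("price_sensitive", "Premium increase is too steep"),
]
result = top_keywords_by_segment(calls, n=2)
print(result)
```

Comparing these per-segment lists over time is one simple way to check whether a watch keyword is rising in the same segment that is losing high value customers.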

Insurance Big Data - Predictor Harvesting

Increasingly, in the insurance, retail, telecom and banking industries, the ability to accurately pinpoint customers who have a greater chance of exhibiting certain behavior, and to proactively put interventions in place, is becoming a competitive differentiator. One of the most important tasks in trying to model customer behavior is to precisely zero in on the predictors which influence a behavioral outcome, using advanced analytical models. For example, in the insurance industry, which are the top 5 influence levers which determine whether a policy holder would surrender his policy?

One lever which organisations use is "FIELD IMMERSION". In field immersion, the behavioural modelling/investigation team spends a day in the life of an agent or customer to get direct experience of the factors at play while an agent tries to persuade an existing subscriber to hold on to his/her policy. The rationale for direct “customer immersion” is that direct observation of these interactions gives us a window into the workings of agent-customer dynamics and yields new insights into what causes a policy holder to renew a policy. These insights can then be statistically modeled into the policy scoring model, provided the data exists. Field immersion also gives a behavioral investigator access to nuances and behavioral dynamics that other methods may not reveal.

For example, observing the agent-customer interaction during a field visit led us to hypothesize that agents had a greater influence over a policy holder if the agent-customer relationship was longer than 3 years and the customer’s age was greater than 47 years. This was statistically validated by a simple discriminant analysis and hypothesis testing, and finally fed as an input to the scoring process. It was one practical example of a modelling variable obtained from field immersion whose statistical test of significance (chi-square value) was better than those of variables obtained from hypothesis workshops. Similarly, in another interaction it was observed that in certain zip codes a unique word-of-mouth referral campaign was being executed, which was causing policy holders to surrender. Here the data to model the phenomenon was not readily available, so the modeling team brainstormed with the customer team to institutionalize a new data collection process to capture details about competitive campaigns which can fuel surrenders.
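To make the significance check concrete, here is a minimal sketch of a Pearson chi-square test on a 2x2 table for a flag such as "relationship longer than 3 years and age above 47". It is a simplified stand-in for the discriminant analysis and hypothesis testing mentioned above, and the counts are invented purely for illustration; they are not results from the exercise.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic for the 2x2 table [[a, b], [c, d]].

    No continuity correction is applied.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column needs at least one observation")
    return n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)

# Made-up counts, for illustration only.
# Rows: flag set / flag not set. Columns: renewed / surrendered.
flag_renewed, flag_surrendered = 80, 20
other_renewed, other_surrendered = 60, 40

stat = chi_square_2x2(flag_renewed, flag_surrendered, other_renewed, other_surrendered)

# Chi-square critical value for 1 degree of freedom at the 5% level.
CRITICAL_5PCT_1DF = 3.841
print(round(stat, 2), stat > CRITICAL_5PCT_1DF)
```

With these illustrative counts the statistic exceeds the critical value, so the flag would be kept as a candidate input to the scoring model. Candidate variables from different sources can be ranked by the same statistic.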

April 5, 2012

Telecom : 4 Steps to build a Network Graph in Hadoop

Discerning relationships between customers and the underlying patterns can be a huge source of intelligence for smart marketing activities. Network Link Analysis offers a construct to decode the density of interactions and patterns. It can be done by analyzing call behavior data obtained from CDR switch data. There are 4 essential steps in doing this, which are outlined below:


Step-1 :

Extract CDR information and summarize it for each unique combination of caller and called number

Step-2 :

For each caller and called number, count the frequency of calls made, the number of SMSes sent, the number of prime time calls, etc.

Step-3 :

Use this model to develop call behavioral profiles to target. Example: Friends and families program

Step-4 :

Interrogate the social network database for specific behavior. For example: Who are my existing customers who make more than 20% of calls to competitive networks during peak time, and for whom either the call duration for the day exceeds 90 minutes or the number of calls per day exceeds 12?

Can we target them with a ‘friends and families’ scheme to bring their most frequently called numbers into our network fold and incentivise them in the process?

This is just the tip of the iceberg in terms of leveraging Big Data network link analysis or graph analysis to impact ARPU (average revenue per user) in the telecom industry.
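The summarization and interrogation steps can be sketched in plain Python on a handful of in-memory records. This is only an illustration of the logic; on Hadoop the same grouping would run as a job keyed on the caller/called pair. The record fields (`kind`, `minutes`, `peak`, `network`), the `HOME_NETWORK` label and the sample data are all assumptions, and the query reads the example question as applying to a single day of calls.

```python
from collections import defaultdict

HOME_NETWORK = "us"  # hypothetical label for our own network

def summarize_pairs(cdrs):
    """One summary row per unique (caller, called) pair."""
    pairs = defaultdict(lambda: {"calls": 0, "sms": 0, "peak_calls": 0, "minutes": 0.0})
    for r in cdrs:
        row = pairs[(r["caller"], r["called"])]
        if r["kind"] == "sms":
            row["sms"] += 1
        else:
            row["calls"] += 1
            row["minutes"] += r["minutes"]
            row["peak_calls"] += 1 if r["peak"] else 0
    return dict(pairs)

def friends_and_family_targets(cdrs, share=0.20, max_minutes=90, max_calls=12):
    """Callers matching the example question, given one day of records."""
    stats = defaultdict(lambda: {"calls": 0, "minutes": 0.0, "peak_competitor": 0})
    for r in cdrs:
        if r["kind"] != "call":
            continue
        s = stats[r["caller"]]
        s["calls"] += 1
        s["minutes"] += r["minutes"]
        if r["peak"] and r["network"] != HOME_NETWORK:
            s["peak_competitor"] += 1
    return sorted(
        caller for caller, s in stats.items()
        if s["peak_competitor"] / s["calls"] > share
        and (s["minutes"] > max_minutes or s["calls"] > max_calls)
    )

# Made-up records for a single day.
cdrs = [
    {"caller": "A", "called": "B", "kind": "call", "minutes": 50, "peak": True, "network": "rival"},
    {"caller": "A", "called": "B", "kind": "call", "minutes": 50, "peak": True, "network": "rival"},
    {"caller": "A", "called": "B", "kind": "sms", "minutes": 0, "peak": False, "network": "rival"},
    {"caller": "C", "called": "D", "kind": "call", "minutes": 120, "peak": True, "network": "us"},
]
pairs = summarize_pairs(cdrs)
targets = friends_and_family_targets(cdrs)
print(pairs[("A", "B")], targets)
```

The pair summary is effectively the edge list of the network graph, with the counts as edge weights; the most frequently called numbers of each target can be read straight from it.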

Extracting technician intelligence, Optimizing warranty costs in Auto

The auto industry is one of the prime movers of a country's economy. It has been estimated that warranty costs automotive companies more than $35 billion in the US annually. That is a huge amount considering the tough economic conditions most companies are operating in, and it makes it imperative that auto companies explore all opportunities for reducing costs. Optimizing warranty cost is a very important lever in the cost equation for automobile manufacturers. Even a marginal improvement in the money spent on warranty can have a multiplier effect on the overall bottom line. One of the most underutilized inputs for optimizing warranty cost is service technicians' comments. The intelligence embedded in them can be extracted and acted upon.

It is very important to formulate the business questions which can be answered by the text mining process. Here are a few indicative business questions:
1. Which are the prominent problem areas to be concentrated upon at individual dealer level, based on comments from the technicians?
2. Which are the top 5 car components mentioned, in terms of frequency of occurrence in service comments in the last 3 months, and what does that tell us about suppliers and/or internal manufacturing processes?
3. Is there seasonality in the occurrence of keywords related to component failure? Is there a sudden spike in keywords such as 'OIL SPILLAGE', 'FUEL PUMP', 'COOLANT', 'BRAKE LINING', 'SHORT CIRCUIT' or 'FLIMSY' during the winter season?
4. Is there a strong association between the keyword frequency and a component's rank in terms of warranty cost?
5. Which supplier part frequently comes up in technicians' comments regarding faulty products observed while servicing a car in its warranty period?
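Questions 2 and 3 above lend themselves to a simple keyword count over dated technician comments. The sketch below is a minimal illustration under stated assumptions: the keyword list is taken from question 3, while the function names and the sample comments are hypothetical, and plain substring matching stands in for a full text mining process.

```python
from collections import Counter
from datetime import date

# Keywords from question 3 above, lower-cased for matching.
KEYWORDS = ["oil spillage", "fuel pump", "coolant", "brake lining", "short circuit", "flimsy"]

def keyword_counts_by_month(comments):
    """Map (year, month) to a Counter of keyword mentions (for seasonality)."""
    monthly = {}
    for day, text in comments:
        lowered = text.lower()
        counter = monthly.setdefault((day.year, day.month), Counter())
        for kw in KEYWORDS:
            if kw in lowered:
                counter[kw] += 1
    return monthly

def top_keywords(comments, since, n=5):
    """The n most mentioned keywords in comments dated on or after `since`."""
    total = Counter()
    for day, text in comments:
        if day >= since:
            lowered = text.lower()
            total.update(kw for kw in KEYWORDS if kw in lowered)
    return total.most_common(n)

# Made-up technician comments as (service date, free text).
comments = [
    (date(2012, 1, 10), "Coolant leak, replaced hose"),
    (date(2012, 1, 22), "Fuel pump noisy; coolant low"),
    (date(2012, 2, 3), "Brake lining worn"),
    (date(2011, 7, 5), "Short circuit in dashboard"),
]
monthly = keyword_counts_by_month(comments)
recent_top = top_keywords(comments, since=date(2012, 1, 1), n=1)
print(monthly[(2012, 1)], recent_top)
```

Joining the monthly counts to dealer, supplier and warranty cost tables would be the natural next step towards the other questions, provided those keys are recorded with each comment.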

This forms the first step in a journey to completely extract the 'signal patterns' buried in years of a technician's interactions with the vehicles being serviced. This experiential knowledge can be chunked and built upon to further reduce warranty cost, which is a huge sword hanging over most automotive organisations.