March 28, 2012

Hadoop Cluster to mine Telecom machine data

So how can terabytes of telecom device and event data be managed and mined?
A telecom network captures log events that describe the behavior of thousands of devices within its asset-intensive infrastructure: firewalls, towers, switches, servers and so on. Each of these devices emits logs, and the alarm events in those logs describe the device's health and activity. Understanding the information embedded in this core telecom machine data is key. Hadoop can store and analyze the log data and build a higher-level picture of the health of the data center as a whole.

Let's consider a real-life use case in which we need to identify the sequence of events preceding an adverse event:

- Multi-terabyte event logs
- Millions of atomic events
- A few hundred adverse events

What we need are algorithms that decode the linkages between upstream events and downstream adverse events. An Apriori algorithm can be executed on a Hadoop node to surface those sequences of events that appear to be correlated with adverse events. The algorithm needs to traverse moving time windows of the event logs to discover the most important sequences, that is, the ones that are statistically significant from an adverse-event point of view.
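To make the idea concrete, here is a minimal single-machine sketch in Python. It is not the production Hadoop job. It assumes events arrive as `(timestamp_seconds, event_type)` pairs, and the event names, the 10-second window and the 0.6 support threshold are all made up for illustration. For each adverse event it collects the event types seen in the window just before it, then runs a level-wise Apriori pass over those windows.

```python
from collections import Counter
from itertools import combinations


def windows_before(events, adverse_type, window):
    """Return, for each adverse event, the set of event types seen
    in the `window` seconds immediately before it."""
    events = sorted(events)
    result = []
    for t_adv, kind in events:
        if kind != adverse_type:
            continue
        preceding = {
            k for t, k in events
            if t_adv - window <= t < t_adv and k != adverse_type
        }
        result.append(frozenset(preceding))
    return result


def frequent_itemsets(transactions, min_support, max_size=3):
    """Level-wise Apriori: a candidate of size k is only counted if
    every one of its (k-1)-subsets was already found frequent."""
    n = len(transactions)
    counts = Counter(item for t in transactions for item in t)
    current = {frozenset([i]) for i, c in counts.items() if c / n >= min_support}
    frequent = {s: counts[next(iter(s))] / n for s in current}
    k = 2
    while current and k <= max_size:
        items = sorted({i for s in current for i in s})
        candidates = [
            frozenset(c) for c in combinations(items, k)
            if all(frozenset(sub) in current for sub in combinations(c, k - 1))
        ]
        current = set()
        for cand in candidates:
            support = sum(1 for t in transactions if cand <= t) / n
            if support >= min_support:
                frequent[cand] = support
                current.add(cand)
        k += 1
    return frequent


# Hypothetical toy log: (timestamp in seconds, event type).
events = [
    (0, "LINK_FLAP"), (5, "HIGH_TEMP"), (9, "OUTAGE"),
    (100, "LOGIN"),
    (200, "LINK_FLAP"), (204, "HIGH_TEMP"), (208, "OUTAGE"),
    (300, "LOGIN"), (305, "LINK_FLAP"), (309, "OUTAGE"),
]

transactions = windows_before(events, adverse_type="OUTAGE", window=10)
frequent = frequent_itemsets(transactions, min_support=0.6)

for itemset, support in sorted(frequent.items(), key=lambda kv: -kv[1]):
    print(sorted(itemset), round(support, 2))
```

Two simplifications are worth noting. The sketch counts which event types co-occur before an adverse event and ignores their order inside the window, so it finds event sets rather than true sequences. It also reports support only; to judge statistical significance, the same counts would have to be compared with windows that do not end in an adverse event. At multi-terabyte scale the support counting step is the part that would be distributed across the cluster, for example as a MapReduce job.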
