5.8. Graph Mining


 

The “Graph Mining Module” included in Anatella performs Social Network Analysis (SNA).

 

SNA techniques can be applied:
 

…in the “telecommunication world”, where each arc is a communication between two individuals (i.e. between two phone numbers).

…in the “banking world” to analyse graphs where each node represents a bank account and where:

o …each arc is a money transfer between two bank accounts.

o …an arc between the nodes i and j represents the fact that the holders of accounts i and j are:

…spending their money in the same shops (at the same hours of the day).

…going to the same locations (at the same hours of the day).

…in “text mining”, where each node represents a word and where the weight of the arc between the words i and j is the number of times that the words i and j appear in the same document. The detected communities are the different “topics” of the text corpus.

…in any field of commerce or science where a graph exists.
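
As an illustration of the text-mining case above, here is a minimal sketch of how such arc weights could be computed outside Anatella. It assumes that "the number of times two words appear in the same document" means the number of documents that contain both words. The function name `cooccurrence_weights` and the sample corpus are hypothetical and are not part of Anatella.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_weights(documents):
    """Return {(word_i, word_j): number of documents containing both words}."""
    weights = Counter()
    for text in documents:
        # Assumption: a word counts once per document, whatever its frequency.
        words = sorted(set(text.lower().split()))
        # combinations() over a sorted list yields each unordered pair once,
        # always as (smaller, larger), so (a, b) and (b, a) share one key.
        for pair in combinations(words, 2):
            weights[pair] += 1
    return weights


# Hypothetical three-document corpus.
corpus = [
    "churn model telecom",
    "telecom churn campaign",
    "bank fraud model",
]
weights = cooccurrence_weights(corpus)
print(weights[("churn", "telecom")])  # 2: both words appear in documents 1 and 2
print(weights[("bank", "churn")])     # 0: the two words never share a document
```

The resulting dictionary is a weighted arc list (one entry per pair of words) that a community detection algorithm could then take as input.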

 

Using SNA techniques, you can extract valuable metrics out of your graphs.
 

These SNA metrics can be used for:
 

Advanced Segmentation of your customer base

Improved targeting

o …mainly for churn prediction models.

o …also for cross-sell and up-sell models.

Detection of Influencers for:

o viral marketing,

o refer-a-friend campaigns,

o new customer acquisition.

Optimization of the service level depending on the value of the subscriber (i.e. the more valuable and important subscribers receive better service).

Next generation recommendation engine.

 
Modern SNA algorithms are very efficient and can process extremely large social networks (composed of several tens of millions of nodes) in less than an hour. Thus, during a typical SNA session, most of the processing time is spent computing the arc weights (and not in the actual SNA algorithms).
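
To make the "computation of the arc weights" concrete, here is a minimal sketch of the kind of aggregation involved in the telecom case. It is written in plain Python for readability and is not how Anatella does it. The record layout `(caller, callee, seconds)`, the function name `arc_weights` and the phone numbers are all assumptions made for this example.

```python
from collections import defaultdict


def arc_weights(call_records):
    """Aggregate (caller, callee, seconds) records into undirected weighted arcs."""
    arcs = defaultdict(lambda: {"calls": 0, "seconds": 0})
    for caller, callee, seconds in call_records:
        if caller == callee:
            continue  # assumption: self-calls carry no social information
        # Order the two endpoints so that A->B and B->A feed the same arc.
        key = (min(caller, callee), max(caller, callee))
        arcs[key]["calls"] += 1
        arcs[key]["seconds"] += seconds
    return dict(arcs)


# Hypothetical call records.
records = [
    ("0470-111", "0470-222", 60),
    ("0470-222", "0470-111", 30),
    ("0470-111", "0470-333", 10),
]
weights = arc_weights(records)
print(weights[("0470-111", "0470-222")])  # {'calls': 2, 'seconds': 90}
```

On real data, this aggregation runs over every single communication record, which is why it usually dominates the total processing time.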

 

We strongly advise you to use an efficient data manipulation tool to compute all the arc weights. Indeed, when performing SNA, you cannot use any sampling technique to reduce the computation time, because sampling removes some arcs and thereby destroys the original structure of the graph. Since sampling is ruled out, you need an efficient (i.e. fast) tool to manipulate your graph in a reasonable amount of time. This is why we strongly encourage you to use Anatella to prepare your graphs; otherwise you might end up “blocked” by the long CPU time required to create them.
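
The effect of sampling can be seen on a toy example. The sketch below uses hypothetical records and a deterministic "keep every second record" sample, chosen only so that the result is reproducible. It compares the arcs of the full graph with the arcs that survive the sampling.

```python
def arcs_of(records):
    """Return the set of undirected arcs present in a list of (a, b) records."""
    return {tuple(sorted(pair)) for pair in records}


# Hypothetical communication records between five individuals.
records = [
    ("A", "B"), ("A", "B"), ("B", "C"),
    ("C", "D"), ("A", "C"), ("D", "E"),
]

full = arcs_of(records)
sampled = arcs_of(records[::2])  # 50% sample: keep records 0, 2 and 4

lost_arcs = sorted(full - sampled)
lost_nodes = sorted({n for arc in full for n in arc} - {n for arc in sampled for n in arc})
print(lost_arcs)   # [('C', 'D'), ('D', 'E')]
print(lost_nodes)  # ['D', 'E']
```

Here the sample keeps half of the records but loses two of the five arcs, and the individuals D and E vanish from the graph altogether, so any community or leader computed on the sample would describe a different network.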

 

To make efficient use of the computed SNA metrics, we suggest that you use the award-winning predictive engine “TIMi Modeler” included in the “TIMi Suite”.

 

In the “Telecom World”, we created a solution named “LinkAlytics”, which makes extensive use of the SNA modules included in Anatella. “LinkAlytics” is a set of Anatella data-transformation graphs that use SNA algorithms (i.e. graph mining) to create very accurate predictive models for churn, cross-sell and up-sell problems for Telecoms. “LinkAlytics” starts from raw (binary) CDR logs, creates a 360° customer view (including around one thousand SNA-based variables) and finally creates many predictive models. Everything is automated and runs in a few hours every day.

 

The Anatella-Graph-Mining module is composed of:

All the Community Detection Algorithms: included in the Action [image: clip0229]

The Advanced Social Leader Detection Algorithm: included in the Action [image: clip0229]

The Louvain Social Leader Detection Algorithm: included in the Action [image: clip0230]

The Core Number of each node: included in the Action [image: clip0230]

All the algorithms to detect the Business Leaders: included in the Action [image: clip0231]

An algorithm to compute the significance of each arc: included in the Action [image: clip0232]

An algorithm that simulates the propagation of a disease inside a network: [image: ANATEL~3_img692]

 

 

There are, basically, 6 metrics that can be extracted:
 

Social Communities

Social Leaders (“In-Betweenness” centrality, Louvain Leaders, Core Numbers).

Influencers.

Business Leaders.

Arc significance.

Disease propagation.

How to code your own SNA algorithms in C++.
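
To give an idea of what one of these metrics measures, here is a minimal sketch of the "Core Number" of each node, using the textbook peeling procedure: repeatedly remove the node with the smallest remaining degree. This is a plain Python illustration under the standard k-core definition, not the implementation used inside Anatella. The function name `core_numbers` and the sample graph are hypothetical.

```python
def core_numbers(adjacency):
    """Return {node: core number} for an undirected graph.

    adjacency maps each node to the set of its neighbours and must be
    symmetric (if B is a neighbour of A, then A is a neighbour of B).
    """
    degree = {node: len(neigh) for node, neigh in adjacency.items()}
    remaining = set(adjacency)
    core = {}
    k = 0
    while remaining:
        # Peel off the remaining node with the smallest current degree.
        node = min(remaining, key=lambda n: degree[n])
        # The core level never decreases while peeling.
        k = max(k, degree[node])
        core[node] = k
        remaining.remove(node)
        # Removing the node lowers the degree of its remaining neighbours.
        for neigh in adjacency[node]:
            if neigh in remaining:
                degree[neigh] -= 1
    return core


# Hypothetical graph: a triangle A-B-C plus a pendant node D attached to C.
graph = {
    "A": {"B", "C"},
    "B": {"A", "C"},
    "C": {"A", "B", "D"},
    "D": {"C"},
}
cores = core_numbers(graph)
print(sorted(cores.items()))  # [('A', 2), ('B', 2), ('C', 2), ('D', 1)]
```

The three members of the triangle each keep at least two neighbours inside the group, so their core number is 2, while the pendant node D only reaches 1. This simple version scans all remaining nodes at every step and is only meant for small graphs.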

 

The graph mining module inside Anatella has its own separate documentation: see the document named “Anatella_GraphMining.pdf”.