Apply many TIMi predictive models to each row of the input table.
This operator applies many TIMi predictive models to each row of the input table.
Let’s assume that we have developed a simple churn model (i.e. we have a binary classification problem or, in other words, a scoring/ranking problem). We want to use this churn model inside Anatella. We’ll have:
Let’s now assume that you are developing a “next-to-buy” solution: i.e. you must guess which of these 5 products your customer will buy next: TEDDY BEAR, LIGHT SABER, IPOD, FLOWER, BEER. In the scientific world, this is known as a Multi-Class prediction system (in opposition to the classical Binary-Class prediction system: Did the customer churn or not? Yes/No).
To do multi-class prediction with TIMi, you must first transform the multi-class prediction problem into a series of binary classification tasks. These binary classifiers can be of one of two types:
•1 vs all others
•1 vs 1
To know more about this subject, see section 6 of the “TIMi Advanced Guide”.
Let’s first investigate the simplest case: “1 vs all others”. In this case, you only need 5 different binary predictive models (one for each product) to make the prediction. You apply these 5 models to each customer to obtain 5 purchase probabilities (these 5 probabilities differ from one customer to the next: this is true “one-to-one” marketing). The product with the highest purchase probability is the one that you will recommend as the next-to-buy. If you need to suggest 2 different products, you’ll take the 2 products with the 2 highest probabilities.
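For illustration, the “1 vs all others” selection logic can be sketched as follows (the model names and probabilities below are purely hypothetical):

```python
# Hypothetical purchase probabilities returned by the 5 "1 vs all others"
# binary models for one customer (values are illustrative only).
probs = {
    "TEDDY BEAR": 0.12,
    "LIGHT SABER": 0.41,
    "IPOD": 0.33,
    "FLOWER": 0.07,
    "BEER": 0.25,
}

# Next-to-buy: the product with the highest purchase probability.
next_to_buy = max(probs, key=probs.get)

# Top-2 suggestion: the 2 products with the 2 highest probabilities.
top_2 = sorted(probs, key=probs.get, reverse=True)[:2]

print(next_to_buy)  # LIGHT SABER
print(top_2)        # ['LIGHT SABER', 'IPOD']
```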
It can happen that the quantity of each different product is limited. In such a case, you cannot simply suggest the products with the highest purchase-probabilities (and totally disregard the limitation on the quantity of each product). When such a constraint (on the quantity of each product) exists, the assignment of a specific product to a specific customer must be computed using a more complex mechanism: see section 5.15.1 about the “assignmentSolver”.
Let’s return to a very simple case. More precisely:
•There are no constraints on the quantity of each of the products.
•We are using the “1 vs all others” type of predictive models.
For the “1 vs 1” type of predictive models, we’ll have:
By default, when there are several models “voting” for the same class, the final (purchase) probability is the mean of the individual probabilities of each model. For example: The probability of buying a “teddy bear” is the mean of the 4 probabilities given by the 4 models:
The “mean” operator is the default operator that is used to aggregate the individual probabilities of each predictive model to obtain the “final” probability for each different class (LIGHTSABER, IPOD, FLOWER, BEER). You can also use a different operator:
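As an illustration, here is a small sketch of the default “mean” aggregation for the “1 vs 1” setup (all the pairwise probabilities below are hypothetical; with 5 classes there are 10 pairwise models, and each class receives 4 “votes”):

```python
from statistics import mean

classes = ["TEDDY BEAR", "LIGHT SABER", "IPOD", "FLOWER", "BEER"]

# Hypothetical output of the 10 "1 vs 1" models for one customer:
# prob[(a, b)] is the probability of class `a` given by the model "a vs b"
# (the same model implicitly gives 1 - prob[(a, b)] to class `b`).
prob = {
    ("TEDDY BEAR", "LIGHT SABER"): 0.40,
    ("TEDDY BEAR", "IPOD"): 0.55,
    ("TEDDY BEAR", "FLOWER"): 0.70,
    ("TEDDY BEAR", "BEER"): 0.60,
    ("LIGHT SABER", "IPOD"): 0.65,
    ("LIGHT SABER", "FLOWER"): 0.80,
    ("LIGHT SABER", "BEER"): 0.75,
    ("IPOD", "FLOWER"): 0.60,
    ("IPOD", "BEER"): 0.50,
    ("FLOWER", "BEER"): 0.35,
}

def votes_for(c):
    """All individual probabilities "voting" for class c (4 per class)."""
    return [p if a == c else 1.0 - p
            for (a, b), p in prob.items() if c in (a, b)]

# Default aggregation: the mean of the 4 individual probabilities.
final = {c: mean(votes_for(c)) for c in classes}
best = max(final, key=final.get)  # the most likely next-to-buy
```

Here the final probability of “TEDDY BEAR” is the mean of its 4 votes (0.40, 0.55, 0.70, 0.60), i.e. 0.5625.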
Even for a simple binary classifier, it’s sometimes interesting to use several predictive models (instead of just one). Depending on the way these models are created, this approach is named either “bagging”, “boosting” or “ensemble learning”. Why use several models instead of one? Because it usually delivers higher predictive accuracy.
To do “bagging”, “boosting” or “ensemble learning” with Anatella, use the following settings:
Once again, the final churn probability is the mean (or the median, depending on the settings) of the individual probabilities of each of the 7 models.
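For example, the aggregation of an ensemble of 7 churn models can be sketched as follows (the 7 probabilities below are hypothetical; the median is less sensitive to one outlier model than the mean):

```python
from statistics import mean, median

# Hypothetical churn probabilities given by the 7 models of an ensemble
# for one and the same customer (values are illustrative only).
p = [0.62, 0.58, 0.71, 0.55, 0.66, 0.60, 0.90]

final_mean = mean(p)      # default aggregation operator
final_median = median(p)  # more robust to the one outlier model (0.90)

print(final_mean)    # 0.66
print(final_median)  # 0.62
```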
Finally, it’s quite common to create different predictive models for the different segments of your population. For example, the purchasing behavior of your customers could be totally different depending on whether the customer resides in an urban zone or not. In such a case, we’ll have 2 sets of models (“urban” and “non-urban”). For example:
The advanced parameter tab contains some more parameters:
Parameter 1: If this parameter is non-empty, a column containing the segment name (e.g. “urban_location” or “non_urban_location”) is added to the result table.
Parameter 2: Name of the column containing the final prediction:
•For a binary classification problem, this column contains the “final” probability.
•For a continuous prediction problem, this column contains the “final” prediction.
•For a Multi-Class classification problem, this column contains the most-likely class (e.g. “TEDDYBEAR”, “LIGHTSABER”, “IPOD”, “FLOWER”, “BEER”).
Parameter 3: The operator used to aggregate the individual probabilities given by each predictive model into one “final” number (i.e. the final probability of belonging to a specific class).
Parameter 4: Self-explanatory
Parameter 5: When the parameter “output prediction details” is checked, we’ll obtain:
•For a binary or continuous predictive model: Each of the individual probabilities given by each of the predictive models.
•For a multi-class predictive model: The different probabilities to belong to each of the different classes. When using the “1 vs 1” type of predictive models or when using “bagging”, “boosting” or “ensemble learning”, these probabilities are already “aggregated” values of different individual probabilities.
Parameter 6: Prefix of the name of the columns containing the “details”. You can change the prefix to avoid any column name collision.
Parameter 7: When this option is activated, Anatella applies a correction to the probabilities computed by the different predictive models, to account for the fact that the density of (binary) targets inside the training set(s) is different from the density of targets inside the apply/scoring set. This correction is named, in technical terms, “apriori correction”. To use this option, you must provide, on the second input pin of this Action, a table that contains the expected density of targets inside the apply/scoring dataset (i.e. the new apriori probabilities) for each different class & segment. Here is an example:
In the above example, we corrected the apriori probabilities so that all items (teddy bear, light saber, ipod, flower, beer) have equal aprioris (i.e. 20%), whatever the percentage of targets actually observed in the learning dataset(s) used to create the various predictive models.
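The type of correction involved can be illustrated with the standard Bayesian prior-shift formula (this is a sketch only: the exact formula used internally by Anatella is not detailed here):

```python
def apriori_correct(p, prior_train, prior_apply):
    """Rescale a model probability `p` (obtained on a training set with
    target density `prior_train`) to a population where the expected
    target density is `prior_apply`.
    Standard Bayesian prior-shift correction (illustrative sketch)."""
    r1 = prior_apply / prior_train                  # ratio for the targets
    r0 = (1.0 - prior_apply) / (1.0 - prior_train)  # ratio for non-targets
    return p * r1 / (p * r1 + (1.0 - p) * r0)

# e.g. a model trained on a balanced set (50% targets), applied to a
# population where only 20% of the customers are expected to buy the item:
print(apriori_correct(0.80, 0.50, 0.20))  # 0.5
```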
For even faster deployment of your scoring/models, the TIMiUseModels Action can run inside an N-Way multithreaded section (see section 5.3.2 about multithreading).