<< Click to Display Table of Contents >> Navigation: 5. Detailed description of the Actions > 5.7. Data Mining > 5.7.5. Transaction-level to Customer-level Aggregation |
Icon:
Function: CustomerAggregate
Property window:
Short description:
Transaction-level to customer-level aggregation. This Action transforms the Transaction-Table (containing all the transactions made by the customers - one transaction per row) into a Customer-Table (each row of the Customer-Table consolidates all the information pertaining to one customer).
Long Description:
Pre-requisite
The input table of this action must contain all the transactions sorted first by customers, and secondly by time.
In most commercial systems, you have inside the operational system the following 3 tables:
1.The Transaction or Event-Table
The Transaction-Table contains all the transactions that have occured. Each row of the "Transaction-Table" represents a transaction. For each transaction, you usually know, at least, the following (these are the columns of the transaction-table):
-ID_Customer (the primary key of the customer)
-ID_Product (the primary key of the product that has been purchased)
-Date_of_transaction
See the next pages for an example of Transaction Table.
2.The Product-Table
Usually, all the available products are placed in different categories. These categories are given inside the "Product-table". Each row of the "Product-table" contains many informations on a different products and, in particular, their category.
3.The Customer-Table
The "Customer-Table" contains many information about a customer: its age, postal address (this is required for the delivery of the purchased items), sex,etc. Usually, the "Customer-table" does not contain any information about the products that a specific customer has bought.
The objective of this script is to aggregate all the information contained into these 3 tables into one (possibly very large) Global-Customer-Level table.
This Global-table can thereafter be used to:
1.to create a good segmentation of your customer base
2.to create (with TIM!) powerful predictive models for cross-selling purposes.
Each row of the final Global-Table represents one customer. The columns of the Global-Table contains all the informations that we know about the customers. In particular, there are some columns very important columns that describe the "purchase habit" of the customers. For example: “How many purchases are there in a specific category over the last x months?”
To summarize: This script allows:
•to extract the "purchase habit" of each customer out of the Transaction-Table
•to put the result into a "Customer-Level" Table.
Basically, it transforms a "Transaction-level" table into a "Customer-level" table.
As you can see, Anatella allows you to easily program these types of transformation. Other ETL tools are not as flexible since they do not offer any transformation-operators that allow you to perform easily such (very common) data-transformation. It's usually always an incredibly painful nightmare to try to do the same job into any other ETL tool. But fortunately, now, Anatella is there to save the day!
Let’s take a first explanatory example.
Here is a typical Anatella-Graph using the “ Action”:
An example Customer-Table looks like this:
An example Transaction-Table looks like this:
An example Product-Table looks like this:
Here is a view of the “Enriched” Transaction-Table:
The next “Replace String ” Action creates a small number of “big” meta-categories based on the initial product-categories. The category-regroupments are defined inside the “Inline-Table” connect on pin 2 of the “Replace String ” Action. This “Inline-Table” is:
Thereafter the “ Action” transforms the “Enriched Transaction-table” into a “Customer-Level” table directly usable for predictive & segmentation analytics. The parameters of the “ Action” are:
Anatella will generate as output a “Customer-Level” table containing a set of new columns. These columns are:
A.RFM columns:
Let's set the observation date as "o".
For each “product_category”, Anatella will create these variables:
•Recency:
onumber of months since last purchase
•Frequency: For each time period (x,y):
onumber of transactions in [x o]
o(average number of transactions in [x o]) - (average number of transactions in [y x])
o(average number of transactions in [x o]) / (1+(average number of transactions in [y x]))
onumber of purchased items in [x o]
o(average number of purchased items in [x o]) - (average number of purchased items in [y x])
o(average number of purchased items in [x o]) / (1+(average number of purchased items in [y x]))
•Monetary: For each time period (x,y):
osum of purchase values in [x o]
o(average purchase values for transactions in [x o]) - (average purchase values for transactions in [y x])
o(average purchase values for transactions in [x o]) / (1+(average purchase values for transactions in [y x]))
The same columns are also computed ignoring the “product_category”.
B.Target Columns:
The “Target Columns” that are generated by Anatella are:
•For each “product_category”:
oIs there a purchase after the observation date? yes/no
oNumber of transactions after the observation date?
oSum of purchases values after the observation date?
The same targets are also computed ignoring the “product_category”.
Inside our example, the observation date "o" is “5/2006” and the Time-Periods (x,y) are:
{ (-1,-6) ; (-2,-8) ; (-2,-12) }
This means that the following columns are created as output of the “ Action”:
CustomerID |
Customer.Title |
Customer.Marital_Status |
Customer.City |
Customer.Zip Code |
Customer.Age |
Recency last purchase |
nItem in [-1 observ.] |
ntransactions in [-1 observ.] |
sumTransactionsValues in [-1 observ.] |
(Avg nItems in [-1 observ.]) - (Avg nItems in [-6 -1]) |
(Avg nItems in [-1 observ.]) / (1+Avg nItems in [-6 -1]) |
(Avg nTransactions in [-1 observ.]) - (Avg nTransactions in [-6 -1]) |
(Avg nTransactions in [-1 observ.]) / (1+Avg nTransactions in [-6 -1]) |
(Avg transactionsValue in [-1 observ.]) - (Avg transactionsValue in [-6 -1]) |
(Avg transactionsValue in [-1 observ.]) / (1+Avg transactionsValue in [-6 -1]) |
nItem in [-2 observ.] |
ntransactions in [-2 observ.] |
sumTransactionsValues in [-2 observ.] |
(Avg nItems in [-2 observ.]) - (Avg nItems in [-8 -2]) |
(Avg nItems in [-2 observ.]) / (1+Avg nItems in [-8 -2]) |
(Avg nTransactions in [-2 observ.]) - (Avg nTransactions in [-8 -2]) |
(Avg nTransactions in [-2 observ.]) / (1+Avg nTransactions in [-8 -2]) |
(Avg transactionsValue in [-2 observ.]) - (Avg transactionsValue in [-8 -2]) |
(Avg transactionsValue in [-2 observ.]) / (1+Avg transactionsValue in [-8 -2]) |
(Avg nItems in [-2 observ.]) - (Avg nItems in [-12 -2]) |
(Avg nItems in [-2 observ.]) / (1+Avg nItems in [-12 -2]) |
(Avg nTransactions in [-2 observ.]) - (Avg nTransactions in [-12 -2]) |
(Avg nTransactions in [-2 observ.]) / (1+Avg nTransactions in [-12 -2]) |
(Avg transactionsValue in [-2 observ.]) - (Avg transactionsValue in [-12 -2]) |
(Avg transactionsValue in [-2 observ.]) / (1+Avg transactionsValue in [-12 -2]) |
BTarget: Is Lapsed customer(churner) |
BTarget: Is Recurrent customer |
CTarget: number of purchased items after obs.date |
CTarget: sum of transactions values |
Recency last purchase in category Audio Book |
nItem in [-1 observ.] in category Audio Book |
ntransactions in [-1 observ.] in category Audio Book |
sumTransactionsValues in [-1 observ.] in category Audio Book |
(Avg nItems in [-1 observ.]) - (Avg nItems in [-6 -1]) in category Audio Book |
(Avg nItems in [-1 observ.]) / (1+Avg nItems in [-6 -1]) in category Audio Book |
(Avg nTransactions in [-1 observ.]) - (Avg nTransactions in [-6 -1]) in category Audio Book |
(Avg nTransactions in [-1 observ.]) / (1+Avg nTransactions in [-6 -1]) in category Audio Book |
(Avg transactionsValue in [-1 observ.]) - (Avg transactionsValue in [-6 -1]) in category Audio Book |
(Avg transactionsValue in [-1 observ.]) / (1+Avg transactionsValue in [-6 -1]) in category Audio Book |
nItem in [-2 observ.] in category Audio Book |
ntransactions in [-2 observ.] in category Audio Book |
sumTransactionsValues in [-2 observ.] in category Audio Book |
(Avg nItems in [-2 observ.]) - (Avg nItems in [-8 -2]) in category Audio Book |
(Avg nItems in [-2 observ.]) / (1+Avg nItems in [-8 -2]) in category Audio Book |
(Avg nTransactions in [-2 observ.]) - (Avg nTransactions in [-8 -2]) in category Audio Book |
(Avg nTransactions in [-2 observ.]) / (1+Avg nTransactions in [-8 -2]) in category Audio Book |
(Avg transactionsValue in [-2 observ.]) - (Avg transactionsValue in [-8 -2]) in category Audio Book |
(Avg transactionsValue in [-2 observ.]) / (1+Avg transactionsValue in [-8 -2]) in category Audio Book |
(Avg nItems in [-2 observ.]) - (Avg nItems in [-12 -2]) in category Audio Book |
(Avg nItems in [-2 observ.]) / (1+Avg nItems in [-12 -2]) in category Audio Book |
(Avg nTransactions in [-2 observ.]) - (Avg nTransactions in [-12 -2]) in category Audio Book |
(Avg nTransactions in [-2 observ.]) / (1+Avg nTransactions in [-12 -2]) in category Audio Book |
(Avg transactionsValue in [-2 observ.]) - (Avg transactionsValue in [-12 -2]) in category Audio Book |
(Avg transactionsValue in [-2 observ.]) / (1+Avg transactionsValue in [-12 -2]) in category Audio Book |
BTarget: Is Lapsed customer(churner) in category Audio Book |
BTarget: Is Recurrent customer in category Audio Book |
CTarget: number of purchased items after obs.date in category Audio Book |
CTarget: sum of transactions values in category Audio Book |
Recency last purchase in category Pop |
nItem in [-1 observ.] in category Pop |
ntransactions in [-1 observ.] in category Pop |
sumTransactionsValues in [-1 observ.] in category Pop |
(Avg nItems in [-1 observ.]) - (Avg nItems in [-6 -1]) in category Pop |
(Avg nItems in [-1 observ.]) / (1+Avg nItems in [-6 -1]) in category Pop |
(Avg nTransactions in [-1 observ.]) - (Avg nTransactions in [-6 -1]) in category Pop |
(Avg nTransactions in [-1 observ.]) / (1+Avg nTransactions in [-6 -1]) in category Pop |
(Avg transactionsValue in [-1 observ.]) - (Avg transactionsValue in [-6 -1]) in category Pop |
(Avg transactionsValue in [-1 observ.]) / (1+Avg transactionsValue in [-6 -1]) in category Pop |
nItem in [-2 observ.] in category Pop |
ntransactions in [-2 observ.] in category Pop |
sumTransactionsValues in [-2 observ.] in category Pop |
(Avg nItems in [-2 observ.]) - (Avg nItems in [-8 -2]) in category Pop |
(Avg nItems in [-2 observ.]) / (1+Avg nItems in [-8 -2]) in category Pop |
(Avg nTransactions in [-2 observ.]) - (Avg nTransactions in [-8 -2]) in category Pop |
(Avg nTransactions in [-2 observ.]) / (1+Avg nTransactions in [-8 -2]) in category Pop |
(Avg transactionsValue in [-2 observ.]) - (Avg transactionsValue in [-8 -2]) in category Pop |
(Avg transactionsValue in [-2 observ.]) / (1+Avg transactionsValue in [-8 -2]) in category Pop |
(Avg nItems in [-2 observ.]) - (Avg nItems in [-12 -2]) in category Pop |
(Avg nItems in [-2 observ.]) / (1+Avg nItems in [-12 -2]) in category Pop |
(Avg nTransactions in [-2 observ.]) - (Avg nTransactions in [-12 -2]) in category Pop |
(Avg nTransactions in [-2 observ.]) / (1+Avg nTransactions in [-12 -2]) in category Pop |
(Avg transactionsValue in [-2 observ.]) - (Avg transactionsValue in [-12 -2]) in category Pop |
(Avg transactionsValue in [-2 observ.]) / (1+Avg transactionsValue in [-12 -2]) in category Pop |
BTarget: Is Lapsed customer(churner) in category Pop |
BTarget: Is Recurrent customer in category Pop |
CTarget: number of purchased items after obs.date in category Pop |
CTarget: sum of transactions values in category Pop |
Recency last purchase in category Rock'N'Roll |
nItem in [-1 observ.] in category Rock'N'Roll |
ntransactions in [-1 observ.] in category Rock'N'Roll |
sumTransactionsValues in [-1 observ.] in category Rock'N'Roll |
(Avg nItems in [-1 observ.]) - (Avg nItems in [-6 -1]) in category Rock'N'Roll |
(Avg nItems in [-1 observ.]) / (1+Avg nItems in [-6 -1]) in category Rock'N'Roll |
(Avg nTransactions in [-1 observ.]) - (Avg nTransactions in [-6 -1]) in category Rock'N'Roll |
(Avg nTransactions in [-1 observ.]) / (1+Avg nTransactions in [-6 -1]) in category Rock'N'Roll |
(Avg transactionsValue in [-1 observ.]) - (Avg transactionsValue in [-6 -1]) in category Rock'N'Roll |
(Avg transactionsValue in [-1 observ.]) / (1+Avg transactionsValue in [-6 -1]) in category Rock'N'Roll |
nItem in [-2 observ.] in category Rock'N'Roll |
ntransactions in [-2 observ.] in category Rock'N'Roll |
sumTransactionsValues in [-2 observ.] in category Rock'N'Roll |
(Avg nItems in [-2 observ.]) - (Avg nItems in [-8 -2]) in category Rock'N'Roll |
(Avg nItems in [-2 observ.]) / (1+Avg nItems in [-8 -2]) in category Rock'N'Roll |
(Avg nTransactions in [-2 observ.]) - (Avg nTransactions in [-8 -2]) in category Rock'N'Roll |
(Avg nTransactions in [-2 observ.]) / (1+Avg nTransactions in [-8 -2]) in category Rock'N'Roll |
(Avg transactionsValue in [-2 observ.]) - (Avg transactionsValue in [-8 -2]) in category Rock'N'Roll |
(Avg transactionsValue in [-2 observ.]) / (1+Avg transactionsValue in [-8 -2]) in category Rock'N'Roll |
(Avg nItems in [-2 observ.]) - (Avg nItems in [-12 -2]) in category Rock'N'Roll |
(Avg nItems in [-2 observ.]) / (1+Avg nItems in [-12 -2]) in category Rock'N'Roll |
(Avg nTransactions in [-2 observ.]) - (Avg nTransactions in [-12 -2]) in category Rock'N'Roll |
(Avg nTransactions in [-2 observ.]) / (1+Avg nTransactions in [-12 -2]) in category Rock'N'Roll |
(Avg transactionsValue in [-2 observ.]) - (Avg transactionsValue in [-12 -2]) in category Rock'N'Roll |
(Avg transactionsValue in [-2 observ.]) / (1+Avg transactionsValue in [-12 -2]) in category Rock'N'Roll |
BTarget: Is Lapsed customer(churner) in category Rock'N'Roll |
BTarget: Is Recurrent customer in category Rock'N'Roll |
CTarget: number of purchased items after obs.date in category Rock'N'Roll |
CTarget: sum of transactions values in category Rock'N'Roll |