Risk management – Probability of default

The accuracy of a “probability of default” predictive model is of high importance because a good predictive model in this domain can make a tremendous difference in terms of ROI.

Such predictive models are sometimes called PD models (“probability of default” models). More information on this subject is available on Wikipedia.

For example, in Brazil, some people try to obtain a credit card under a false name and then spend the whole credit limit (which is usually around 1000 euros) without any intention of paying it back. In such a situation, the bank simply loses 1000 euros.

If you have a very slightly more accurate predictive model that manages to correctly flag as “bad payers” an additional 0.1% of customers (this is nearly nothing) in a bank of 1 million customers, then you gain 0.001 × 1,000,000 × 1000 = 1 million euros. This is a very substantial ROI! Predictive model accuracy is the most crucial factor here. For such applications, TIMi is the most accurate solution, as demonstrated by our outstanding results at the PAKDD2010: http://sede.neurotech.com.br/PAKDD2010/result.do?method=load (see the 2 entries “Kranf” and “TZTeam”: both teams used TIMi as their only datamining engine).
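The arithmetic above can be reproduced with a few lines of Python. This is only a minimal sketch: the function name `additional_savings` and its parameters are illustrative and not part of TIMi. It assumes that every additionally detected bad payer avoids a loss equal to the full credit limit, and it ignores the cost of wrongly rejecting good customers.

```python
def additional_savings(customers: int,
                       extra_detection_rate: float,
                       loss_per_default: float) -> float:
    """Estimate the loss avoided by a slightly more accurate PD model.

    customers            -- number of customers scored by the bank
    extra_detection_rate -- additional share of customers correctly
                            flagged as bad payers (0.001 means 0.1%)
    loss_per_default     -- amount lost per undetected bad payer, in euros
                            (assumed equal to the full credit limit)
    """
    extra_bad_payers_caught = customers * extra_detection_rate
    return extra_bad_payers_caught * loss_per_default


if __name__ == "__main__":
    # Figures taken from the text: 1 million customers, 0.1%, 1000 euros.
    gain = additional_savings(1_000_000, 0.001, 1000.0)
    print(f"Estimated gain: {gain:,.0f} euros")
```

Changing any one input shows how sensitive the gain is to it: doubling the extra detection rate or the credit limit doubles the estimated gain.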

The objective of the 14th PAKDD 2010 (Pacific-Asia Knowledge Discovery and Data Mining conference 2010) competition was the Re-Calibration of a Credit Risk Assessment System Based on Biased Data (this is a very common need for industrial users such as banks). The competition focused on the credit scoring model’s capacity to generalize from the partial, biased data sets available for modeling.

The PAKDD, KDD and ECML datamining competitions are the most famous datamining competitions in the world. These competitions are organized each year by highly qualified university research teams that are truly “vendor-neutral”.

“Probability of default” predictive models are of very high importance for banks, and the international PAKDD2010 competition clearly demonstrates that TIMi is the best solution for building them.

The ancestor of TIMi (which had a lower accuracy) was also used to create a “probability of default” model for a famous Belgian bank. The objective here was to develop a predictive model that predicts whether a “small business” will go bankrupt within a 3-month time window. This was very easy. In Belgium, all companies must submit their balance sheet to the state at the end of the year. The training dataset was built from the figures extracted from the balance sheets of all Belgian companies. This balance-sheet data is publicly available (at least in Belgium) and is not very expensive (at least not for a bank). The accuracy of the predictive model was 96%.

This predictive model is now used on a daily basis at the bank to check whether a company can receive a loan. This specific predictive model led to some strong reactions from CEOs who were asking for a loan: when their loan application was rejected and they learned that there was a 96% chance that their company would no longer exist in less than 3 months, some of them behaved in very unexpected ways!