TEXT MINING
The European commission is funding thousands of research projects each year in collaboration with universities and private companies. They needed a classification system that looks at each contract and detect the type of the funding.
8 weeks
It took 2 days to build the 120 models needed. The rest of the time was used to automate the processes and check the conflicting contracts with the lawyers. The whole project was complete and automated in 8 weeks. We didn’t use any dictionary to reduce the number of columns in the dataset. We simply injected the raw dataset inside TIMi.
96% accuracy
The final accuracy of the prediction system was 96%. When there was a difference between the classification performed by the predictive models and by the experts, we double-checked the contract. Sometime a contract had to be reviewed by 8 experts before having a final answer. We finally noticed that around 30% of the contracts where incorrectly classified by the humans.