Identifies outliers in a dataset.
Identifies outliers in a dataset by comparing the Mahalanobis distance of each point against a Chi-squared distribution.
The Mahalanobis distance is a non-negative number that equals 0 at the center of the multivariate distribution. The distance is weighted by the inverse of the covariance matrix, so it accounts for the spread and correlation of the data in each direction. The larger the distance, the more likely a point is an outlier.
The Chi-squared test gives a statistical threshold for flagging outliers. The confidence level is typically 0.9999: a point is flagged as an outlier if it has less than a 0.01% probability of belonging to the multivariate distribution.
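The procedure described above can be sketched as follows. This is a minimal illustration, not the actual implementation: the function name `flag_outliers` and the parameter `confidence` are hypothetical, the center and covariance are assumed to be the plain sample mean and sample covariance, and the data is assumed to be roughly multivariate normal, in which case the squared Mahalanobis distance approximately follows a Chi-squared distribution with as many degrees of freedom as there are columns.

```python
import numpy as np
from scipy.stats import chi2


def flag_outliers(X, confidence=0.9999):
    """Return a boolean mask marking the rows of X flagged as outliers.

    Hypothetical helper: X is an (n_samples, n_features) array.
    """
    X = np.asarray(X, dtype=float)

    # Assumption: estimate the distribution with the sample mean and covariance.
    center = X.mean(axis=0)
    cov = np.atleast_2d(np.cov(X, rowvar=False))

    # The pseudo-inverse keeps the sketch usable when the covariance is singular.
    cov_inv = np.linalg.pinv(cov)

    # Squared Mahalanobis distance of every row: (x - mu)^T S^-1 (x - mu).
    diff = X - center
    d2 = np.einsum("ij,jk,ik->i", diff, cov_inv, diff)

    # Chi-squared quantile with one degree of freedom per feature.
    threshold = chi2.ppf(confidence, df=X.shape[1])
    return d2 > threshold
```

Calling `flag_outliers(X)` returns one boolean per row. Lowering `confidence` (for example to 0.99) flags more points, at the cost of more false positives. Because the sample mean and covariance are themselves affected by extreme points, a robust covariance estimate may be preferable on heavily contaminated data.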