Abstract
Nowadays we are witnessing a transformation of business processes towards a more computation-driven approach, of which the ever-increasing use of Machine Learning techniques is the clearest example. This shift often brings advantages, such as higher prediction accuracy and a shorter time to obtain results. However, these methods have a major drawback: it is very difficult to understand on what grounds the algorithm reached its decision. To address this issue we consider the LIME method. We give a general background on LIME and then focus on the stability issue: applying the method repeatedly, under the same conditions, may yield different explanations. We propose two complementary indices to measure LIME stability. It is important for practitioners to be aware of the issue and to have a tool for detecting it: stability guarantees that LIME explanations are reliable, so a stability assessment, made through the proposed indices, is crucial. As a case study, we apply both Machine Learning and classical statistical techniques to Credit Risk data. We test LIME on the Machine Learning algorithm, check its stability and, finally, examine the goodness of the explanations returned.
Acknowledgements
We would like to thank Professor Giuliano Galimberti who provided insight and expertise that greatly assisted the research, although he may not agree with all of the interpretations/conclusions of this paper.
Disclosure statement
No potential conflict of interest was reported by the authors.
Notes
1 We do not state all the derivations of the expectation and variance formulae, for the sake of readability.
2 In this formulation, we take the errors to be those of the Linear Regression model. In other words, the Weighted Ridge estimator is the same as the Weighted Regression estimator.
We may not use the errors of a Ridge model to calculate an unbiased estimator of the error variance, because the Ridge regularisation term decreases the variance: using such errors would cause the estimator to be biased towards 0.
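The equivalence stated in the note can be checked numerically: fitting a Weighted Ridge with its penalty set to zero coincides with Weighted Least Squares. Below is a minimal sketch of this check using scikit-learn; the toy data and the locality weights `w` are hypothetical placeholders, not taken from the paper's case study.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge

# Hypothetical toy data and locality weights (standing in for LIME's kernel weights)
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
y = X @ np.array([1.5, -2.0, 0.5]) + rng.normal(scale=0.1, size=50)
w = rng.uniform(0.1, 1.0, size=50)

# Weighted Ridge with penalty alpha = 0 ...
ridge = Ridge(alpha=0.0).fit(X, y, sample_weight=w)
# ... versus plain Weighted Least Squares
wls = LinearRegression().fit(X, y, sample_weight=w)

print(np.allclose(ridge.coef_, wls.coef_, atol=1e-6))
```

With the penalty removed, the two weighted objectives are identical, so the fitted coefficients agree up to numerical tolerance.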