Abstract
In predictive data mining, algorithms are both optimized and compared using a measure of predictive performance. Different measures yield different results, so it is crucial to match the measure to the true objectives. In this paper, we explore the desirable characteristics of measures for constructing and evaluating tools for mining plastic card data to detect fraud. We define two measures: one based on minimizing the overall cost to the card company, and the other based on minimizing the amount of fraud given the maximum number of investigations the card company can afford to make. We also describe a plot, analogous to the standard ROC curve, for displaying the performance trace of an algorithm as the relative costs of the two kinds of misclassification (classing a fraudulent transaction as legitimate, or vice versa) are varied.
Acknowledgements
The work of Piotr Juszczak and Dave Weston described here was supported by the EPSRC under grant number EP/C532589/1: ThinkCrime: Statistical and machine learning tools for plastic card and other personal fraud detection. The work of Chris Whitrow was supported by a grant from the Institute for Mathematical Sciences. The work of David Hand was partially supported by a Royal Society Wolfson Research Merit Award. We would like to express our appreciation to Abbey Plc for supporting the ThinkCrime project by supplying the data used in the illustration. We are grateful to the anonymous referees for their constructive comments on an earlier draft of this paper.