Abstract
This paper develops a method for handling two-class classification problems with highly unbalanced class sizes and misclassification costs. When the class sizes are highly unbalanced and the minority class represents a rare event, conventional classification methods tend to strongly favour the majority class, resulting in very low detection of the minority class. A method is proposed to determine the optimal cut-off for asymmetric misclassification costs and for unbalanced class sizes. Monte Carlo simulations show that this proposal performs better than the method based on the notion of classification accuracy. Finally, the proposed method is applied to empirical data on Italian small and medium enterprises to classify them into default and non-default groups.
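As a rough illustration of cost-based cut-off selection, the sketch below picks the cut-off that minimises the average misclassification cost on simulated scores and compares it with the cut-off that maximises plain accuracy. It is not the paper's procedure. The function names, the score distribution, the 10:1 cost ratio and the cut-off grid are all hypothetical assumptions; only the 5% minority share echoes the default rate quoted in the notes.

```python
import numpy as np

def average_cost(scores, labels, cutoff, cost_fn, cost_fp):
    """Average misclassification cost when score >= cutoff is classified as class 1."""
    predicted = scores >= cutoff
    false_neg = np.sum(~predicted & (labels == 1))  # missed minority cases
    false_pos = np.sum(predicted & (labels == 0))   # majority cases flagged in error
    return (cost_fn * false_neg + cost_fp * false_pos) / len(labels)

def best_cutoff(scores, labels, cost_fn, cost_fp, grid=None):
    """Grid search for the cut-off with the lowest average cost (hypothetical helper)."""
    if grid is None:
        grid = np.linspace(0.0, 1.0, 101)
    costs = np.array([average_cost(scores, labels, c, cost_fn, cost_fp) for c in grid])
    return float(grid[np.argmin(costs)]), float(costs.min())

# Simulated data (assumption): 5% minority class, scores shifted upward for that class.
rng = np.random.default_rng(0)
n = 10_000
labels = (rng.random(n) < 0.05).astype(int)
scores = np.clip(rng.normal(0.15 + 0.25 * labels, 0.12), 0.0, 1.0)

# Asymmetric costs (assumption): a missed minority case costs 10 times a false alarm.
cutoff_cost, cost_at_cost_cutoff = best_cutoff(scores, labels, cost_fn=10.0, cost_fp=1.0)

# Equal costs reduce the criterion to the error rate, i.e. accuracy maximisation.
cutoff_acc, _ = best_cutoff(scores, labels, cost_fn=1.0, cost_fp=1.0)
cost_at_acc_cutoff = average_cost(scores, labels, cutoff_acc, cost_fn=10.0, cost_fp=1.0)

print(f"cost-based cut-off:     {cutoff_cost:.2f}, average cost {cost_at_cost_cutoff:.4f}")
print(f"accuracy-based cut-off: {cutoff_acc:.2f}, average cost {cost_at_acc_cutoff:.4f}")
```

Because both cut-offs are searched over the same grid, the cost-based choice can never incur a higher average cost than the accuracy-based one under the assumed cost ratio. How large the gap is depends on the simulated scores and the costs chosen.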
Notes
1. The empirical section that follows analyses credit lending to Italian SMEs. The default rate for Italian SMEs is 5% [Citation12].
2. These data sets are generated with the R package ‘sn’.
3. In , the values are rounded off to the integer values.