General Paper

On the suitability of resampling techniques for the class imbalance problem in credit scoring

Pages 1060-1070 | Received 01 Nov 2011, Accepted 01 Aug 2012, Published online: 21 Dec 2017

