ABSTRACT
Default risk, one of the main risk factors for bonds, should be measured and reflected in the bond yield. In particular, for financial companies that treat bonds as a major product, failure to properly identify and filter customers' workout status adversely affects returns. This study proposes a two-stage classification algorithm for workout prediction based on the historical data of individual customers, such as transaction details secured by financial companies after loans, collected over 10 years. In the first stage, feature selection ranks the variables by how closely they are related to the workout application. In the second stage, the top-ranked variables are input cumulatively (the first variable, then the first two, and so on up to the first n) to each machine learning method, so that each method generates n candidate classifiers. Among all candidates, the model with the highest classification accuracy is selected as the optimal one; in this study, it was Gradient Boost combined with F-score-based feature selection.
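The two-stage procedure can be illustrated with a minimal sketch under stated assumptions. The F-score is assumed here to be a Fisher-style ratio of between-class separation to within-class variance, accuracy is assumed to be measured on a held-out validation split, and a nearest-centroid classifier stands in for the machine learning methods so that the sketch needs no external libraries. It is not Gradient Boost, and it is not the study's implementation. All function names, the class name, and the toy data are hypothetical.

```python
from statistics import mean, variance


def f_score(values, labels):
    """Assumed Fisher-style F-score of one feature for binary labels (0/1)."""
    pos = [v for v, y in zip(values, labels) if y == 1]
    neg = [v for v, y in zip(values, labels) if y == 0]
    overall = mean(values)
    between = (mean(pos) - overall) ** 2 + (mean(neg) - overall) ** 2
    within = variance(pos) + variance(neg)  # sample variances per class
    return between / within if within > 0 else 0.0


def rank_features(X, y):
    """Stage 1: feature indices sorted from highest to lowest F-score."""
    n_features = len(X[0])
    scores = [f_score([row[j] for row in X], y) for j in range(n_features)]
    return sorted(range(n_features), key=lambda j: scores[j], reverse=True)


class NearestCentroid:
    """Hypothetical stand-in classifier (NOT Gradient Boost)."""

    def fit(self, X, y):
        self.centroids = {}
        for label in sorted(set(y)):
            rows = [r for r, t in zip(X, y) if t == label]
            self.centroids[label] = [mean(col) for col in zip(*rows)]
        return self

    def predict(self, X):
        def sq_dist(a, b):
            return sum((p - q) ** 2 for p, q in zip(a, b))
        return [min(self.centroids, key=lambda c: sq_dist(row, self.centroids[c]))
                for row in X]


def take(X, cols):
    """Keep only the given feature columns, in the given order."""
    return [[row[j] for j in cols] for row in X]


def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def select_model(X_train, y_train, X_val, y_val, methods):
    """Stage 2: each method is fitted on the top-1 .. top-n ranked features;
    the candidate with the highest validation accuracy is returned as
    (accuracy, method name, number of features, feature indices)."""
    ranking = rank_features(X_train, y_train)
    best = None
    for name, make_model in methods.items():
        for k in range(1, len(ranking) + 1):
            cols = ranking[:k]
            model = make_model().fit(take(X_train, cols), y_train)
            acc = accuracy(y_val, model.predict(take(X_val, cols)))
            if best is None or acc > best[0]:  # ties keep the earlier candidate
                best = (acc, name, k, cols)
    return best


# Toy data (hypothetical): feature 0 separates the classes, feature 1 does not.
X_train = [[1.0, 5.0], [1.2, 3.0], [0.8, 4.0], [3.0, 4.5], [3.2, 3.5], [2.8, 4.0]]
y_train = [0, 0, 0, 1, 1, 1]
X_val = [[0.9, 4.2], [3.1, 3.8], [1.1, 4.6], [2.9, 3.4]]
y_val = [0, 1, 0, 1]

ranking = rank_features(X_train, y_train)
best = select_model(X_train, y_train, X_val, y_val,
                    {"nearest_centroid": NearestCentroid})
```

With m machine learning methods in the `methods` dictionary, the loop evaluates m × n candidates. Swapping in a different ranking function for `rank_features` would give the other feature selection variants the same treatment.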
Author contribution
S.K.—research idea, formulation of research goals and objectives, guidance and consulting, examination of calculation results. Y.N.—analysis of literature, analysis of experimental data, validation of the model, draft and final copy of the manuscript. Y.Y.—research idea, analysis of experimental data, literature analysis. All authors have read and agreed to the published version of the manuscript.
Disclosure statement
No potential conflict of interest was reported by the author(s).
Supplementary data
Supplemental data for this article can be accessed online at https://doi.org/10.1080/15366367.2023.2246109.