Abstract
We propose a new procedure for estimating the loss given default (LGD) distribution. Owing to the complicated shape of the LGD distribution, using a smooth density function as a driver to estimate it may result in a decline in model fit. To overcome this problem, we first apply logistic regression to estimate the LGD cumulative distribution function, and then convert the result into an estimate of the LGD distribution. To implement the newly proposed estimation procedure, we collect a sample of 5269 defaulted debts from Moody’s Default and Recovery Database. A performance study is conducted using 2000 pairs of in-sample and out-of-sample datasets of different sizes, randomly selected from the entire sample. Our results show that the newly proposed procedure performs better and more robustly than its alternatives, in the sense of yielding more accurate in-sample and out-of-sample LGD distribution estimates. It is thus useful for studying the LGD distribution.
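The two-step idea described in the abstract can be sketched as follows. This is a minimal illustration with simulated data, not the authors' implementation: the covariate, the threshold grid, and the gradient-descent logistic fit are all our own assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stylised defaulted-debt sample: one covariate x and LGD values in [0, 1],
# with point masses at the boundaries after clipping.
n = 1000
x = rng.normal(size=n)
lgd = np.clip(rng.beta(0.6, 0.6, size=n) + 0.1 * x, 0.0, 1.0)

def fit_logistic(X, y, steps=2000, lr=0.1):
    """Plain gradient-ascent logistic regression (illustrative only)."""
    beta = np.zeros(X.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        beta += lr * X.T @ (y - p) / len(y)
    return beta

# Step 1: estimate the LGD cumulative distribution function F(w) = P(LGD <= w)
# at a grid of thresholds w, by regressing the indicator 1{LGD <= w} on x.
X = np.column_stack([np.ones(n), x])
grid = np.linspace(0.0, 1.0, 11)
cdf_hat = []
for w in grid:
    beta = fit_logistic(X, (lgd <= w).astype(float))
    # CDF estimate at w for a debt with covariate x = 0
    cdf_hat.append(1.0 / (1.0 + np.exp(-beta[0])))
cdf_hat = np.maximum.accumulate(cdf_hat)  # enforce monotonicity
cdf_hat[-1] = 1.0                         # F(1) = 1 by construction

# Step 2: convert the CDF estimate into the LGD distribution estimate:
# probability masses are the first differences of the CDF over the grid,
# with the mass at 0 given by F(0).
pmf_hat = np.diff(np.concatenate([[0.0], cdf_hat]))

print(pmf_hat.sum())  # the estimated masses sum to 1
```

The conversion in Step 2 is why the procedure can reproduce the point masses at 0 and 1 that a smooth density driver tends to miss.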
Acknowledgements
The authors thank the reviewers for their valuable comments and suggestions, which have greatly improved the presentation of this paper. The brief description of our proposed procedure in the introduction was suggested by a reviewer. This research is supported by the Ministry of Science and Technology, Taiwan, Republic of China.
Disclosure statement
No potential conflict of interest was reported by the authors.
Notes
1 There are many other approaches for studying LGD. However, they suffer from various problems and are thus not suitable for estimating the LGD distribution. For example, the LGD distribution estimate produced by the inverse Gaussian regression (Qi and Zhao Citation2011) or the beta regression (Yashkir and Yashkir Citation2013) has zero probability masses at the boundaries 0 and 1. The Gaussian mixture model (Altman and Kalotay Citation2014) concentrates a cluster of transformed LGD values at a large negative or positive value, so it is difficult to model the LGD distribution with a Gaussian mixture without facing distributional degeneracy. The ordered logistic regression (Li et al. Citation2014) and the ordered probit model (Hwang et al. Citation2016) suffer when some partition cells have small or zero sizes, which makes the resulting parameter estimates less precise. Finally, the fractional response regression, regression tree, neural network, support vector machine and ensemble model impose no distributional assumption on the LGD data at all (Bastos Citation2010, Citation2014, Loterman et al. Citation2012, Hartmann-Wendels et al. Citation2014).
2 Other link functions, such as the probit and the complementary log–log link functions, can be used in the same way to model the probability distribution of Z(w), for each w ∊ [0, 1]. The results based on these link functions are likely to lead to similar insights.
3 There are two other approaches that can be used to produce the LGD density estimate for y ∊ (0, 1). First, we replace the polygon with the histogram. Using the performance metrics in Section 3.2, the histogram has performance similar to that of the polygon. Second, we apply the kernel estimation method of Wei and Chu (Citation1994) to the grid-point data, for j = 1, … , q − 1, to produce a smooth version of the density estimate. However, this approach carries a heavy computational burden. For these reasons, we use the polygon to present the LGD density estimate for y ∊ (0, 1).
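The presentational choice in this note (histogram bar heights versus a frequency polygon through the bin midpoints) can be sketched as follows; the sample, the number of categories and the variable names are our own illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
lgd = rng.beta(0.5, 0.5, size=5000)   # stylised LGD values in (0, 1)

q = 10                                 # number of categories (cf. note 5)
edges = np.linspace(0.0, 1.0, q + 1)
counts, _ = np.histogram(lgd, bins=edges)

# Histogram density estimate: constant height on each bin,
# normalised so the area over [0, 1] is one.
width = 1.0 / q
hist_density = counts / (counts.sum() * width)

# Frequency polygon: the same heights, evaluated at the bin midpoints
# and joined linearly, giving a piecewise-linear density presentation.
midpoints = (edges[:-1] + edges[1:]) / 2
polygon = np.column_stack([midpoints, hist_density])

print(np.sum(hist_density * width))   # histogram density integrates to 1
```

Both presentations use the same binned information; the polygon merely replaces the step function with a piecewise-linear curve, which is consistent with the two having similar performance under the metrics of Section 3.2.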
4 Through a straightforward calculation, RMSE²_RWMSD,out = R̄² + s², where R̄ and s² are the average and variance of the given quantities RWMSDout,k, for k = 1, … , m. The same remark also applies to RMSEWMAD,out, RMSERWMSD,in and RMSEWMAD,in. By this result, the metric RMSE combines the average and variance of the given performance measures. Thus, it is useful for measuring the performance of an estimation method over multiple samples.
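The decomposition in this note follows from the identity mean(x²) = mean(x)² + var(x) (with the population variance), which a short numerical check illustrates; the values below are arbitrary, not from the paper's experiments.

```python
import numpy as np

# Arbitrary per-sample performance measures RWMSD_out,k, for k = 1, ..., m
rwmsd = np.array([0.12, 0.08, 0.15, 0.10, 0.09])

rmse = np.sqrt(np.mean(rwmsd ** 2))   # RMSE over the m samples
avg = rwmsd.mean()                    # average of the measures
var = rwmsd.var()                     # population variance (ddof = 0)

# RMSE^2 = average^2 + variance: the metric penalises both a large mean
# error and a large spread of errors across samples.
print(np.isclose(rmse ** 2, avg ** 2 + var))  # True
```

This is why a method can be judged "better and more robust" with a single number: a lower RMSE requires both a small average error and a small variance across the 2000 sampled datasets.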
5 For presenting the LGD frequency distribution, the number of categories q = 10 has been used in Bastos (Citation2010, Citation2014), Qi and Zhao (Citation2011) and Altman and Kalotay (Citation2014), q = 20 in Sigrist and Stahel (Citation2011), Yashkir and Yashkir (Citation2013) and Calabrese (Citation2014) and q = 50 in Oliveira et al. (Citation2015).
6 This recovery rate truncation approach has been used in Chava et al. (Citation2011), Qi and Zhao (Citation2011), Yashkir and Yashkir (Citation2013) and Altman and Kalotay (Citation2014).