Research Papers

A logistic regression point of view toward loss given default distribution estimation

Pages 419-435 | Received 10 Oct 2016, Accepted 17 Mar 2017, Published online: 18 Jul 2017
 

Abstract

We propose a new procedure for estimating the loss given default (LGD) distribution. Because the LGD distribution has a complicated shape, estimating it with a smooth density function can reduce model fit. To overcome this problem, we first apply logistic regression to estimate the LGD cumulative distribution function and then convert the result into an estimate of the LGD distribution. To implement the proposed procedure, we collect a sample of 5269 defaulted debts from Moody’s Default and Recovery Database. We assess performance using 2000 pairs of in-sample and out-of-sample data-sets of different sizes, randomly selected from the entire sample. Our results show that the proposed procedure performs better and more robustly than its alternatives, in the sense that it yields more accurate in-sample and out-of-sample LGD distribution estimates. Thus, it is useful for studying the LGD distribution.
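The two-step idea in the abstract (estimate the LGD cumulative distribution function by logistic regression, then convert it into a distribution estimate) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a single covariate scaled to roughly [0, 1], q equal-width cells on [0, 1], a plain gradient-ascent fit, and a simple monotonicity fix. The names `fit_logistic` and `lgd_cell_probs` are hypothetical.

```python
import math


def sigmoid(t):
    """Numerically stable logistic function."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def fit_logistic(xs, zs, steps=1000, lr=0.5):
    """Fit P(z = 1 | x) = sigmoid(a + b * x) by gradient ascent
    on the mean log-likelihood (assumes x is roughly in [0, 1])."""
    a = b = 0.0
    n = len(xs)
    for _ in range(steps):
        grad_a = grad_b = 0.0
        for x, z in zip(xs, zs):
            resid = z - sigmoid(a + b * x)
            grad_a += resid
            grad_b += resid * x
        a += lr * grad_a / n
        b += lr * grad_b / n
    return a, b


def lgd_cell_probs(xs, lgds, x_new, q=10):
    """Estimate P(LGD falls in cell j | x_new) for q equal-width cells.

    Step 1: for each threshold w_j = j / q, j = 1, ..., q - 1, regress the
            indicator Z(w_j) = 1{LGD <= w_j} on x to estimate the CDF at w_j.
    Step 2: difference the estimated CDF to obtain cell probabilities.
    """
    cdf = []
    for j in range(1, q):
        w = j / q
        zs = [1.0 if y <= w else 0.0 for y in lgds]
        a, b = fit_logistic(xs, zs)
        cdf.append(sigmoid(a + b * x_new))
    # Assumption of this sketch: separately fitted CDF values may cross,
    # so force them to be non-decreasing before differencing.
    for j in range(1, len(cdf)):
        cdf[j] = max(cdf[j], cdf[j - 1])
    cdf = [0.0] + cdf + [1.0]
    return [cdf[j + 1] - cdf[j] for j in range(q)]
```

In this sketch the first and last cells absorb any probability mass at the boundaries 0 and 1. A fuller treatment would report those point masses separately.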

JEL Classification:

Acknowledgements

The authors thank the reviewers for their valuable comments and suggestions, which have greatly improved the presentation of this paper. The brief description of our proposed procedure given in the introduction section was provided by one of the reviewers. This research was supported by the Ministry of Science and Technology, Taiwan, Republic of China.

Disclosure statement

No potential conflict of interest was reported by the authors.

Notes

1 There are many other approaches to studying LGD. However, each suffers from problems that make it unsuitable for estimating the LGD distribution. For example, the LGD distribution estimate produced by inverse Gaussian regression (Qi and Zhao 2011) or beta regression (Yashkir and Yashkir 2013) has zero probability mass at the boundaries 0 and 1. The Gaussian mixture model (Altman and Kalotay 2014) produces a cluster of transformed LGD values at a large negative or positive value. Thus, it is difficult to model the LGD distribution using a Gaussian mixture without facing distributional degeneracy. The ordered logistic regression (Li et al. 2014) and the ordered probit model (Hwang et al. 2016) suffer when some partition cells have small or zero sizes, which makes the resulting parameter estimates less precise. Finally, fractional response regression, regression trees, neural networks, support vector machines and ensemble models impose no distributional assumption on LGD data at all (Bastos 2010, 2014, Loterman et al. 2012, Hartmann-Wendels et al. 2014).

2 Other link functions, such as the probit and complementary log–log link functions, can similarly be used to model the probability distribution of Z(w) for each w ∊ [0, 1]. Results based on these link functions may lead to similar insights.

3 There are two other approaches that can be used to produce for y ∊ (0, 1). First, we replace the polygon with the histogram. Under the performance metrics in Section 3.2, the histogram performs similarly to the polygon. Second, we apply the kernel estimation method of Wei and Chu (1994) to the data for j = 1, … , q − 1, and to produce a smooth version of But this approach suffers from a heavy computational burden. Given these considerations, we use the polygon to present for y ∊ (0, 1).

4 Through a straightforward calculation, where and are the average and variance of the given quantities RWMSDout,k, for k = 1, … , m. The same remark also applies to RMSEWMAD,out, RMSERWMSD,in and RMSEWMAD,in. By this result, the metric RMSE combines the average and variance of the given performance measures together. Thus, it is useful for measuring the performance of an estimation method over multiple samples.

5 To present the LGD frequency distribution, q = 10 categories have been used in Bastos (2010, 2014), Qi and Zhao (2011) and Altman and Kalotay (2014), q = 20 in Sigrist and Stahel (2011), Yashkir and Yashkir (2013) and Calabrese (2014), and q = 50 in Oliveira et al. (2015).

6 This recovery rate truncation approach has been used in Chava et al. (2011), Qi and Zhao (2011), Yashkir and Yashkir (2013) and Altman and Kalotay (2014).
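Note 4 states that the RMSE metric combines the average and variance of the performance measures. The standard identity behind a statement of this kind can be sketched as follows. It assumes that the RMSE is the square root of the mean of the squared values RWMSDout,k over k = 1, … , m, and that the variance uses the divisor m. The symbols \bar{a} and s^2 are placeholders introduced here for illustration, not the paper's notation.

```latex
% Assumed definitions (not taken from the paper):
%   a_k = RWMSD_{out,k},  k = 1, ..., m
%   \bar{a} = average of the a_k,  s^2 = variance of the a_k with divisor m
\[
  \bar{a} = \frac{1}{m}\sum_{k=1}^{m} a_k ,
  \qquad
  s^2 = \frac{1}{m}\sum_{k=1}^{m} \left(a_k - \bar{a}\right)^2 .
\]
% Expanding the square in s^2 gives s^2 = (1/m) \sum a_k^2 - \bar{a}^2, hence
\[
  \mathrm{RMSE}^2
  = \frac{1}{m}\sum_{k=1}^{m} a_k^2
  = \bar{a}^2 + s^2 .
\]
```

Under these assumptions, a method scores a low RMSE only if its performance measure is both small on average and stable across the m samples.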
