Research Article

Combination of linear discriminant analysis and expert opinion for the construction of credit rating models: The case of SMEs

Mohamed Habachi & Saâd Benbachir
Article: 1685926 | Received 22 Mar 2019, Accepted 24 Oct 2019, Published online: 14 Nov 2019

Abstract

The construction of an internal rating model is the main task for a bank under the IRB-foundation approach, since the bank must determine the probability of default by rating class. Several statistical approaches can be used for this purpose, such as logistic regression and linear discriminant analysis, to express the relationship between default and the financial, managerial and organizational characteristics of the enterprise. In this paper, we propose a new approach that combines linear discriminant analysis and expert opinion using the Bayesian approach. We first build a rating model based on linear discriminant analysis, and we then use Bayesian logic to determine the posterior probability of default by rating class. The reliability of the experts' estimates depends on the information collection process. We have therefore defined an information collection approach, based on the Delphi method, that reduces the imprecision of the estimates. The empirical study uses a portfolio of SMEs from a Moroccan bank. This permitted the construction of the statistical rating model and the associated Bayesian models, and the comparison of the capital requirements determined by these models.

JEL classification:

PUBLIC INTEREST STATEMENT

Customer rating is an important tool for determining credit pricing. Indeed, the bank must construct a model capable of determining the real profile of the customer. This article proposes an approach to the conception of a rating model that integrates the quantitative and qualitative data of the company, which can be of great utility for professionals, credit risk researchers, students and academics.

It also meets the needs of portfolio managers by combining their opinion with the statistical estimation.

The method we have proposed to combine statistical estimation and expert opinion can be used in other risk management areas such as operational and market risk.

1. Introduction

Internal credit risk rating models are based on the modelling of the three risk components, which are the probability of default (PD), the loss given default (LGD) and the exposure at default (EAD). As a result, the bank must estimate these three components for each customer exposure.

To model the probability of default (PD), a multitude of techniques can be used, such as linear discriminant analysis (LDA), intelligence techniques (neural networks and genetic algorithms), Bayesian networks and probabilistic models.

These techniques are based on different logics and have been the subject of numerous studies conducted by academics and professionals, such as:

–Multidimensional linear discriminant analysis

The prediction of default by linear discriminant analysis was developed by Altman (Citation1968), Taffler (Citation1982), Bardos (Citation1998), Bell et al. (Citation1990) and Grice and Ingram (Citation2001).

–Intelligence techniques

Several studies have applied these techniques to predict corporate default, such as those conducted by Bell, Ribar, and Verchio (Citation1990), Liang and Wu (Citation2003), Bose and Pal (Citation2006), Back et al. (Citation1996) and Oreski et al. (Citation2012).

–Bayesian Network

The Bayesian classifier (Friedman, Geiger, & Goldszmidt, Citation1997) is based on the calculation of a posterior probability. The opportunities of using probabilistic Bayesian networks in fundamental financial analysis were studied by Gemela (Citation2001). Das, Fan, and Geng (Citation2002) studied the changes in PDs related to changes in ratings, using a modified Bayesian model to calibrate historical time series of probability-of-default changes to historical rating transition matrices. Dwyer (Citation2006) used the Bayesian approach to propose techniques that facilitate probability-of-default assessment in the absence of sufficient historical default data. Gôssl (Citation2005) introduced a new Bayesian approach to the credit portfolio and deduced, within a Bayesian framework, the posterior law of the probabilities of default and correlation. Finally, Tasche (Citation2013) described how to implement uninformed and conservative Bayesian estimators in the dependent one- and multi-period default data cases and compared their estimates with upper confidence bound estimates.

–Probabilistic models

Several studies have applied these techniques to predict corporate default, such as those conducted by Ohlson (Citation1980), Hunter and Isachenkova (Citation2002), Hensher, Jones, William, and Greene (Citation2007), Zmijewski (Citation1984), Grover and Lavin (Citation2001), Bunn and Redwood (Citation2003) and Benbachir and Habachi (Citation2018).

Our study differs from previous research in that we first treat the conception of the rating model of the credit portfolio based on multidimensional linear discriminant analysis (LDA), which permits us to determine the probability of default by class. We then establish a mathematical passage that combines the probability of default from the statistical model with that estimated by the experts using Bayesian logic, and we develop an information-gathering approach based on the Delphi method to ensure the reliability of the estimates. As a result, our approach tends to be more practical than theoretical and may be of interest to professionals in the field of credit risk management. In summary, this article proposes a practical approach, with a solid theoretical basis, for combining the probability of default emanating from linear discriminant analysis with that emanating from the experts, using Bayesian logic and the Delphi method.

The rest of this paper is organized as follows. Section 2 is devoted to credit risk measurement: we first define the credit risk situation, then the approaches to credit risk measurement and the expected and unexpected credit losses. The third section is devoted to the modelling of the probability of default and the construction of the statistical rating model and the associated Bayesian models. The fourth section is devoted to the empirical study.

2. Credit risk measurement

The credit risk situation is composed of the following elements:

  • Probability of default (PD): The probability that a counterparty defaults within a one-year horizon.

  • Loss given default (LGD): The share of the exposure, expressed as a percentage, that a bank loses when a borrower defaults on a credit.

  • Exposure at default (EAD): The total value to which a bank is exposed when a credit defaults.

  • Maturity (M): The effective maturity of the credit.

2.1. The IRB-foundation approach

The Basel Committee on Banking Supervision (Citation1999, Citation2006, Citation2016) provides for three risk measurement approaches: the standard approach, the internal rating-based foundation approach (IRBF) and the internal rating-based advanced approach (IRBA). In our study, we will measure the risk according to the IRBF approach. Under this approach, the bank must model the probability of default, while the estimates of loss given default, exposure at default and maturity are provided by the Basel accords.

Indeed, for the loss given default (LGD) we use the standard estimate provided under the IRBF approach, which is equal to 45%, while for the exposure at default we proceed as follows.

Let $V_{efi}$ be the amount of the financing authorization granted by the bank to the customer. The exposure at default EAD is defined as the sum of:

  • The value accounted for in the balance sheet ($VCB_0$).

  • The value of the unused funding commitment, accounted for off-balance sheet ($VCHB_0$), multiplied by a credit conversion factor (CCF). The standard estimate of the CCF under the IRBF approach is equal to 75%.

The mathematical formulation of the EAD is given by the following relationship (a numerical sketch follows):

$EAD = VCB_0 + CCF \times VCHB_0 = VCB_0 + 0,75\,(V_{efi} - VCB_0)$
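A minimal sketch of this EAD calculation, assuming illustrative values for the authorization ($V_{efi}$) and the on-balance-sheet value ($VCB_0$); the 75% CCF is the standard IRBF estimate cited above.

```python
# Minimal sketch of the EAD calculation under the IRBF approach; the input
# amounts (authorization v_efi and balance-sheet value vcb0) are illustrative.

def exposure_at_default(vcb0: float, v_efi: float, ccf: float = 0.75) -> float:
    """EAD = VCB0 + CCF * (Vefi - VCB0), the unused commitment being Vefi - VCB0."""
    return vcb0 + ccf * (v_efi - vcb0)

print(exposure_at_default(vcb0=600_000, v_efi=1_000_000))  # 900000.0
```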

2.2. The expected loss (EL)

The Basel Committee (Citation2015) has established the provision for the expected loss. Indeed, the amount of the expected loss is equal to the product of the three components PD, LGD and EAD:

$EL_M = PD \times LGD \times EAD$ (1)

2.3. The unexpected loss (UL)

The unexpected loss (UL) and the risk-weighted assets (RWA) are defined by the Basel Accords as follows:

$UL = K \times EAD$ and $RWA = K \times EAD \times 12,5$ (2)

The parameter K, called «capital requirement», represents the weighting function calculated according to the PD, the LGD, the correlation R and the effective maturity M.

In this paper, we will calculate the risk-weighted assets of a portfolio of SMEs. Indeed, the Basel Committee defines the parameter K relating to this segment as follows (a numerical sketch follows the list below):

  • Capital requirement (K):

$K = LGD \times \left[N\left(\dfrac{G(PD) + \sqrt{R}\,G(0,999)}{\sqrt{1-R}}\right) - PD\right] \times \dfrac{1 + (M - 2,5)\,b}{1 - 1,5\,b}$ (3)

with

  • Maturity adjustment (b): $b = (0,11852 - 0,05478\,\ln(PD))^2$

  • $G(.) = N^{-1}(.)$ and $N(.)$ is the cumulative distribution function of a random variable $N(0,1)$.

  • The correlation R is determined by the following modelFootnote1:

$R = 0,12\,\dfrac{1 - e^{-50\,PD}}{1 - e^{-50}} + 0,24\left(1 - \dfrac{1 - e^{-50\,PD}}{1 - e^{-50}}\right) - 0,04\left(1 - \dfrac{S - 5}{45}\right)$ (4)
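The following sketch puts formulas (2), (3) and (4) together, assuming SciPy is available for the normal CDF N(.) and its inverse G(.); the input values (PD, sales, EAD) are illustrative, not taken from the empirical study.

```python
# A sketch of the SME capital-requirement formulas (2)-(4); S is the annual
# sales in millions of euros, bounded between 5 and 50 by the accord.
from math import exp, log, sqrt
from scipy.stats import norm

def sme_correlation(pd: float, sales_mln: float) -> float:
    w = (1 - exp(-50 * pd)) / (1 - exp(-50))
    s = min(max(sales_mln, 5.0), 50.0)
    return 0.12 * w + 0.24 * (1 - w) - 0.04 * (1 - (s - 5) / 45)

def capital_requirement(pd: float, lgd: float, m: float, sales_mln: float) -> float:
    r = sme_correlation(pd, sales_mln)
    b = (0.11852 - 0.05478 * log(pd)) ** 2          # maturity adjustment
    k = lgd * (norm.cdf((norm.ppf(pd) + sqrt(r) * norm.ppf(0.999)) / sqrt(1 - r)) - pd)
    return k * (1 + (m - 2.5) * b) / (1 - 1.5 * b)

pd_, lgd, ead = 0.02, 0.45, 900_000                 # illustrative inputs
k = capital_requirement(pd_, lgd, m=2.5, sales_mln=20)
print("UL =", k * ead, "RWA =", k * ead * 12.5)     # formula (2)
```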

3. The modelling of probability of default

3.1. Constitution and treatment of the database

3.1.1. The constitution of the database (definition of variables)

In our study, we were able to identify 16 quantitative and 19 qualitative variables. The choice of variables is based on current financial analysis practices and the likely impact on business failure.

We present below the selected variables, the explanation and meaning of which are detailed in Appendix 1 (Table A1, Table A2).

–The quantitative variables

The quantitative variables $V_j$, $j = 1,\dots,16$, are divided into six classes, defined in Table 1.

–The qualitative variables

The qualitative variables $q_m$, $m = 1,\dots,19$, are grouped by theme $T_k$, $k = 1,\dots,6$, in Table 2.

3.1.2. Discretization of qualitative variables and their transformation into a score

For the discretization of the variables we will use the approach proposed in Benbachir and Habachi (Citation2018) which is as follows:

–Discretization of qualitative variables

The qualitative variables $q_m$, $1 \le m \le 19$, are discretized into modalities. The number of modalities can be equal to 3 or 5. The choice of the modalities is based on the logical relationship between the modalities and default.

–Transformation of the qualitative variables into scores

Let $M_{q_m,l}$, $l = 1,\dots,l_{q_m}$, be the modalities of the qualitative variable $q_m$, where $l_{q_m}$ denotes the number of modalities ($l_{q_m} \in \{3, 5\}$). For each modality, the score varies between 0 and 100 points, with a jump of 50 points per modality for three-modality variables and a jump of 25 points for five-modality variables (see the sketch after the examples below). The scores taken by the modalities are:

  • Variables with three modalities: [0, 50, 100]

Example: the modalities relating to the sector default rate are: 1—below average, 2—equal to average, 3—above average. In this case, the scores given are, respectively: 100, 50, 0.

  • Variables with five modalities: [0, 25, 50, 75, 100]

Example: the modalities relating to natural risk are: 1—No risk, 2—Low risk with an adequate crisis plan, 3—High risk with an adequate crisis plan, 4—Low risk without a crisis plan, 5—High risk without a crisis plan. In this case, the scores are, respectively: 100, 75, 50, 25, 0.

The assessment of the logical relationship between the modalities of each variable and the default is determined on the basis of expert opinion.
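A minimal sketch of this scoring rule, assuming the modalities of each qualitative variable have already been ordered by the experts from the most favourable to the least favourable with respect to default.

```python
# Sketch of the modality-to-score mapping: evenly spaced scores from 100
# (most favourable modality) down to 0 (least favourable).

def modality_scores(n_modalities: int) -> list[float]:
    step = 100 / (n_modalities - 1)                 # 50 for 3 modalities, 25 for 5
    return [100 - i * step for i in range(n_modalities)]

print(modality_scores(3))  # [100.0, 50.0, 0.0]
print(modality_scores(5))  # [100.0, 75.0, 50.0, 25.0, 0.0]
```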

3.2. Mathematical modelling of default

The default is modeled by a binary variable Y defined as follows:

$Y = \begin{cases} 1 & \text{if the company is healthy} \\ 0 & \text{if the company is in default} \end{cases}$ (5)

The relationship between the variable Y to be explained and the explanatory variables Vj and qm is determined by the linear discriminant analysis.

To determine the explanatory variables to be used for modelling, we will use a univariate analysis for each variable in the chosen list. Indeed, the objective of this analysis is to determine the relationship between Y and each of the quantitative and qualitative variables $V_j$ and $q_m$.

3.3. The linear discriminant analysis

Linear discriminant analysis provides a method for predicting the failure of an enterprise based on quantitative and qualitative discriminant variables.

In the case of the binary modelling given by formulation (5), the classification function (score function) relating to a vector of characteristics x is written:

$f(x) = (m_0 - m_1)^T\, S^{-1} \left(x - \dfrac{m_0 + m_1}{2}\right)$ (6)

with:

  • $m_0$: the mean point (centroid) of the group of failing companies.

  • $m_1$: the mean point (centroid) of the group of healthy firms.

  • $S$: the within-groups variance-covariance matrix.

If $f(x) > s$, the firm is classified as healthy; otherwise, it is classified as in default. The threshold s is determined by the model. The classification function $f(x)$ can be written:

$f(X_1, X_2, \dots, X_p) = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_p X_p$

where $X_i$, $i = 1,\dots,p$, are the quantitative and qualitative discriminating variables and $\beta_i$ are the discriminating coefficients (a fitting sketch follows the list below). The linear discriminant analysis is based on the following assumptions:

  • The discriminating variables should not be strongly correlated with one another.

  • The discriminating variables are drawn from a population with a Gaussian distribution.

  • The covariance matrices must be equal across groups.
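As a hedged illustration of how such a score function can be estimated in practice, the sketch below fits a linear discriminant function with scikit-learn on simulated data; the variable names and the data are placeholders, not the study's portfolio.

```python
# Sketch: fitting a linear discriminant score function on simulated data.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))                       # placeholder for (V_j, T_k)
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=200) > 0).astype(int)

lda = LinearDiscriminantAnalysis()
lda.fit(X, y)
print("coefficients:", lda.coef_)                   # the beta_i of f(X)
print("intercept:", lda.intercept_)                 # beta_0
print("classification rate:", lda.score(X, y))
```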

3.3.1. Choice of variables

The choice of the discriminant variables to be used is based on the univariate analysis. Indeed, a variable is retained as discriminant when the hypothesis of equality of its group means is rejected. The statistical test for equality of group means is presented in Table 3.

3.3.2. Testing of the significance of the coefficients

The validation of the multivariate model depends on the following significance tests:

–The Box’s M test (the groups covariance matrices are all equal)

The Box's M test is used to check whether two or more covariance matrices are equal (homogeneity of variances). The null hypothesis H0 is that «the groups' covariance matrices are all equal», and the test statistic is defined by:

$M = (n - 2)\,\ln|S| - \sum_{i=1}^{2}(n_i - 1)\,\ln|S_i|$

with:

  • $n = n_1 + n_2$: the sum of the populations of the two groups.

  • $S_i$ is the estimate of the covariance matrix of the variables in group (i), and S is the pooled matrix: $S = \dfrac{\sum_{i=1}^{2}(n_i - 1)\,S_i}{n - 2}$

The decision of the test depends on the size of the groups $n_i$ and on the number of discriminating variables, because the statistic follows approximately a chi-square or a Fisher distribution. If the p-value is less than 5%, H0 is rejected.

–Tests relating to the predictive capacity of the classification function (score function)

To test the predictive capacity of the classification function, we use Wilks' lambda. Indeed, the test statistic is defined in Table 4.

3.3.3. The confusion matrix

To ensure that the discriminant function provides a good classification of companies into subgroups, we use the confusion matrix defined in Table 5.

This matrix permits the determination of the capacity of the model to classify firms correctly. Indeed, this capacity is measured by the ratio $\dfrac{n_{10} + n_{21}}{n_1 + n_2}$.

This capacity is confirmed by the $Q_{presse}$ test defined in Giannelloni and Vernette (Citation2001). The hypothesis H0 is defined by «the equality of the number of individuals correctly classified by the discriminating function and by chance».

The test statistic is $Q_{presse} = \dfrac{(n - n_c\,k)^2}{n\,(k - 1)}$; for $k = 2$ we have $Q_{presse} = \dfrac{(n - 2\,n_c)^2}{n}$

with: n is the total number of companies, nc is the number of companies correctly classified and k is the number of groups.

The statistic $Q_{presse}$ follows a chi-square law (χ2) with 1 (one) degree of freedom. Indeed, if the p-value is less than 5%, H0 is rejected.
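A short sketch of the $Q_{presse}$ test for k = 2 groups; with n = 1447 and $n_c \approx$ 1356 (the 93,7% classification rate reported in Section 4), it reproduces the order of magnitude of the empirical statistic given there.

```python
# Sketch of Press's Q test: the statistic is compared with a chi-square at 1 df.
from scipy.stats import chi2

def press_q(n: int, n_correct: int, k: int = 2) -> tuple[float, float]:
    q = (n - n_correct * k) ** 2 / (n * (k - 1))
    return q, 1 - chi2.cdf(q, df=1)                 # statistic and p-value

q, p = press_q(n=1447, n_correct=1356)
print(q, p)  # the classification beats chance if p < 0.05
```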

3.3.4. The affectation threshold

The decision to assign a company to a group is based on the affectation threshold defined by the functions at group centroids. The separation of the groups is defined in Table 6.

The optimal separation point is the weighted mean of the values of α and β, $\dfrac{n_1\alpha + n_2\beta}{n_1 + n_2}$. However, if both groups are the same size ($n_1 = n_2$), the separation point is the arithmetic mean of α and β, $\dfrac{\alpha + \beta}{2}$.

3.3.5. Discriminatory power (power stat)

The discriminatory power represents the model's ability to predict future situations. We will use the ROC curve to determine the discriminatory power of the model. The ROC curve is determined from the classification table of the estimation sample of the variable Y, which is presented in Table 7.

We denote by sensitivity (SV) the proportion of healthy companies that are correctly classified, $SV = \dfrac{TH}{TH + FH}$, and by specificity (SP) the proportion of companies in default that are correctly classified, $SP = \dfrac{FD}{FD + TD}$.

If one varies the "probability threshold" from which a company is regarded as healthy, the sensitivity and the specificity vary. The curve of the points $(1 - SP, SV)$ is the ROC curve.

  • Definition of the area under the ROC curve (AUC) and the Accuracy ratio (AR)

    • The area under the ROC curve AUC

The area under the ROC curve (AUC) provides an overall measure of model fit (Bewick, Cheek, & Ball, Citation2004). The AUC varies from 0,5 (absence of predictive capacity for the model) to 1 (perfect predictive aptitude for the model).

  • Accuracy ratio (AR)

The accuracy ratio is defined by the relationship:

$AR = 2\,AUC - 1$ (7)

The AR takes values between 0 and 1.
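A minimal sketch of the AUC and AR computation with scikit-learn, assuming y_true holds the observed default indicator and score the model output; the ten observations are illustrative.

```python
# Sketch: AUC from the ROC curve, then the accuracy ratio of formula (7).
import numpy as np
from sklearn.metrics import roc_auc_score

y_true = np.array([1, 1, 0, 1, 0, 1, 0, 0, 1, 1])   # illustrative labels
score  = np.array([0.9, 0.8, 0.3, 0.7, 0.4, 0.65, 0.2, 0.55, 0.85, 0.6])

auc = roc_auc_score(y_true, score)
ar = 2 * auc - 1                                     # formula (7)
print(f"AUC = {auc:.3f}, AR = {ar:.3f}")
```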

–The determination of explanatory variables

To determine the explanatory variables to be retained for modelling, we will carry out a univariate linear discriminant analysis for each variable in the chosen list.

After selecting the explanatory variables on the basis of the decision rules mentioned above, we will study the correlation between the selected variables. The study of correlations makes it possible to eliminate strongly correlated variables. Indeed, if two or more variables have a correlation coefficient greater than 0,5 ($\rho \ge 0,5$), then the variable with the greatest AUC is selected.

–The performance of the multivariate model

The discriminating capacity of the multivariate model is considered acceptable if the AUC is greater than 70%.

3.3.6. The canonical discriminant function

The canonical discriminant function, presented in Klecka (Citation1980), is a linear combination of the discriminant variables. Indeed, it has the following mathematical form:

$f = u_0 + u_1 X_1 + u_2 X_2 + \dots + u_p X_p$ (8)

where $X_i$ are the discriminating variables and $u_i$ are the canonical coefficients.

The maximum number of canonical functions is equal to $\min(k - 1, p)$, where k is the number of classes and p is the number of discriminating variables.

The canonical coefficients are determined in such a way as to maximize the distance between the group centroids. The canonical discriminant functions can be used to predict the most probable class of membership of an unseen case.

The canonical discriminant analysis is detailed in Palm (Citation1999) and Klecka (Citation1980). Indeed, it is similar to principal component analysis in that it replaces the initial discriminating variables with uncorrelated canonical variables that are linear combinations of the initial variables.

3.4. The construction of the rating model

The conception of the rating model is based on the score function because the probability of default (PD) depends on the score attributed by the statistical model. Therefore, the conception process is as follows:

3.4.1. The determination of the score function by linear discriminant analysis

The modelling of default by linear discriminant analysis is done by the simultaneous treatment of quantitative and qualitative variables. Indeed, let $X_j$, $j = 1,\dots,r$, and $T_i$, $i = 1,\dots,6$, be, respectively, the quantitative and qualitative variables retained by the univariate linear discriminant analysis, denoted X and T.

The score function of the linear discriminant analysis is defined by the relationship:

$f(X_1, \dots, X_r, T_1, T_2, \dots, T_6) = \beta_0 + \beta_1 X_1 + \dots + \beta_r X_r + \beta_{r+1} T_1 + \dots + \beta_{r+6} T_6$

We denote the function $f(X_1, \dots, X_r, T_1, \dots, T_6)$ by $f(X, T)$.

3.4.2. Determination of the rating grid

The determination of the rating grid consists in determining the score interval for each class. Indeed, the standardized score is defined over an interval of 0 to 100 ([0,100]). This interval will be segmented into eight (8) classes to determine the rating classes.

3.4.3. The prediction of healthy firms by the linear discriminant analysis

The prediction of healthy firms by the linear discriminant analysis is based on the functions at group centroids defined in Table 6. Indeed, let $x_j$, $j = 1,\dots,r$, and $t_i$, $i = 1,\dots,6$, be the characteristics of the firm (i). This firm is considered healthy if:

$f(x_1, \dots, x_r, t_1, \dots, t_6) > \beta \Leftrightarrow \beta_0 + \beta_1 x_1 + \dots + \beta_r x_r + \beta_{r+1} t_1 + \dots + \beta_{r+6} t_6 > \beta$

Table 1. The list of quantitative variables

Table 2. The list of qualitative variables

Table 3. Univariate analysis and choice of discriminant variables

Table 4. Wilks’ lambda

Table 5. The Confusion matrix

Table 6. Functions at group centroids

Table 7. The classification table

Table 8. The rating grid

Table 9. The probability of default by class

Table 10. Rating of the scoring criteria for credit portfolio managers

Table 11. Weighting of credit portfolio managers as a function of the score

Table 12. The portfolio structure in terms of default

Table 13. (VCB0) and (Vefi) by rating class

Table 14. Non-discriminatory variables

Table 15. The quantitative and qualitative discriminating variables

Table 16. Analysis of the correlation of quantitative discriminant variables

Table 17. Results of the Box’s M test

Table 18. Wilks’ lambda

Table 19. The Confusion matrix

Table 20. The canonical correlation of variables

Table 21. The functions at group centroids

Table 22. Rating model based on linear discriminant analysis

Table 23. Explicit estimation of the probability of default by class according to experts

Table 24. Estimate of expected losses by class according to experts

Table 25. The probability of default of the experts retained for the modelling

Table 26. Bayesian rating models

Table 27. The unexpected loss according to the LDA model

Table 28. The Bayesian unexpected loss

The firms with a score function between α and β, defined in Table 6, overlap between the healthy and default classifications. Indeed, the classification in this case is based on the separation point. Therefore, the firm (i) is considered healthy if:

$f(X_i, T_i) = f(x_1^i, \dots, x_r^i, t_1^i, \dots, t_6^i) > \dfrac{n_1\alpha + n_2\beta}{n_1 + n_2} \Leftrightarrow f(X_i, T_i) - \dfrac{n_1\alpha + n_2\beta}{n_1 + n_2} > 0$

The classification function fcX,T of firms can be defined as follows:

$f_c(X, T) = f(X, T) - \dfrac{n_1\alpha + n_2\beta}{n_1 + n_2}$ (9)

$f_c(X, T) = \beta_0 + \beta_1 X_1 + \dots + \beta_r X_r + \beta_{r+1} T_1 + \dots + \beta_{r+6} T_6 - \dfrac{n_1\alpha + n_2\beta}{n_1 + n_2}$

3.4.4. The rating grid

The classification of healthy companies is based on the score function. Indeed, this classification gives rise to the rating grid composed of 8 rating classes.

Each company (i) is classified into a rating class; the classes vary between A and H and are defined in Table 8.

The score intervals are semi-open on the left to guarantee the independence of the rating classes. Indeed, the lower bound belongs to class "t" and the upper bound of the interval belongs to class "t+1", with the exception of class A, which has a closed interval.

3.4.5. Calculation of the rating score

The rating score SNi of the firm (i) is defined by:

$SN_i = \dfrac{S_i - \min(S)}{\max(S) - \min(S)} \times 100$ (10)

where $S_i$ is the score function value of the firm (i).

3.5. Calculation of the probability of default per rating class

The probability of default $PD_K$ of the class K is defined as the probability of default of the company (i) given that the company (i) belongs to the class K. The probability of default can be determined by the probabilistic approach or by empirical calculation.

3.5.1. Theoretical calculation of the default probability of the rating class (K)

In the framework of linear discriminant analysis, the probability of default $PD_K^i$ of the firm (i) is expressed by Gurný and Gurný (Citation2013) by the following formula:

$PD_K^i = \dfrac{1}{1 + \dfrac{1 - \pi_{Y=0}}{\pi_{Y=0}}\, e^{\,S_i - \alpha}}$

where $\pi_{Y=0}$ is the prior probability of default of the sample and α is expressed by the formula:

$\alpha = 0,5\,\gamma^T\,(\bar{X}_{Y=1} + \bar{X}_{Y=0})$

where

  • $\bar{X}_{Y=1}$ and $\bar{X}_{Y=0}$ are the vectors composed of the mean values of the independent variables of the linear discriminant analysis for the healthy and defaulting groups, respectively.

  • $\gamma^T$ is the transpose of the vector γ defined by:

$\gamma = \Sigma^{-1}\,(\bar{X}_{Y=1} - \bar{X}_{Y=0})$

where $\Sigma$ is the variance-covariance matrix of the discriminatory variables.

As a result, the probability of default $PD_K$ of the class K is determined by the following formula:

$PD_K = \dfrac{1}{n_k}\sum_{i=1}^{n_k} \dfrac{1}{1 + \dfrac{1 - \pi_{Y=0}}{\pi_{Y=0}}\, e^{\,S_i - \alpha}}$ (11)

3.5.2. Empirical calculation of the probability of default

The probability of default PDK can be calculated from empirical data. Indeed, it represents the proportion of firms at default belonging to the rating class K. As a result, it is calculated by the following formula:

$PD_K = \dfrac{\text{Number of firms in default belonging to the class K}}{\text{Total number of firms belonging to the class K}}$ (12)

To define this probability of default per rating class, we distribute the sample of healthy and defaulting companies as in Table 9.
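A sketch of the empirical calculation (12), assuming the portfolio is held in a pandas DataFrame with one row per firm, its rating class and a binary default flag; the data are illustrative.

```python
# Sketch: empirical PD per rating class as the mean of the default flag.
import pandas as pd

df = pd.DataFrame({
    "rating":  ["A", "A", "B", "B", "B", "C", "C", "C"],
    "default": [0, 0, 0, 1, 0, 1, 1, 0],
})
pd_k = df.groupby("rating")["default"].mean()  # defaults / class size, formula (12)
print(pd_k)
```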

3.6. Bayesian approach to the conception of rating models

The proposed Bayesian rating approach is composed of several steps defined as follows:

3.6.1. Definition of the Bayesian approach

Let Y be a random variable and θ its parameter. According to the Bayesian approach, the parameter θ is considered as a random variable with density π(θ).

The probability density function $f(Y, \theta)$ of the random vector is defined by:

$f(Y, \theta) = f(Y/\theta)\,\pi(\theta) = \pi(\theta/Y)\,f(Y)$ (13)

Where:

  • $\pi(\theta)$ is the probability density of the parameters, called the prior density;

  • $\pi(\theta/Y)$ is the density of the parameters given the data Y, the so-called posterior density;

  • $f(Y, \theta)$ is the joint density of the observed data and the parameters;

  • $f(Y/\theta)$ is the density of the observations for given parameters;

  • $f(Y)$ is the marginal density of Y, which can be written as $f(Y) = \int f(Y/\theta)\,\pi(\theta)\,d\theta$.

Bayes' theorem says that the posterior density can be calculated as follows:

$\pi(\theta/Y) = \dfrac{f(Y/\theta)\,\pi(\theta)}{f(Y)}$ (14)

Hence:

$\pi(\theta/Y) \propto f(Y/\theta)\,\pi(\theta)$ (15)

The posterior distribution $\pi(\theta/Y)$ can be viewed as the combination of the prior knowledge π(θ) with the likelihood function of the observed data $f(Y/\theta)$. Since $f(Y)$ is a normalisation constant, the posterior distribution is often written in the form (15), where ∝ is the symbol for proportionality.

Indeed, the Bayesian estimator for the univariate case is defined as follows:

$\hat{\theta}_{Bay} = E(\theta/Y) = \int \theta\,\pi(\theta/Y)\,d\theta = \dfrac{\int \theta\, f(Y/\theta)\,\pi(\theta)\,d\theta}{f(Y)}$ (16)

3.6.2. Calculation of the probability of default by the Bayesian approach

Let $Y_i$ be the random variable that models the failure of the company (i) and let θ be the random variable that models the parameter of $Y_i$. Indeed, the variable $Y_i$ is expressed by the relationship:

$Y_i = \begin{cases} 1 & \text{the company is in default, with a probability of } \theta \\ 0 & \text{the company is healthy, with a probability of } 1 - \theta \end{cases}$

Let $p(y_i/\theta)$ be the probability of $Y_i$ knowing θ; $p(y_i/\theta)$ is defined as follows:

$p(y_i/\theta) = \theta^{y_i}\,(1 - \theta)^{1 - y_i}$

Let $R_k$ be the number of companies in default in the class K and $n_k$ the number of companies that belong to the class K; $R_k$ is expressed by the following formula:

$R_k = \sum_{i=1}^{n_k} Y_i$

Let $p(R_k/\theta)$ be the probability of $R_k$ knowing θ; the variable $R_k$ is modelled by the binomial distribution. Indeed, the probability $p(R_k/\theta)$ is expressed by the following formula:

$p(R_k = R/\theta) = p(R/\theta) = \binom{n_k}{R}\,\theta^R\,(1 - \theta)^{n_k - R}$

3.6.3. Definition of the prior law π(θ) of θ

The prior law of the parameter θ of the binomial distribution is studied in Tasche (Citation2013). Indeed, the prior law for the distribution of θ that we will use is the Beta distribution with parameters (α, β), defined by the following formula:

$\pi(\theta) = \dfrac{\Gamma(\alpha + \beta)}{\Gamma(\alpha)\,\Gamma(\beta)}\,\theta^{\alpha - 1}\,(1 - \theta)^{\beta - 1}$ (17)

where Γ is the Euler gamma function, defined by:

$\Gamma(r) = \int_0^{+\infty} x^{r-1}\,e^{-x}\,dx$, with $\Gamma(r) = (r - 1)!$ if $r \in \mathbb{N}$

The parameters α and β are estimated by the experts; the mathematical expectation and the standard deviation of the Beta law are determined according to α and β as follows (a sketch of the inversion of these two formulas follows):

$E(\theta) = \dfrac{\alpha}{\alpha + \beta}$ and $\sigma(\theta) = \sqrt{\dfrac{\alpha\beta}{(\alpha + \beta)^2\,(1 + \alpha + \beta)}}$
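Since the experts provide $E(\theta)$ and $\sigma(\theta)$ rather than (α, β), the two moment formulas above can be inverted (method of moments). A sketch, with illustrative numerical values:

```python
# Sketch: recover (alpha, beta) from the experts' mean and standard deviation
# of theta by inverting E(theta) and sigma(theta) above.
def beta_params(mean: float, sigma: float) -> tuple[float, float]:
    nu = mean * (1 - mean) / sigma**2 - 1            # nu = alpha + beta
    return mean * nu, (1 - mean) * nu

alpha, beta = beta_params(mean=0.02, sigma=0.01)     # illustrative expert inputs
print(alpha, beta)                                   # prior Beta(3.9, 191.1)
```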

3.6.4. Determination of the posterior law π(θ/R) of θ

The posterior law conjugated with the prior law of θ is defined by formula (15) as follows:

$\pi(\theta/R) \propto p(R/\theta)\,\pi(\theta)$

Hence:

$\pi(\theta/R) \propto \binom{n_k}{R}\,\theta^R\,(1 - \theta)^{n_k - R}\,\dfrac{\Gamma(\alpha + \beta)}{\Gamma(\alpha)\,\Gamma(\beta)}\,\theta^{\alpha - 1}\,(1 - \theta)^{\beta - 1}$

$\pi(\theta/R) \propto \dfrac{n_k!}{(n_k - R)!\,R!}\,\dfrac{\Gamma(\alpha + \beta)}{\Gamma(\alpha)\,\Gamma(\beta)}\,\theta^{\alpha + R - 1}\,(1 - \theta)^{\beta + n_k - R - 1}$

Knowing that $\Gamma(x + 1) = x\,\Gamma(x)$, all the factors independent of θ can be grouped, together with the normalisation factor $\dfrac{\Gamma(\alpha + \beta + n_k)}{\Gamma(\alpha + R)\,\Gamma(\beta + n_k - R)}$, into a constant C independent of θ:

$\pi(\theta/R) \propto C\,\dfrac{\Gamma(\alpha + \beta + n_k)}{\Gamma(\alpha + R)\,\Gamma(\beta + n_k - R)}\,\theta^{\alpha + R - 1}\,(1 - \theta)^{\beta + n_k - R - 1}$

Thus:

$\pi(\theta/R) \propto \dfrac{\Gamma(\alpha + \beta + n_k)}{\Gamma(\alpha + R)\,\Gamma(\beta + n_k - R)}\,\theta^{\alpha + R - 1}\,(1 - \theta)^{\beta + n_k - R - 1}$ (18)

Therefore, $\pi(\theta/R)$ is a Beta distribution with parameters $(\alpha + R,\ \beta + n_k - R)$.

3.6.5. Determination of the Bayesian estimator of the parameter θ

The Bayesian estimator of the parameter θ is determined by formula (16) as follows:

$\hat{\theta}_{Bay} = E(\theta/R) = \dfrac{\alpha + R}{\alpha + \beta + n_k}$ (19)
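A one-line sketch of the conjugate update: the Beta(α, β) prior combined with R defaults observed among $n_k$ firms yields the posterior mean (19); the prior parameters reuse the illustrative values obtained above.

```python
# Sketch: posterior is Beta(alpha + R, beta + n_k - R); its mean is (19).
def bayesian_pd(alpha: float, beta: float, r: int, n_k: int) -> float:
    return (alpha + r) / (alpha + beta + n_k)

print(bayesian_pd(alpha=3.9, beta=191.1, r=5, n_k=200))  # posterior PD
```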

3.6.6. Determination of the Bayesian default probability by class K

The formula (19) can be written as follows:

$\hat{\theta}_{bay} = E(\theta/R) = \dfrac{\alpha}{\alpha + \beta + n_k} + \dfrac{R}{\alpha + \beta + n_k}$

$\hat{\theta}_{bay} = \dfrac{\alpha}{\alpha + \beta}\cdot\dfrac{\alpha + \beta}{\alpha + \beta + n_k} + \dfrac{R}{n_k}\cdot\dfrac{n_k}{\alpha + \beta + n_k}$

We set $\varepsilon = \dfrac{\alpha + \beta}{\alpha + \beta + n_k}$, so that $\dfrac{n_k}{\alpha + \beta + n_k} = 1 - \varepsilon$ and $\hat{\theta}_{bay}$ is written:

$\hat{\theta}_{bay} = \varepsilon\,\dfrac{\alpha}{\alpha + \beta} + (1 - \varepsilon)\,\dfrac{R}{n_k}$ (20)

with:

  • $E(\theta) = \alpha/(\alpha + \beta)$ is the prior probability of default defined by the experts ($PD_{e,K}$).

  • $R/n_k$ is the empirical probability of default defined by the statistical model ($PD_K$).

  • $\hat{\theta}_{bay}$ is the posterior probability of default ($PD_{a,K}$).

Formula (20) permits expressing the posterior probability of default as follows:

$PD_{a,K} = \varepsilon\,PD_{e,K} + (1 - \varepsilon)\,PD_K$ (21)

Formula (21) shows that the Bayesian probability of default is the weighted mean of the statistical probability of default and the probability of default of the experts, as the sketch below illustrates.
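A sketch of formula (21); the weights ε = 25% and ε = 50% are the two values used later in the empirical study, and the two input PDs are illustrative.

```python
# Sketch: posterior PD per class as a weighted mean of expert and statistical PDs.
def posterior_pd(pd_expert: float, pd_stat: float, epsilon: float) -> float:
    return epsilon * pd_expert + (1 - epsilon) * pd_stat  # formula (21)

for eps in (0.25, 0.50):
    print(eps, posterior_pd(pd_expert=0.030, pd_stat=0.025, epsilon=eps))
```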

3.7. The implementation of the Bayesian approach

The implementation of the Bayesian approach to determine the probability of default that we propose in this paper is based on the Delphi technique defined in Helmer (Citation1968) and Ayyub (Citation2001). Indeed, the information solicited from the experts concerns the probability of default by class $PD_{e,K}$ and the weighting of the expert opinion ε.

3.7.1. Estimation by experts of the probability of default by class PDe,K

The estimation by the experts of the probability of default by class $PD_{e,K}$ can be done in two different ways. Indeed, it can be determined implicitly or explicitly.

3.7.1.1. Explicit estimation of the probability of default

The explicit method consists in directly soliciting expert opinion for the default rate of each rating class, on the basis of the classification criteria of the statistical model, the risk profile and the score of each class.

In this case, the expert must furnish a probability of default per score class. Indeed, the expert must furnish the mean number of defaults for theoretical samples with sizes equal, respectively, to 100, 1 000 and 10 000 companies. The probability of default is equal to the number estimated by the expert divided by the corresponding sample size.

3.7.1.2. Implicit estimation of the probability of default

In this case, we implicitly deduce the probability of default of the experts using the expected loss in amount ($EL_M$) defined by formula (1). Indeed, the expert must estimate the amount of the expected loss in the form of a percentage of the credit volume of each class.

The probability of default $PD_{e,K}$ is calculated using the estimate of the loss given default (LGD) fixed by the IRBF approach. For the implicit estimate, the expert must estimate the mean loss per score interval. Indeed, for our study, the expert must pronounce on the mean loss for exposures of 100 000, 1 000 000 and 10 000 000 MAD.

The probability of default is determined by assuming that the expert furnishes the amount of the expected loss ($EL_M$), using the IRBF estimate of the loss given default (LGD) fixed at 45%, and considering that the exposures, with a credit conversion factor (CCF) of 75%, are, respectively, equal to 100 000, 1 000 000 and 10 000 000 MAD (a sketch follows).
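A sketch of this implicit elicitation, assuming the expert states an expected-loss amount for each reference exposure; the 1% expected-loss rate used here is illustrative.

```python
# Sketch: back out the expert PD from a stated expected loss with LGD = 45%.
def implicit_pd(expected_loss: float, ead: float, lgd: float = 0.45) -> float:
    return expected_loss / (lgd * ead)              # PD = EL / (LGD * EAD)

for ead in (100_000, 1_000_000, 10_000_000):        # reference exposures in MAD
    print(ead, implicit_pd(expected_loss=0.01 * ead, ead=ead))
```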

3.7.2. Estimation of the weighting of the expert opinion ε

The weighting of the expert opinion ε is carried out with the participation of the credit file counter-study function, the permanent control function and the internal audit function. In our study, we used two weighting values, namely 25% and 50%. Indeed, the selected experts must qualify for one of these two weights.

3.7.3. Definition of the interveners

The implicit and explicit estimation of the probability of default is made by the participation of the following interveners:

Credit portfolio managers (experts)

The portfolio managers must furnish an estimate of the number of defaults and the mean expected loss per rating class.

Internal auditors and permanent controllers (evaluator)

The internal audit and permanent control functions intervene in weighting the experts' opinions.

3.7.4. Choice of interveners

3.7.4.1. Choice of experts

For the elaboration of Bayesian models, we will use two values for the weighting of the expert opinion, which are 25% and 50%. Therefore, we need to identify the experts who can be weighted, respectively, at 25% and 50%.

To do this, we drew up a list of credit portfolio managers and examined their profiles; we then retained only those whose estimates can be weighted at 25% or 50%.

The score of the credit portfolio managers is determined with the hierarchical managers and confirmed with the audit function and the permanent control function, on the basis of an evaluation grid composed of the following elements:

  • Pertinent expertise, university education and professional experience.

  • The size of the portfolio managed and the rate of companies in difficulty.

  • Mastery of the process of control, recovery and financial management.

  • Excellent communication abilities, flexibility, impartiality and capacity to generalize and simplify.

The score must give a value in a grid of 10, 25, 50, 75. Indeed, each criterion must have a qualification among low, medium and high. To calculate the score, we assigned the ratings of Table 10 to these qualifications.

The score is equal to the sum of the ratings assigned to all criteria, and the weighting is defined according to the score in Table 11.

3.7.4.2. Choice of evaluator

The choice of evaluators at the level of permanent control and internal audit is based on expertise in credits. Therefore, we have defined the criteria for selection as follows:

  • Pertinent expertise, university education and professional experience.

  • Expertise in credit risk.

  • The number of control and audit missions of the credit activity.

3.7.5. The conduct of data collection from the experts

Once we had selected the experts and evaluators, we organized evaluation meetings in which the following elements were taken into consideration:

  • Definition of the objective of the study (modelling of probability of default, rating …)

  • The presentation of the portfolio of SMEs and their characteristics.

  • The default rate in the portfolio of SMEs

  • The need for information: number of defaults and mean loss per rating class.

  • Data collection from credit portfolio managers and validation with the hierarchy.

  • Selection and weighting of experts and presentation of criteria for weighting.

  • Choice of evaluators and presentation of the criteria for choice.

4. Empirical study

4.1. Description of the database

In this study, we used a database of 1 447 small and medium-sized companies from a Moroccan bank. The quantitative and qualitative information concerns 31-12-2017, and the situation of the companies (healthy or in default) is observed during 2018. In terms of default, the portfolio structure is presented in Table 12.

The financing authorizations ($V_{efi}$) and balance-sheet values ($VCB_0$) of this portfolio by rating class are detailed in Table 13.

4.2. Choice of quantitative and qualitative variables

4.2.1. Univariate discriminant analysis

The univariate linear discriminant analysis of the quantitative and qualitative variables permitted the determination of the discriminant variables. The choice of the quantitative variables is based on the Fisher ratio. Indeed, the non-discriminatory variables are presented in Table 14.

The Fisher ratio test for the quantitative and qualitative variables presented in the previous table shows that the p-value is greater than 0,05. Therefore, these variables are not discriminating and will not be retained for the multivariate linear discriminant analysis. On the contrary, the selected discriminant variables are presented in Table 15.

4.2.2. Analysis of the correlation

To eliminate the impact of the correlation of the variables on the prediction of default, Table 16 presents the correlation between the quantitative variables.

Analysis of the correlation of the discriminant variables shows that variables V8 and V9 are highly correlated because the correlation coefficient is equal to 0,6. Therefore, we only retain the variable V9 for modeling.

The choice to retain the variable V9 is based on the ROC curve. Indeed, the AUC of the variable V9 is greater than that of the variable V8. The results of the performance analysis of the two variables are shown in Figure 1.

Figure 1. Comparison of the performance of correlated variables V8 and V9.


4.2.3. Multivariate analysis and determination of the classification function

Let $X_j$, $1 \le j \le 12$, be the quantitative discriminating variables and $T_i$, $1 \le i \le 6$, the qualitative discriminating variables, with:

  • $(T_i, 1 \le i \le 6) = (q_1, q_6, q_8, q_{12}, q_{17}, q_{19})$.

  • $(X_j, 1 \le j \le 12) = (V_1, V_2, V_3, V_5, V_7, V_9, V_{10}, V_{12}, V_{13}, V_{14}, V_{15}, V_{16})$.

The multivariate analysis permits the definition of the classification functions f0 and f1 of the defaulting enterprises and the healthy enterprises, respectively, as follows.

4.2.3.1. The classification function of defaulting enterprises Y = 0

The classification function of defaulting enterprises is defined by:

$f_0 = -56,04 + 0,36X_1 + 0,01X_2 + 0,02X_3 + 0,04X_4 - 0,03X_5 - 0,01X_6 + 0,01X_7 + 0,1X_8 + 0,11X_9 + 0,02X_{10} + 0,06X_{11} + 0,06X_{12} + 0,05T_1 + 0,08T_2 + 0,07T_3 + 0,16T_4 + 0,55T_5 + 0,09T_6$

4.2.3.2. The classification function of healthy enterprises Y=1

The classification function of healthy enterprises is defined by:

$f_1 = -66,88 + 0,35X_1 + 0,01X_2 + 0,03X_3 + 0,05X_4 + 0,02X_5 + 0,002X_6 + 0,02X_7 + 0,1X_8 + 0,11X_9 + 0,03X_{10} + 0,06X_{11} + 0,06X_{12} + 0,05T_1 + 0,09T_2 + 0,06T_3 + 0,18T_4 + 0,71T_5 + 0,1T_6$

4.2.4. Testing of the significance

The results of the tests of significance defined in the third section are presented as:

The Box’s M test

The results of the test are presented in Table 17.

Since the calculated p-value is lower than the significance level α=0,05, the null hypothesis H0 must be rejected. This means that the model distinguishes between defaulting and healthy enterprises.

Tests relating to the predictive capacity of the score function (Wilks’ lambda)

The results of the test are presented in Table 18.

Since the calculated p-value is lower than the significance level α=0,05, the null hypothesis H0 must be rejected. This means that the model has an acceptable capacity to predict companies in default.

The Confusion matrix

The confusion matrix is presented in Table 19.

The capacity of the model to classify the companies correctly is of the order of 93,7%. This signifies that the model has an excellent capacity to correctly classify the enterprises.

Confirmation of this capacity is done using the $Q_{presse}$ test presented in the third section. The empirical value of the test statistic is:

$Q_{presse} = \dfrac{(n - n_c\,k)^2}{n\,(k - 1)} = 1105,89$

The critical value of the χ2 with 1 degree of freedom is equal to 3,84. Since $Q_{presse}$ is greater than the critical value, the null hypothesis H0 must be rejected. This confirms that the model has an excellent capacity to correctly classify the enterprises and that it offers a better classification than a random classification.

4.2.5. The performance of the model

The AUC of the model is equal to 0,798, which represents an acceptable performance. The results are represented in Figure 2.

Figure 2. Performance of the linear discriminant analysis model.


4.3. The canonical discriminant function

4.3.1. The canonical correlation

The canonical correlation of the variables is given in Table 20.

The canonical correlation of the variables is equal to 0,4430. This value does not permit a decision on the discriminating capacity of the model. Indeed, we will base the assessment of the discriminating capacity of the model on the results of the AUC.

4.3.2. The functions at group centroids

The functions at group centroids are presented in Table 21.

4.3.3. The canonical discriminant function

The number of canonical discriminant functions in the case of two groups is limited to a single function. In our study, this function is presented as follows:

$f_c = -8,031 - 0,008X_1 - 0,002X_2 + 0,003X_3 + 0,005X_4 + 0,001X_5 + 0,003X_6 + 0,001X_7 - 0,001X_8 + 0,001X_9 + 0,004X_{10} + 0,0002X_{11} - 0,001X_{12} + 0,003T_1 + 0,004T_2 - 0,003T_3 + 0,009T_4 + 0,086T_5 + 0,004T_6$

The separation point is equal to zero ($\frac{114 \times (-1,6880) + 1333 \times 0,1440}{1447} \approx 0$). Indeed, for the prediction of default, an enterprise is considered healthy if $f_c \ge 0$.

The canonical discriminant function shows that the variables most correlated with the score are the return on equity ($X_4 = \frac{\text{Net Profit}}{\text{Equity}}$), the seniority of the principal operational staff (T4) and the number of payment incidents in the last 12 months (T5).

4.4. Conception of the rating model by linear discriminant analysis

The results of the process of the conception of the rating model are presented in Table 22.

The distribution of healthy and defaulting enterprises and the probability of default by rating class are presented in Figure 3.

Figure 3. The distribution of healthy and defaulting enterprises and the probability of default by rating class.


4.5. Conception of the rating model using the Bayesian approach

For the conception of the Bayesian rating models, we opted for an aggregation of the opinions of the selected experts regardless of their weightings. The estimation is thus made with a working group that contains the two categories of experts, and the results are weighted at the same time by 25% and by 50% to determine the Bayesian models associated with the LDA model.

4.5.1. The explicit probability of default

The results of the assessment of the explicit default probability by rating class are presented in Table 23.

4.5.2. The implicit probability of default

The implicit probability of default is deduced from the expected loss estimated by the experts, using the following relationship:

$PD_{e,K} = \dfrac{EL}{0,45 \times EAD}$

The results of the assessment of the implicit probability of default by rating class are presented in Table 24.

The probability of default of the experts ($PD_{e,K}$) retained is the mean of the explicit and implicit probabilities of default, with a minimum of 0,03% for class A. Table 25 summarizes the results.

4.5.3. Determination of the Bayesian rating models

The Bayesian probability of default is a weighting of the experts' default probability and the default probability determined by the statistical model:

$PD_{a,K} = \varepsilon\,PD_{e,K} + (1 - \varepsilon)\,PD_K$

The Bayesian models according to the weighting of the expert opinion are presented in Table 26.

4.6. Impact of Bayesian modelling on the calculation of unexpected loss (UL)

In Morocco, the regulator has adjusted the formula (4) to take into consideration the characteristics of Moroccan SMEsFootnote3. Indeed, it provides the following correlation formula for each company (i), where $CA_i$ is the turnover of the company (i) in millions of MAD:

$R_i = 0,12\,\dfrac{1 - e^{-50\,PD_i}}{1 - e^{-50}} + 0,24\left(1 - \dfrac{1 - e^{-50\,PD_i}}{1 - e^{-50}}\right) - 0,04\left(1 - \dfrac{CA_i - 10}{165}\right)$

For the IRBF approach, the maturity M for companies is equal to 2,5. As a result, formula (3) becomes:

$K_i = LGD_i \times \left[N\left(\dfrac{G(PD_i) + \sqrt{R_i}\,G(0,999)}{\sqrt{1 - R_i}}\right) - PD_i\right] \times \dfrac{1}{1 - 1,5\,b}$

The weighted assets for each enterprise (i) are expressed by:

$RWA_i = K_i \times 12,5 \times EAD_i$

The unexpected loss incurred with each company (i) is expressed as follows:

$UL_i = RWA_i \times 8\% = K_i \times EAD_i$

The weighted assets are determined by rating class for each model. Let $n^c$, $c = A,\dots,H$, be the size of the class (c); the unexpected loss $UL_c$ of the class c is defined as follows:

$UL_c = \sum_{i=1}^{n^c} UL_i = \sum_{i=1}^{n^c} RWA_i \times 8\% = \sum_{i=1}^{n^c} K_i \times EAD_i$

$UL_c = \sum_{i=1}^{n^c} LGD_i \left[N\left(\dfrac{G(PD_i) + \sqrt{R_i}\,G(0,999)}{\sqrt{1 - R_i}}\right) - PD_i\right] \dfrac{1}{1 - 1,5\,b}\, EAD_i$

Knowing that the loss given default (LGD) in the framework of the IRBF approach is equal to 45% and that the probability of default (PDc) of class (c) is the same for all enterprises in this class, the previous formula is written:

$UL_c = \dfrac{0,45}{1 - 1,5\,b}\sum_{i=1}^{n^c} \left[N\left(\dfrac{G(PD_c) + \sqrt{R_i}\,G(0,999)}{\sqrt{1 - R_i}}\right) - PD_c\right] EAD_i$

We set $\beta_c = \dfrac{0,45}{1 - 1,5\,b}$, thus:

$UL_c = \beta_c \sum_{i=1}^{n^c} \left[N\left(\dfrac{G(PD_c) + \sqrt{R_i}\,G(0,999)}{\sqrt{1 - R_i}}\right) - PD_c\right] EAD_i$

The total unexpected loss (UL) is equal to:

$UL = \sum_{c=A}^{H} UL_c$
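A computational sketch of the class-level aggregation above, assuming SciPy and the Moroccan SME correlation (turnover $CA_i$ in millions of MAD, floored at 10 and capped at 175 per Footnote 3); the exposures and turnovers are illustrative.

```python
# Sketch: class-level unexpected loss with M = 2.5, so the maturity adjustment
# numerator equals 1 and beta_c = LGD / (1 - 1.5 b).
from math import exp, log, sqrt
from scipy.stats import norm

def correlation_ma(pd: float, ca_mln: float) -> float:
    w = (1 - exp(-50 * pd)) / (1 - exp(-50))
    ca = min(max(ca_mln, 10.0), 175.0)              # turnover bounds per Footnote 3
    return 0.12 * w + 0.24 * (1 - w) - 0.04 * (1 - (ca - 10) / 165)

def class_ul(pd_c: float, exposures: list[tuple[float, float]], lgd: float = 0.45) -> float:
    b = (0.11852 - 0.05478 * log(pd_c)) ** 2
    beta_c = lgd / (1 - 1.5 * b)
    total = 0.0
    for ead_i, ca_i in exposures:                   # one (EAD, turnover) per firm
        r = correlation_ma(pd_c, ca_i)
        total += (norm.cdf((norm.ppf(pd_c) + sqrt(r) * norm.ppf(0.999))
                           / sqrt(1 - r)) - pd_c) * ead_i
    return beta_c * total

print(class_ul(pd_c=0.03, exposures=[(900_000, 25), (1_500_000, 60)]))
```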

4.6.1. The determination of the unexpected loss by LDA

The unexpected loss (UL0) by rating class according to the LDA model is presented in Table 27.

4.6.2. The Bayesian unexpected loss associated with linear discriminant analysis

The Bayesian unexpected loss associated with the LDA is presented in Table 28.

The results show that the Bayesian approach, according to the methodology presented above, reduces the unexpected loss by 4,7% and 10,1% for expert weightings of 25% and 50%, respectively.

5. Conclusion

The measurement of credit risk is a major preoccupation for banks because they have to determine the expected loss to be covered by provisions in the framework of IFRS9 and the unexpected loss that represents the regulatory capital requirement.

The conception of the rating model that classifies counterparties according to their risk profile is the core of the IRB approach. Indeed, several techniques can be used to model the default and determine the different rating classes.

In this paper, we used the linear discriminant analysis to construct a statistical rating model by determining the relationship between the default and the quantitative and qualitative variables of the enterprises.

Then, we proposed a Bayesian approach that permits the integration of the experts' estimations to calculate the posterior probability of default by rating class. The rating model constructed in this way has the same performance as the statistical model, because the adjustment only concerns the probability of default by class.

The proposed approach has several advantages in that it permits the capture of events not taken into account by the statistical model by using the expert opinion such as changes in the economic situation, an increase in the default rate, the various incidents giving rise to the termination of the banking relationship, and changes in the control and decision-making processes.

However, the effectiveness of this approach depends on the rigour of the information collection procedures, in order to avoid an underestimation of the default probabilities. Indeed, the Bayesian approach has a major disadvantage related to the quality of the information provided by the professional experts. As a result, we have presented an approach based on the Delphi technique, proposed tools for selecting the experts and evaluators, and determined the steps needed to collect reliable information.

The calculation of the unexpected loss showed that the Bayesian approach reduces the capital requirement. In our empirical study, the reduction varies between 4,7% and 10,1%.

In summary, the Bayesian approach permits the adjustment of statistical models in order to conform as closely as possible to the economic conjuncture and to internal changes in terms of control and decision-making. However, the collection procedures must be very rigorous in order to reduce the risk related to the reliability of the information collected from the experts.

Additional information

Funding

The authors received no direct funding for this research.

Notes on contributors

Mohamed Habachi

Mohamed Habachi holds a PhD in management science and has 20 years of banking experience.

Specialized in Risk Management and Audit.

Two articles published in the field of operational risk and credit risk

Saâd Benbachir

Saâd Benbachir, PhD, Studies and Research in Management Sciences Laboratory.

Director of the Strategic Studies in Law, Economics and Management Center.

31 articles published in market risk, stochastic volatility, operational risk and credit risk.

Notes

1. The correlation formula for companies and banks is $R = 0,12\,\frac{1 - e^{-50\,PD}}{1 - e^{-50}} + 0,24\left(1 - \frac{1 - e^{-50\,PD}}{1 - e^{-50}}\right)$. For exposures to SMEs, this formula is adjusted to take into account the size of the entity by subtracting $0,04 \times \left(1 - \frac{S - 5}{45}\right)$, where S is the total annual sales in millions of euros, with values of S between €5 million and €50 million. Reported sales of less than €5 million are treated as if they were equivalent to €5 million. Paragraph 273 of the Basel Committee on Banking Supervision (Citation2006).

2. In the case of two groups, Bartlett's approximation of the lambda distribution is chi-squared with p degrees of freedom: $-\left(n - 1 - \frac{p + 2}{2}\right)\ln\Lambda \sim \chi^2_p$.

3. Bank Al-Maghrib's circular 8/G/2010 considers that SMEs are characterised by a turnover (sales) that varies between 10 and 175 million MAD. The SMEs whose turnover is less than 10 million MAD are treated as equivalent to this amount.

References

  • Altman, E. I. (1968, September). Financial ratios, discriminant analysis and the prediction of corporate bankruptcy. Journal of Finance, 23(4), 589–609.
  • Ayyub, B. (2001). A practical guide on conducting expert-opinion elicitation of probabilities and consequences for corps facilities. U.S. Army Corps of Engineers Institute.
  • Back, B., Laitinen, T., Sere, K., & Wezel, M. V. (1996). Choosing bankruptcy predictors using discriminant analysis, Logit analysis and Genetic Algorithms, Turku Center for computer science. Technical Report, 40, 1–18.
  • Bardos, M. (1998). Detecting the risk of company failure at the Banque de France. Journal of Banking and Finance, 22, 1405–1419. doi:10.1016/S0378-4266(98)00062-4
  • Basel Committee on Banking Supervision. (1999). Credit risk modelling: Current practices and applications. Bank for International Settlements.
  • Basel Committee on Banking Supervision. (2006). International convergence of capital measurement and capital standards. Bank for International Settlements.
  • Basel Committee on Banking Supervision. (2015). Guidance on credit risk and accounting for expected credit losses. Bank for International Settlements.
  • Basel Committee on Banking Supervision. (2016). Standardised measurement approach for operational risk, consultative document. Bank For International Settlements.
  • Bell, T. B., Ribar, G. S., & Verchio, J. R. (1990). Neural nets vs. logistic regression: A comparison of each model's ability to predict commercial bank failures. Proceedings of the 1990 Deloitte and Touche-University of Kansas Symposium on Auditing Problems, 29–58.
  • Benbachir, S., & Habachi, M. (2018). Assessing the impact of modelling on the Expected Credit Loss (ECL) of a portfolio of small and medium-sized enterprises. Universal Journal of Management, 6(10), 409–431. doi:10.13189/ujm.2018.061005
  • Bewick, V., Cheek, L., & Ball, L. (2004). Receiver operating characteristic curves. Critical Care, 8(6), 508–512. doi:10.1186/cc3000
  • Bose, I., & Pal, R. (2006). Predicting the survival or failure of click-and-mortar corporations: A knowledge discovery approach. European Journal of Operational Research, 174(2), 959–982. doi:10.1016/j.ejor.2005.05.009
  • Bunn, P., & Redwood, V. (2003). Company account based modeling of business failures and the implications for financial stability. The Bank of England’s Working Papers Series N°210.
  • Das, S. R., Fan, R., & Geng, G. (2002). Bayesian migration in credit ratings based on probabilities of default. Journal of Fixed Income, 12, 17–23. doi:10.3905/jfi.2002.319329
  • Dwyer, D. W. (2006). The Distribution of Defaults and Bayesian Model Validation. Moody KMV.
  • Friedman, N., Geiger, D., & Goldszmidt, M. (1997). Bayesian Network Classifiers. Machine Learning, 29(2–3), 131–163. doi:10.1023/A:1007465528199
  • Gemela, J. (2001). Financial analysis using bayesian networks. Applied Stochastic Models in Business and Industry, 17(1), 57–67. doi:10.1002/(ISSN)1526-4025
  • Giannelloni, J. L., & Vernette, E. (2001). Études de marché (p. 420). Paris: Vuibert.
  • Gôssl, C. (2005). Predictions based on certain uncertainties, A Bayesian credit portfolio approach. Working Paper, Hypo Vereinsbank AG. London.
  • Grice, J., & Ingram, R. W. (2001). Tests of the generalizability of Altman's bankruptcy prediction model. Journal of Business Research, 54(1), 53–61. doi:10.1016/S0148-2963(00)00126-0
  • Gurný, P., & Gurný, M. (2013). Comparison of credit scoring models on probability of default estimation for US banks. Prague Economic Papers, 2, 163–181. doi:10.18267/j.pep.446
  • Helmer, O., (1968). “Analysis of the future: The Delphi Method”, https://www.rand.org/content/dam/rand/pubs/papers/2008/P3558.pdf
  • Hensher, D. A., Jones, S., William, H., & Greene, W. H. (2007). An error component logit analysis of corporate bankruptcy and insolvency risk in Australia. The Economic Record, 83, 86–103. doi:10.1111/j.1475-4932.2007.00378.x
  • Hunter, J., & Isachenkova, N. (2002). A panel analysis of UK industrial company failure. ESRC Centre for Business Research Working Paper 228. Cambridge University.
  • Klecka, W. (1980). Discriminant analysis (pp. 16). London: Sage Publications.
  • Liang, L., & Wu, D. (2003). An application of pattern recognition on scoring Chinese corporations financial conditions based on back propagation neural network. Computers & Operations Research, 32, 1115–1120. doi:10.1016/j.cor.2003.09.015
  • Ohlson, J. A. (1980). Financial ratios and the probabilistic prediction of bankruptcy. Journal of Accounting Research, 18(1), 109–131. doi:10.2307/2490395
  • Oreski, S., Oreski, D., & Oreski, G. (2012). “Hybrid system with genetic algorithm and artificial neural networks and its application to retail credit risk assessment”. Expert Systems with Applications, 39, 12605. doi:10.1016/j.eswa.2012.05.023
  • Palm, R. (1999). L'analyse discriminante décisionnelle: principe et applications. Notes de statistique et d'informatique (Gembloux), 99/4, 41 p.
  • Taffler, R. J. (1982). Forecasting company failure in the UK using discriminant analysis and financial ratio data. Journal of the Royal Statistical Society, Series A, General, 145(3), 342–358. doi:10.2307/2981867
  • Tasche, D. (2013). “Bayesian estimation of probabilities of default for low default portfolios”. Journal of Risk Management in Financial Institutions, 6(3), 302–326.
  • Zmijewski, M. (1984). Methodological issues related to the estimation of financial distress prediction models. Journal of Accounting Research, 22, 59–82. doi:10.2307/2490859

Appendix

Table A1. Explanation and significance of quantitative variables


Table A2. Explanation and significance of qualitative variables