Full article: Generalized two-parameter estimators in the multinomial logit regression model: methods, simulation and application

Formulae display: $MathJax Logo$ ?Mathematical formulae have been encoded as MathML and are displayed in this HTML version using MathJax in order to improve their display. Uncheck the box to turn MathJax off. This feature requires Javascript. Click on a formula to zoom.

Abstract

In this article, we propose generalized two-parameter (GTP) estimators and an algorithm for the estimation of shrinkage parameters to combat multicollinearity in the multinomial logit regression model. In addition, the mean squared error properties of the estimators are derived. A simulation study is conducted to investigate the performance of proposed estimators for different sample sizes, degrees of multicollinearity, and the number of explanatory variables. Swedish football league dataset is analyzed to show the benefits of the GTP estimators over the traditional maximum likelihood estimator (MLE). The empirical results of this article revealed that GTP estimators have a smaller mean squared error than the MLE and can be recommended for practitioners.

Keywords:

1. Introduction

The multinomial logit regression (MNLR) model introduced by Luce (Citation1959) and is often used when the dependent variable comprises more than two categories. Mostly, the MNLR is used to model nominal output variables in which the log odds of the outputs are modeled as a linear combination of explanatory variables. Nowadays, the MNLR is a very common choice among applied researchers for analyzing the categorical response variable having at least three categories. For example, blood types of humans and animals give different diagnostic tests, determination of socioeconomic factors that affect major choices made by consumers, getting unrestricted credit, restricted credit, or no credit by different corporations due to their financial and official characteristics and among others. It is a general practice to use an ordinary maximum likelihood estimator (MLE) to estimate the parameters vector of the MNLR. Multicollinearity is a situation in MNLR where two or more explanatory variables are highly correlated with each other. The problem of multicollinearity can have numerous adverse effects on regression coefficients. The main drawback of multicollinearity is that the variance of the MLE becomes inflated (Månsson et al., Citation2018). In addition, on average estimates are too large and can have wrong signs of the estimated parameters. Consequently, the probability of type-II error of estimated parameter will be increased, which result in decreased statistical power; the Wald statistic gives insignificant results in the presence of multicollinearity (Qasim, Kibria, et al. Citation2020a).

Different biased estimation methods have been proposed to solve the problem of multicollinearity for a different type of regression models. Some of the biased estimation methods for popular regression models are: Hoerl and Kennard (Citation1970) proposed a ridge regression (RR) estimator to deal the problem of multicollinearity for the linear regression model (LRM). Schaefer, Roi, and Wolfe (Citation1984), Schaefer (Citation1986) introduced ridge regression and Stein estimators for the logistic regression model. Kejian (Citation1993, 2003) proposed a Liu and Liu-type estimators for the LRM. Månsson and Shukur (Citation2011a) and Månsson, Kibria, and Shukur (Citation2012) developed ridge regression and Liu estimators for the logit regression model, respectively. Månsson and Shukur (Citation2011b), Qasim, Kibria, et al. (Citation2020a), Qasim, Månsson, et al. (Citation2020b), Lukman et al. (Citation2020) and Noori Asl et al. (Citation2020) suggested Poisson RR, Poisson Liu regression, biased adjusted Poisson RR, modified Poisson ridge-type, and penalized and ridge-type shrinkage estimators in the Poisson regression model, respectively. Månsson (Citation2012, Citation2013) introduced ridge negative binomial ridge and Liu regression estimators, respectively. Kurtoğlu and Özkale (Citation2016), Qasim, Amin, and Amanullah (Citation2018), Mandal et al. (Citation2019), Amin, Qasim, Yasin, et al. (Citation2020b), Amin, Qasim, Amanullah, et al. (Citation2020a) and Lukman et al. (Citation2020) developed Liu estimation, Liu shrinkage parameters, Stein-type shrinkage estimators, bias and almost unbiased RR and modified ridge-type estimators for the gamma regression, respectively. Karlsson, Månsson, and Kibria (Citation2020) and Qasim, Månsson, and Golam Kibria (Citation2021) proposed beta Liu and ridge regression estimators, respectively.

Özkale and Kaciranlar (Citation2007) proposed a two-parameter biased estimator by grafting contraction estimator into modified RR estimator in the LRM. They stated that as the value of k (ridge parameter) becomes increases then the RR has an unnecessary amount of bias. Yang and Chang (Citation2010) suggested another efficient two-parameter estimator. Toker, Üstündağ Şiray, and Qasim (Citation2019) proposed a first-order two-parameter estimator in the generalized linear models (GLM). Regarding the considerable literature on a two-parameter estimator for a different form of the GLM, we refer to Huang and Yang (Citation2014), Asar and Genç (Citation2018), Abonazel and Farghali (Citation2019), Rady, Abonazel, and Taha (Citation2019), Amin, Qasim, and Amanullah (Citation2019), Farghali (Citation2019), Akram, Amin, and Qasim (Citation2020), Naveed et al. (Citation2020) and among others. El-Dash, El-Hefnawy, and Farghali (Citation2011) proposed Stein-type ridge regression and generalized ridge regression estimators for the MNLR and Månsson et al. (Citation2018) suggested ridge estimators for the MNLR. By extending the work of Månsson, Kibria, and Shukur (Citation2012), Farghali (Citation2014) proposed a multinomial Liu estimator. Asl et al. (Citation2021) proposed the Stein-type shrinkage ridge estimator when the regression coefficients are restricted to a linear subspace in the GLM and derive the asymptotic biases and risks of the proposed estimators.

The main purpose of this article is to develop two different generalized two-parameter (GTP) estimators for the MNLR by following the work of Abonazel and Farghali (Citation2019) and Yang and Chang (Citation2010). In addition, we suggest a new algorithm for the selection of optimal shrinkage parameters $(k_{j} and d_{j})$ for GTP estimators. The proposed GTP estimators are general estimators which include the MLE, RR, generalized RR, Liu estimator, and generalized Liu-type estimator. The mean square error (MSE) properties of the estimators are examined and show the superiority of the proposed estimators under certain conditions. Besides, the Monte Carlo simulation and Swedish football league dataset are analyzed to demonstrate the benefit of the proposed estimators over the existing classical MLE.

The rest of the article is arranged as follows: The MNLR model, GTP estimators and MSE properties are well defined in Sec. 2. Estimation methods for selection of generalized optimal shrinkage parameters are explained in Sec. 3. Simulation experiment and their results are discussed in Sec. 4. The advantage of our recommended estimators is demonstrated by analyzing empirical application in Sec. 5 and finally, some concluding remarks are given in Sec 6.

2. Model and proposed GTP estimators

This section illustrates the methodology of the MNLR and MLE. In addition, we propose two new GTP estimators by considering the work of Abonazel and Farghali (Citation2019), and Huang and Yang (Citation2014) in the MNLR model.

2.1. Multinomial logit regression

The multinomial logistic regression model assumes that the response variable $y$ has a multinomial distribution. Consider a trial that results in exactly one of some fixed finite number of $C$ possible outcomes, with probabilities $π_{1},$ $π_{2},$ … $, π_{C},$ $0 \leq π_{h} \leq 1,$ $h = 1, 2, \dots, C$ and $\sum_{h = 1}^{C} π_{h} = 1$ . The random variable $y_{i} = (y_{i 1}, y_{i 2}, . {. y}_{i h} \dots, y_{i C})$ for n independent trials represents the multinomial trial for observation $i (i = 1, 2, \dots, n)$ and $y_{i h}$ takes the following values: $y_{i h} = {\begin{matrix} 1 & the {i th}^{} observation \in the {h th}^{} level \\ 0 & otherwise \end{matrix}, i = 1, 2, \dots, n, h = 1, \dots, C$

Thus, $\sum_{h = 1}^{C} y_{i h} = 1 .$ The probability density function of multinomial distribution is defined as (1) $f (y_{i 1}, \dots, y_{i C}) = \frac{n!}{y_{i 1}! \dots, y_{i C}!} π_{1} (x_{i})^{y_{i 1}} . π_{h} (x_{i})^{y_{i h}} . π_{C} (x_{i})^{y_{i C}},$ (1) where $\sum_{h = 1}^{C} y_{i h} = 1 ∴ y_{i C} = 1 - \sum_{h = 1}^{C - 1} y_{i h} .$ Using the value of $y_{i C}$ in (1) the density function can be written as $f (y_{i 1}, \dots, y_{i C}) = \frac{n!}{y_{i 1}! ., (1 - \sum_{h = 1}^{C - 1} y_{i h})!} π_{1} (x_{i})^{y_{i 1}} . . π_{h} (x_{i})^{y_{i h}} . . π_{C} (x_{i})^{(1 - \sum_{h = 1}^{C - 1} y_{i h})} .$

To develop the MNLR, assume that the explanatory variables $x_{1}, x_{2}, \dots, x_{p}$ are measured for observations $y_{i 1}, \dots, y_{i C}$ which follow a multinomial distribution with probability parameters $π_{1},$ $π_{2},$ … $, π_{C} .$ Then the conditional probabilities of each response variables’ level given the explanatory variables vector are $π_{h} (x_{i}) = \Pr (Y = h | x_{1}, x_{2}, \dots, x_{p}) = \frac{e^{β_{h 0} + \sum_{j = 1}^{p} β_{h j} x_{i j}}}{1 + \sum_{h = 1}^{C - 1} e^{β_{h 0} + \sum_{j = 1}^{p} β_{h j} x_{i j}}}, h = 1, 2, \dots, C - 1$ $π_{C} (x_{i}) = \frac{1}{1 + \sum_{h = 1}^{C - 1} e^{β_{h 0} + \sum_{j = 1}^{p} β_{h j} x_{i j}}} .$

Thus, the MNLR is (2) $\ln [\frac{π_{h} (x_{i})}{π_{C} (x_{i})}] = β_{h 0} + \sum_{j = 1}^{p} β_{h j} x_{i j}, h = 1, 2, \dots, C - 1, i = 1, 2, \dots, n .$ (2) where $n$ is the sample size, $p$ is the number of explanatory variables, $C$ is the number of levels of the response variable (in model (2) level $C$ is the reference level), $β_{h 0}, β_{h j} are the regression$ parameters at the $h^{t h}$ level of the response variable.

The most common method for estimating the model parameters is to apply the MLE, which maximizes the log-likelihood function

L = \prod_{i = 1}^{n} \frac{n!}{y_{i 1}! \times y_{i 2}! \times . . . \times (1 - \sum_{h = 1}^{C - 1} y_{i h})!} {[\frac{{\hat{π}}_{1} (x)}{{\hat{π}}_{C} (x)}]}^{y_{i 1}} \times {[\frac{{\hat{π}}_{2} (x)}{π_{C} (x)}]}^{y_{i 2}} \times \dots \times {[\frac{{\hat{π}}_{h} (x)}{{\hat{π}}_{C} (x)}]}^{y_{i h}} \dots \times {\hat{π}}_{C} (x) .

\ln L = \sum_{i = 1}^{n} [\ln (\frac{n!}{y_{i 1}! \times . . . \times (1 - \sum_{h = 1}^{C - 1} y_{i h})!}) + y_{i 1} \ln [\frac{{\hat{π}}_{1} (x)}{{\hat{π}}_{C} (x)}] + y_{i 2} \ln [\frac{{\hat{π}}_{2} (x)}{{\hat{π}}_{C} (x)}] + . . . + \ln {\hat{π}}_{C} (x)] .

Let $A = \ln (\frac{n!}{y_{i 1}! \times y_{i 2}! \times . . . \times (1 - \sum_{h = 1}^{C - 1} y_{i h})!})$ .

Then $\ln L = \sum_{i = 1}^{n} \sum_{h = 1}^{C - 1} [A + y_{i 1} \ln [\frac{{\hat{π}}_{1} (x)}{{\hat{π}}_{C} (x)}] + y_{i 2} \ln [\frac{{\hat{π}}_{2} (x)}{{\hat{π}}_{C} (x)}] + . . . + \ln {\hat{π}}_{C} (x)]$ $\ln L = \sum_{i = 1}^{n} \sum_{h = 1}^{C - 1} (A + y_{i h} \ln (\frac{π_{h} (x_{i})}{π_{C} (x_{i})}) + \ln (π_{C} (x_{i})))$ (3) $\ln L = \sum_{i = 1}^{n} \sum_{h = 1}^{C - 1} [{A + y}_{i h} (β_{h 0} + \sum_{j = 1}^{p} β_{h j} x_{i j}) + \ln (\frac{1}{1 + \sum_{h = 1}^{C - 1} e^{β_{h 0} + \sum_{j = 1}^{p} β_{h j} x_{i j}}})]$ (3)

The MLE can be found by setting the first derivative of (3) to zero. Thus, ${\hat{β}}_{MLE}$ is obtained by solving the following nonlinear system of equations: (4) $\sum_{i = 1}^{n} y_{i h} = \sum_{i = 1}^{n} (\frac{e^{β_{h 0} + \sum_{j = 1}^{p} β_{h j} x_{i j}}}{1 + \sum_{h = 1}^{C - 1} e^{β_{h 0} + \sum_{j = 1}^{p} β_{h j} x_{i j}}}), h = 1, \dots, C - 1 .$ (4) (5) $\sum_{i = 1}^{n} y_{i h} x_{i j} = \sum_{i = 1}^{n} (\frac{e^{β_{h 0} + \sum_{j = 1}^{p} β_{h j} x_{i j}}}{1 + \sum_{h = 1}^{C - 1} e^{β_{h 0} + \sum_{j = 1}^{p} β_{h j} x_{i j}}}) x_{i j}, h = 1, \dots, C - 1 .$ (5)

The solution to EquationEquations (4)(4) $\sum_{i = 1}^{n} y_{i h} = \sum_{i = 1}^{n} (\frac{e^{β_{h 0} + \sum_{j = 1}^{p} β_{h j} x_{i j}}}{1 + \sum_{h = 1}^{C - 1} e^{β_{h 0} + \sum_{j = 1}^{p} β_{h j} x_{i j}}}), h = 1, \dots, C - 1 .$ (4) and Equation(5)(5) $\sum_{i = 1}^{n} y_{i h} x_{i j} = \sum_{i = 1}^{n} (\frac{e^{β_{h 0} + \sum_{j = 1}^{p} β_{h j} x_{i j}}}{1 + \sum_{h = 1}^{C - 1} e^{β_{h 0} + \sum_{j = 1}^{p} β_{h j} x_{i j}}}) x_{i j}, h = 1, \dots, C - 1 .$ (5) is found by applying numerical methods such as Newton- Raphson or scoring method see Agresti (Citation2013) and Hosmer, Lemeshow, and Sturdivant (Citation2013) and $the$ asymptotic variance-covariance matrix of ${\hat{β}}_{MLE}$ is calculated as follows: $Cov ({\hat{β}}_{MLE}) = {(X^{'} \hat{W} X)}^{- 1} = (\begin{matrix} X' {\hat{w}}_{11} X & \begin{matrix} - X' {\hat{w}}_{12} X & \dots \end{matrix} & - X' {\hat{w}}_{1 (C - 1)} X \\ ⋮ & ⋱ & ⋮ \\ - X' {\hat{w}}_{(C - 1) 1} X & \begin{matrix} - X' {\hat{w}}_{(C - 1) 2} X & \dots \end{matrix} & X' {\hat{w}}_{(C - 1) (C - 1)} X \end{matrix})$ where $X$ is a $n \times p^{*}$ matrix of the explanatory variables which the first column contains one and $p^{*} = p + 1,$ ${\hat{w}}_{h h}$ is a $(n \times n)$ diagonal matrix with its general element ${\hat{π}}_{h} (x_{i}) [1 - {\hat{π}}_{h} (x_{i})], h = 1, \dots, C - 1, i = 1, 2, \dots, n, {\hat{W}}_{h h'}$ is an $(n \times n)$ diagonal matrix with its general element ${\hat{π}}_{h} (x_{i}) {\hat{π}}_{h} (x_{i}),$ $h, h' = 1, \dots, C - 1, h \neq h' .$ The scalar mean squared error $(MSE)$ of ${\hat{β}}_{MLE}$ is obtained as: $MSE ({\hat{β}}_{MLE}) = t r {(X' \hat{W} X)}^{- 1} = \sum_{j = 1}^{p^{*}} \sum_{h = 1}^{C - 1} \frac{1}{λ_{h j}},$ where $λ_{h j}$ is the $j^{}$ th eigenvalue at the $h^{}$ th level of the response variable.

2.2. New generalized two-parameter estimators

2.2.1. Generalized Liu-type multinomial logistic estimator

Hoping that the combination of two different estimators might inherit the advantages of both estimators, and following the work of Abonazel and Farghali (Citation2019) and Farghali (Citation2019), we propose generalized Liu-type estimator for the MNLR as follows (6) ${\hat{β}}_{GLT - M} = {(X^{'} \hat{W} X + K)}^{- 1} (X^{'} \hat{W} X + K D) {\hat{β}}_{MLE},$ (6) where $K = diag {k_{h j}}; k_{h j} \geq 0,$ and $D = diag {d_{h j}}; {0 \leq d}_{h j} < 1, j = 1, 2, \dots, p^{*}, h = 1, 2, ., C - 1 .$ The variance-covariance matrix of ${\hat{β}}_{GLT - M}$ is defined as $Cov ({\hat{β}}_{GLT - M}) = {(X^{'} \hat{W} X + K)}^{- 1} (X^{'} \hat{W} X + K D) {(X^{'} \hat{W} X)}^{- 1} (X^{'} \hat{W} X + K D) {(X^{'} \hat{W} X + K)}^{- 1}$

Lemma 1.

The proposed biased estimator in (6) represents a general estimator which includes, the MLE, the generalized RR and Liu estimators as special cases

$\lim_{D \to O} {\hat{β}}_{GLT - M} \to {(X' \hat{W} X + K)}^{- 1} (X' \hat{W} X) {\hat{β}}_{MLE} = {\hat{β}}_{GRE};$ the generalized RR estimator.
$\lim_{K \to I} {\hat{β}}_{GLT - M} \to {(X^{'} \hat{W} X + I)}^{- 1} (X^{'} \hat{W} X + D) {\hat{β}}_{MLE} = {\hat{β}}_{GLTE};$ the generalized Liu estimator.
$\lim_{K \to O} {\hat{β}}_{GLT - M} \to {\hat{β}}_{MLE};$ the MLE.

To provide the explicit form of $MSE ({\hat{β}}_{GLT - M}),$ we use the following transformations, suppose that there exists a matrix $Ψ$ such that: (7) $Ψ (X' \hat{W} X) Ψ^{'} = Λ = diag {λ_{h j}},$ (7) where $λ_{11} \geq λ_{12} \geq \dots \geq λ_{(C - 1) p^{*}}$ are the ordered eigenvalues of the matrix $(X' \hat{W} X)$ and $Ψ$ is a $(C - 1) p^{*} \times p^{*}$ orthogonal matrix whose columns are the corresponding eigenvectors of $λ_{11}, λ_{12}, \dots, λ_{(C - 1) p^{*}},$ so that the suggested biased estimator can be defined as: (8) ${\hat{α}}_{GLT - M} = {(Λ + K)}^{- 1} (Λ + K D) {\hat{α}}_{MLE} .$ (8)

Note that, ${\hat{β}}_{MLE} = Ψ' {\hat{α}}_{MLE}$ and ${\hat{β}}_{GLT - M} = Ψ' {\hat{α}}_{GLT - M} .$

The scalar MSE for ${\hat{β}}_{GLT - M}$ is computed as: (9) $MSE ({\hat{β}}_{GLT - M}) = \sum_{j = 1}^{p^{*}} \sum_{h = 1}^{C - 1} \frac{{(λ_{h j} + k_{h j} d_{h j})}^{2}}{{(λ_{h j} + k_{h j})}^{2} λ_{h j}} + \sum_{j = 1}^{p^{*}} \sum_{h = 1}^{C - 1} \frac{α_{h j}^{2} k_{h j}^{2} {[1 - d_{h j}]}^{2}}{{(λ_{h j} + k_{h j})}^{2}} .$ (9) (10) $MSE ({\hat{β}}_{GLT - M}) = γ_{1} (k_{h j}, d_{h j}) + γ_{2} (k_{h j}, d_{h j}) .$ (10)

It is obvious that $γ_{1} (k_{h j}, d_{h j}) and γ_{2} (k_{h j}, d_{h j})$ are two continuous functions of $k_{h j} and d_{h j} .$

2.2.2. Generalized Huang and Yang multinomial logistic estimator

Huang and Yang (Citation2014) proposed a two-parameter biased estimator to remedy multicollinearity in the negative binomial regression model. Their estimator was considered as a general estimator since this estimator is included the MLE, RR estimator, and Liu estimator as special cases. Also, they proved the superiority of this estimator over other biased estimators ( ${\hat{β}}_{MLE}, {\hat{β}}_{RRE} and {\hat{β}}_{LE}$ ).

In this article, we extend and generalized Huang and Yang (Citation2014) estimator to combat multicollinearity in the MNLR model. The proposed estimator is defined as: (11) ${\hat{β}}_{GHY - M} = {(X^{'} \hat{W} X + I)}^{- 1} (X^{'} \hat{W} X + D) {(X^{'} \hat{W} X + K)}^{- 1} (X^{'} \hat{W} X) {\hat{β}}_{MLE} .$ (11) where $K = diag {k_{h j}}; k_{h j} \geq 0,$ and $D = diag {d_{h j}}; {0 \leq d}_{h j} < 1, j = 1, 2, \dots, p^{*}, h = 1, 2, ., C - 1 .$ The variance-covariance matrix of ${\hat{β}}_{GHY - M}$ is defined as $Cov ({\hat{β}}_{GHY - M}) = {(X^{'} \hat{W} X + I)}^{- 1} (X^{'} \hat{W} X + D) {(X^{'} \hat{W} X + K)}^{- 1} {X^{'} \hat{W} X (X^{'} \hat{W} X + K)}^{- 1} (X^{'} \hat{W} X + D) {(X^{'} \hat{W} X + I)}^{- 1} .$

Lemma 2.

The proposed biased estimator in (11), represents a general case, as it is easy to see that

$\lim_{D \to I} {\hat{β}}_{GHY - M} \to {(X' \hat{W} X + K)}^{- 1} (X' \hat{W} X) {\hat{β}}_{MLE} = {\hat{β}}_{GRE} .$
$\lim_{K \to O} {\hat{β}}_{GHY - M} \to {(X^{'} \hat{W} X + I)}^{- 1} (X^{'} \hat{W} X + D) {\hat{β}}_{MLE} = {\hat{β}}_{GLTE} .$
$\lim_{D \to I, K \to O} {\hat{β}}_{GHY - M} \to {\hat{β}}_{MLE} .$

Rewriting EquationEquation (11)(11) ${\hat{β}}_{GHY - M} = {(X^{'} \hat{W} X + I)}^{- 1} (X^{'} \hat{W} X + D) {(X^{'} \hat{W} X + K)}^{- 1} (X^{'} \hat{W} X) {\hat{β}}_{MLE} .$ (11) using the eigenvalues of $(X' \hat{W} X)$ and its corresponding eigenvectors, the proposed biased estimator can be written as: (12) ${\hat{α}}_{GHY - M} = {(Λ + I)}^{- 1} (Λ + D) {(Λ + K)}^{- 1} Λ {\hat{α}}_{MLE} .$ (12)

Note that, ${\hat{β}}_{GHY - M} = Ψ^{'} {\hat{α}}_{GHY - M}$ . The scalar MSE of ${\hat{β}}_{GHY - M}$ is computed as: (13) $MSE ({\hat{β}}_{GHY - M}) = \sum_{j = 1}^{p^{*}} \sum_{h = 1}^{C - 1} \frac{{(λ_{h j} + d_{j})}^{2} λ_{h j}}{{(λ_{h j} + k_{j})}^{2} {(λ_{h j} + 1)}^{2}} + \sum_{j = 1}^{p^{*}} \sum_{h = 1}^{C - 1} \frac{α_{h j}^{2} {[λ_{h j} (k_{j} - d_{j} + 1) + k_{j}]}^{2}}{{(λ_{h j} + k_{j})}^{2} {(λ_{h j} + 1)}^{2}} .$ (13)

2.3. Matrix MSE properties

The matrix MSE (MMSE) of an estimator $\tilde{β}$ of the parameter vector $β$ is defined as $MMSE (\tilde{β}) = E (\tilde{β} - β) {(\tilde{β} - β)}^{'} = Var (\tilde{β}) + Bias (\tilde{β}) Bias {(\tilde{β})}^{'},$ where $Var (\tilde{β})$ represents the covariance matrix of $\tilde{β}$ and $Bias (\tilde{β})$ indicates the bias which equals to $E (\tilde{β}) - β .$ Let ${\tilde{β}}_{1}$ and ${\tilde{β}}_{2}$ be the two estimators of $β,$ the estimator ${\tilde{β}}_{1}$ is said to be superior to the estimator ${\tilde{β}}_{2}$ in the sense of MMSE criterion, if and only if $Δ ({\tilde{β}}_{1}, {\tilde{β}}_{2}) = MMSE ({\tilde{β}}_{1}) - MMSE ({\tilde{β}}_{2}) \geq 0 .$

The scalar MSE is another criterion to gauge the goodness of an estimator and it is defined as $MSE (\tilde{β}) = tr [Cov (\tilde{β})] + Bias {(\tilde{β})}^{'} Bias (\tilde{β}) .$

If $Δ ({\tilde{β}}_{1}, {\tilde{β}}_{2}) \geq 0,$ then $δ ({\tilde{β}}_{1}, {\tilde{β}}_{2}) = MSE ({\tilde{β}}_{1}) - MSE ({\tilde{β}}_{2}) \geq 0 .$ The converse in scalar MSE is not true (Rao et al., 2008), therefore, we consider MMSE criterion to gauge the goodness of proposed estimators. The following two Lemmas 3 and4 are defined to demonstrate the MMSE properties of the estimators.

Lemma 3

(Farebrother Citation1976). Let $Α (Α > 0)$ be a positive-definite matrix, $β$ be a vector of nonzero constants, then $A - β β^{'} \geq 0$ if and only if $β^{'} A^{- 1} β \leq 1 .$

Proof.

See Farebrother (Citation1976).□

Lemma 4.

Let ${\hat{β}}_{j} = F_{j} y,$ $j = 1, 2$ be two competing estimators of $β$ . Suppose $Δ = Cov ({\hat{β}}_{1}) - Cov ({\hat{β}}_{2}) > 0$ , where $Cov ({\hat{β}}_{j}),$ $j = 1, 2$ denotes the covariance matrix of ${\hat{β}}_{j}$ . Then $Δ ({\hat{β}}_{1}, {\hat{β}}_{2}) = MMSE ({\hat{β}}_{1}) - MMSE ({\hat{β}}_{2}) \geq 0 \Leftrightarrow b_{2}^{'} {(Δ + b_{1} b_{1}^{'})}^{- 1} b_{2} \leq 1$ , where $b_{j}$ denote the bias vector of ${\hat{β}}_{j},$ $j = 1, 2 .$

Proof.

See Trenkler and Toutenburg (Citation1990). □

2.3.1. Comparison between ${\hat{β}}_{GHY - M}$ and ${\hat{β}}_{MLE}$

The MMSE of ${\hat{β}}_{GHY - M}$ is computed as (14) $MMSE ({\hat{β}}_{GHY - M}) = [T (X^{'} \hat{W} X) {(T)}^{'}] + [{T (X^{'} \hat{W} X) - I} β β^{'} {T (X^{'} \hat{W} X) - I}^{'}],$ (14) where $T = {(X^{'} \hat{W} X + K)}^{- 1} (X^{'} \hat{W} X + D) {(X^{'} \hat{W} X + I)}^{- 1} .$

To compere ${\hat{β}}_{MLE}$ with ${\hat{β}}_{GHY - M}$ in the sense of MMSE, we compute the MMSE difference as: $Δ_{1} = Δ ({\hat{β}}_{MLE}, {\hat{β}}_{GHY - M}) = MMSE ({\hat{β}}_{MLE}) - MMSE ({\hat{β}}_{GHY - M})$ $= [{(X^{'} \hat{W} X)}^{- 1} - T (X^{'} \hat{W} X) {(T)}^{'}] - [{T (X^{'} \hat{W} X) - I} β β^{'} {T (X^{'} \hat{W} X) - I}^{'}] .$

Theorem 1.

Let $K = diag (k_{1}, k_{2}, \dots, k_{p}), k_{j} \geq 0; j = 1, 2, \dots, p$ and $D = diag (d_{1}, d_{2}, \dots, d_{p}), {0 \leq d}_{j} < 1$ under MNLR model with correlated regressors, the ${\hat{β}}_{GHY - M}$ is superior to ${\hat{β}}_{MLE}$ in the sense of MMSE, namely $MMSE ({\hat{β}}_{MLE}) - MMSE ({\hat{β}}_{GHY - M}) > 0$ if and only if $β^{'} {T (X^{'} \hat{W} X) - I}^{'} {[{(X^{'} \hat{W} X)}^{- 1} - T (X^{'} \hat{W} X) {(T)}^{'}]}^{- 1} {T (X^{'} \hat{W} X) - I} β < 1 .$

Proof.

The difference between $Cov ({\hat{β}}_{MLE})$ and $Cov ({\hat{β}}_{GYC})$ can be written as $Cov ({\hat{β}}_{MLE}) - Cov ({\hat{β}}_{GHY - M}) = {(X^{'} \hat{W} X)}^{- 1} - T (X^{'} \hat{W} X) {(T)}^{'} = diag {[\frac{1}{λ_{h j}} - \frac{{(λ_{h j} + d_{j})}^{2} λ_{h j}}{{(λ_{h j} + k_{j})}^{2} {(λ_{h j} + 1)}^{2}}]}_{j = 1; h = 1}^{p *; C - 1} .$

As observes the performance of ${\hat{β}}_{MLE}$ and ${\hat{β}}_{GHY - M},$ we can see that ${(X^{'} \hat{W} X)}^{- 1} - T (X^{'} \hat{W} X) {(T)}^{'}$ will be positive-definite matrix iff

${(λ_{h j} + k_{j})}^{2} {(λ_{h j} + 1)}^{2} > {(λ_{h j} + d_{j})}^{2} λ_{h j}^{2}$ or $(λ_{h j} + k_{j}) (λ_{h j} + 1) > (λ_{h j} + d_{j}) λ_{h j}$ for $D = diag (d_{1}, d_{2}, \dots, d_{p}), {0 \leq d}_{j} < 1$ and $K = diag (k_{1}, k_{2}, \dots, k_{p}), k_{j} \geq 0; j = 1, 2, \dots, p .$ Simplifying the above inequality one can find that $(λ_{h j} + k_{j}) (λ_{h j} + 1) - (λ_{h j} + d_{j}) λ_{h j} = λ_{h j} + k_{j} + k_{j} λ_{h j} + λ_{h j} + d_{j} > 0 .$ Therefore, it can be concluded that one of the proposed GTP estimators $({\hat{β}}_{GHY - M})$ has a smaller variance-covariance matrix than the existing MLE for the MNLR model with correlated regressors. By using Lemma 3, the proof is completed. □

2.3.2. Comparison between ${\hat{β}}_{GLT-M}$ and ${\hat{β}}_{MLE}$

The MMSE of ${\hat{β}}_{GLT - M}$ is computed as (15) $MMSE ({\hat{β}}_{G LT - M}) = [{(X^{'} \hat{W} X + K)}^{- 1} (X^{'} \hat{W} X + K D) {(X^{'} \hat{W} X)}^{- 1} {{(X^{'} \hat{W} X + K D) (X^{'} \hat{W} X + K)}^{- 1}}^{'}] + [K^{2} {(D - 1)}^{2} {(X^{'} \hat{W} X + K)}^{- 1} β β^{'} {{(X^{'} \hat{W} X + K)}^{- 1}}^{'}] .$ (15)

To compere ${\hat{β}}_{MLE}$ with ${\hat{β}}_{GLT - M}$ in the sense of MMSE, we compute the MMSE difference as: $Δ_{2} = Δ ({\hat{β}}_{MLE}, {\hat{β}}_{GLT - M}) = MMSE ({\hat{β}}_{MLE}) - MMSE ({\hat{β}}_{GLT - M})$ (16) $= [{(X^{'} \hat{W} X)}^{- 1} - {(X^{'} \hat{W} X + K)}^{- 1} (X^{'} \hat{W} X + K D) {(X^{'} \hat{W} X)}^{- 1} {{(X^{'} \hat{W} X + K D) (X^{'} \hat{W} X + K)}^{- 1}}^{'}] - [K^{2} {(D - 1)}^{2} {(X^{'} \hat{W} X + K)}^{- 1} β β^{'} {{(X^{'} \hat{W} X + K)}^{- 1}}^{'}] .$ (16)

Theorem 2.

Let $K = diag (k_{1}, k_{2}, \dots, k_{p}), k_{j} \geq 0; j = 1, 2, \dots, p$ and $D = diag (d_{1}, d_{2}, \dots, d_{p}), {0 \leq d}_{j} < 1$ under MNLR model with correlated regressors, the ${\hat{β}}_{GLT - M}$ is superior to ${\hat{β}}_{MLE}$ in the sense of MMSE, namely $MMSE ({\hat{β}}_{MLE}) - MMSE ({\hat{β}}_{GLT - M}) > 0$ iff $B^{'} {[{(X^{'} \hat{W} X)}^{- 1} - {(X^{'} \hat{W} X + K)}^{- 1} (X^{'} \hat{W} X + K D) {(X^{'} \hat{W} X)}^{- 1} {{(X^{'} \hat{W} X + K D) (X^{'} \hat{W} X + K)}^{- 1}}^{'}]}^{- 1} B < 1$ , where $B = K (D - 1) {(X^{'} \hat{W} X + K)}^{- 1} β .$

Proof.

As observing the comparison of ${\hat{β}}_{MLE}$ and ${\hat{β}}_{GLT - M}$ by variance–covariance matrices as $Cov ({\hat{β}}_{MLE}) - Cov ({\hat{β}}_{GLT - M})$ $= {(X^{'} \hat{W} X)}^{- 1} - {(X^{'} \hat{W} X + K)}^{- 1} (X^{'} \hat{W} X + K D) {(X^{'} \hat{W} X)}^{- 1} {{(X^{'} \hat{W} X + K D) (X^{'} \hat{W} X + K)}^{- 1}}^{'}$ $= diag {[\frac{1}{λ_{h j}} - \frac{{(λ_{h j} + k_{j} d_{j})}^{2}}{{(λ_{h j} + k_{j})}^{2} λ_{h j}}]}_{j = 1; h = 1}^{p + 1; C - 1} .$ We can see that the matrix $Cov ({\hat{β}}_{MLE}) - Cov ({\hat{β}}_{GLT - M})$ will be positive definite iff

${(λ_{h j} + k_{j})}^{2} > {(λ_{h j} + {k_{j} d}_{j})}^{2} or (λ_{h j} + k_{j}) > (λ_{h j} + {k_{j} d}_{j})$ for $D = diag (d_{1}, d_{2}, \dots, d_{p}), {0 \leq d}_{j} < 1$ and $K = diag (k_{1}, k_{2}, \dots, k_{p}), k_{j} \geq 0; j = 1, 2, \dots, p .$ Simplifying the above inequality one can find that ${(λ_{h j} + k_{j})}^{2} - {(λ_{h j} + {k_{j} d}_{j})}^{2} = k_{j} (1 - d_{j}) {2 λ_{h j} + k_{j} + {k_{j} d}_{j}} > 0 .$ The variance-covariance matrix of ${\hat{β}}_{GLT - M}$ has smaller value than the ${\hat{β}}_{MLE}$ iff $d_{j} ({0 \leq d}_{j} < 1)$ and $k_{j} \geq 0,$ then the proof is completed by employing Lemma 3.

3. An algorithm for selection of shrinkage parameters

There are several ways to estimate the parameters k and d are available in literature. However, we propose two methods: first using of the optimal $d_{h j},$ second using the optimal $k_{h j} .$ The following lemmas are present the optimal shrinkage parameters of ${\hat{β}}_{GLT - M}$ and ${\hat{β}}_{GHY - M} .$

Lemma 5.

Optimal $k_{h j}^{*}$

The optimal shrinkage parameter $k_{hj}$ , for minimizing $MSE ({\hat{β}}_{GLT - M}) \forall 0 \leq d_{h j} < 1$ , is given by $k_{h j}^{*} = \frac{λ_{h j}}{λ_{h j} α_{h j}^{2} (1 - d_{h j}) - d_{h j}};$

Since $k_{h j}^{*} > 0,$ then $d_{h j} < \frac{λ_{h j} α_{h j}^{2}}{1 + λ_{h j} α_{h j}^{2}} .$

The optimal shrinkage parameter $k_{hj}$ , for minimizing $MSE ({\hat{β}}_{GHY - M}) \forall 0 \leq d_{h j} < 1$ , is given by $k_{h j (GHY - M)}^{*} = \frac{(λ_{h j} + d_{h j}) - λ_{h j} α_{h j}^{2} (1 - d_{h j})}{λ_{h j} α_{h j}^{2} + α_{h j}^{2}} .$

Lemma 6.

Optimal $d_{h j}^{*}$

The optimal shrinkage parameter $d_{h j}$ for ${\hat{β}}_{GLT - M}$ and ${\hat{β}}_{GHY - M}$ , for all $k_{h j} > 0$ , are given by $d_{h j (GLT - M)}^{*} = \frac{α_{h j}^{2} - \frac{1}{k_{h j}}}{{(\frac{1}{λ_{h j}} + α_{h j}^{2})}^{2}}; d_{h j (GHY - M)}^{*} = \frac{α_{h j}^{2} k_{h j} - λ_{h j} (1 - α_{h j}^{2} k_{h j} - α_{h j}^{2})}{1 + {α_{h j}^{2} λ}_{h j}} .$

For GLT-M estimator, we suggest two methods to select the shrinkage parameters: in the first, we use $d_{h j (GLT - M)}^{*}$ with different proposed estimation of $k_{h j}$ based on the work of Kibria (Citation2003):

${k 1}_{h j} = \frac{1}{{\hat{α}}_{h j}^{2}};$ ${k 2}_{h j} = \frac{\max {(λ}_{h j})}{{\hat{α}}_{h j}^{2}};$ ${k 3}_{h j} = \frac{\max {(λ}_{h j})}{λ_{h j} {\hat{α}}_{h j}^{2}};$ ${k 4}_{h j} = \frac{\max {(λ}_{h j}) \max ({\hat{α}}_{h j}^{2})}{λ_{h j} {\hat{α}}_{h j}^{2}}$ where ${\hat{α}}_{h j} = Ψ' {\hat{β}}_{MLE} .$

While in the second proposed method for selecting shrinkage parameters of GLT-M estimator, we use ${\hat{d}}_{h j} = \frac{1}{2} (\frac{λ_{h j} {\hat{α}}_{h j}^{2}}{1 + λ_{h j} {\hat{α}}_{h j}^{2}}),$ that satisfy the condition of $d_{h j}$ in Lemma 5 with one of the following suggested formulas of $k_{hj} :$

${k 5}_{h j} = \frac{λ_{h j}}{λ_{h j} {\hat{α}}_{h j}^{2} (1 - {\hat{d}}_{h j}) - {\hat{d}}_{h j}}; {k 6}_{h j} = \frac{\max {(λ}_{h j})}{\min {(λ}_{h j})} {k 5}_{h j}; {k 7}_{h j} = \max {(λ}_{h j}) {k 5}_{h j}$

Note that, ${k 1}_{h j}$ is the original estimation of the ridge parameter proposed by Hoerl and Kennard (Citation1970a), while other estimations ( ${k 2}_{h j},$ …, ${k 7}_{h j}$ ) are completely new for Liu- type estimator in the multinomial logistic regression model. For GHY-M estimator, we use Huang and Yang (Citation2014) algorithm for selecting shrinkage parameters but based on the generalized parameters ( $d_{h j (GHY - M)}^{*} and k_{h j (GHY - M)}^{*}$ ).

4. Monte Carlo simulation

A Monte Carlo simulation study has been conducted to compare the performances of MLE, GLT-M (based on ${k 1}_{h j},$ …, ${k 7}_{h j}$ ), and GHY-M estimators. Monte Carlo experiments are carried out based on the response variable $y_{i}$ obtained by using the multinomial distribution $π_{h} (x_{i}) = \frac{e^{\sum_{j = 1}^{p} β_{h j} x_{i j}}}{1 + \sum_{h = 1}^{C - 1} e^{\sum_{j = 1}^{p} β_{h j} x_{i j}}}; i = 1, 2, \dots, n; h = 1, 2, \dots, C - 1,$ where parameters $β_{h j}$ are chosen to be $β' β = 1,$ which is a commonly used restriction in many simulation studies in the field. See for example, Kibria (Citation2003. In this simulation study, we assume that the probability of each level of the response variable is equal. Following the work of Kibria (Citation2003) and Månsson, Kibria, and Shukur (Citation2012), the explanatory variables were generated as $x_{i j} = ω_{i j} \sqrt{1 - ρ^{2}} + ρ ω_{i p + 1}; ω_{i j} \sim N (0, 1) .$

The effective factors are chosen to be the number of explanatory variables (p = 3, 5, and 7), the sample size (n = 50, 100, 200, 300, 400, and 500), the correlation among the explanatory variables ( $ρ =$ 0.90, 0.95, and 0.99), and the levels of the response variable (C = 5, 7, and 9). In our study, the simulated MSE (SMSE) is used as the criterion of judgment, it is computed by using the following equation $SMSE (\hat{β}) = \frac{\sum_{l = 1}^{1000} {({\hat{β}}_{l} - β)}^{'} ({\hat{β}}_{l} - β)}{1000},$ where ${\hat{β}}_{l}$ is the vector of estimated values at l^th experiment of the simulation, while $β$ is the vector of true parameters. The program of the Monte Carlo simulation study is written in R language.

Monte Carlo simulation results are given in and and . Specifically, present SMSE values of the three estimators in the case of a number of explanatory variables ( $p = 3, 5, 7$ ) and with the levels of the response variable $C = 5 .$ While in the cases of $C = 7$ with the same values of $p$ is presented in . Similarly, presents SMSE values of the estimators in the case of $C = 9$ with the same values of $p .$

Figure 1. Relative efficiency of the estimators when C = 7.

Figure 2. Relative efficiency of the estimators when C = 9.

For all simulation situations, it can be noted that SMSE values of GLT-M and GHY-M estimators are less than SMSE values of MLE. This means that theses estimators have better performance than MLE for different cases of $n, ρ, p,$ and $C .$ Moreover, all the estimators have monotonic behaviors according to SMSE values. Namely, when the sample size $n$ increases, the estimated SMSE values decrease. It is obvious from tables and figures that by increasing the sample size affect positively on the performance of all estimators (including MLE). Also, it can be noted that, when $p$ and $ρ$ are fixed, increasing $C$ causes an increase in SMSE values of all estimators without exception. This increase is much larger in MLE than other estimators. Furthermore, when $C$ and $p$ are fixed, increasing $ρ$ affects SMSE values of all estimators negatively, especially MLE. In other words, this increase is much larger in MLE than other estimators. Also, increasing $p$ and $ρ$ with small $n$ inflates SMSE values of all estimators.

As expected, in the case of high multicollinearity, GLT-M and GHY-M estimators showed its best performance by means of the reduction of SMSE values and it is not affected by multicollinearity. But we note that GLT-M estimator for ${k 1}_{h j},$ …, ${k 7}_{h j}$ is better than GHY-M estimator in all simulation situations. And there is some difference between the performances of GLT-M estimators according to the shrinkage parameter $k$ that is used. According to our simulation study, it may be concluded that $K 3$ is the best shrinkage parameter among others in most simulation situations.

5. Application

To illustrate the empirical relevance of the proposed estimators, we analyze Swedish football data in this empirical section. The proposed and existing estimators are elucidated using a dataset regarding the performance of Swedish football teams in the top Swedish league (Allsvenskan) during the year of 2018.Footnote¹

This dataset includes 242 observations and include one dependent variable (Y) is the full time results (H = Home win, D = Draw, A = Away win) of the football team, and nine explanatory variables, which are the pinnacle home win odds (PH), pinnacle draw odds (PD), pinnacle away win odds (PA), maximum Oddsportal home win odds (MaxH), maximum Oddsportal draw win odds (MaxD), maximum Oddsportal away win odds (MaxA), AvgH = average Oddsportal home win odds (AvgH), average Oddsportal draw win odds (AvgD), and average Oddsportal away win odds (AvgA). The effect of these regressors on Y, respectively are demonstrated by the A MNLR analysis.

The variance inflation factor (VIF) values of all explanatory variables are given in . From , it appears that the model is suffering from the multicollinearity problem as all VIFs are greater than 10. Also, the values of correlation coefficients between the explanatory variables are greater than 0.85 (in most of the cases). presents estimates and standard errors (SE) values of MLE, GLT-M and GHY-M estimators. indicates that GLT-M and GHY-M estimates have smaller SE values than MLE estimates. This means that GLT-M and GHY-M are efficient than MLE. These results are consistent with the simulation results in Sec. 4.

Table 4. Person correlation matrix of explanatory variables and VIF.

Download CSV Display Table

Table 5. Parameter estimates and standard errors of MNLR model.

Download CSV Display Table

6. Some concluding remarks

To overcome the problem of multicollinearity, this paper proposes two generalized biased estimators for the multinomial logistic regression model by following the work of Abonazel and Farghali (Citation2019), Farghali (Citation2019), and Huang and Yang (Citation2014). We discuss the MSE properties of the estimators and developed algorithms to estimate the biasing or shrinkage parameters $k_{h j}$ and $d_{h j} .$ A simulation study has been conducted to compare the performance of the estimators and to support the theoretical comparison. Simulation results indicated that increasing the correlation between the independent variables has a negative effect on the MSE, whereas, increasing the number of regressors and amount of correlation have a positive effect on MSE. When the sample size increases, the MSE values of the estimators decrease even when the correlation between regressors is large. It also appeared that proposed GLT-M performed better than both MLE and GHY-M estimator and GHY-M is doing better than MLE. Overall, the estimator based on K3 performed the best followed by K2, K1, and K4. For illustration purposes, Swedish football data are analyzed, which supported the simulation results and consistent with theoretical results of the paper. Finally, we recommend the researchers use the GLT-M estimator with parameter K3.

Notes

1 The data are publicly available on the webpage www.football-data.co.uk.

References

Abonazel, M. R., and R. A. Farghali. 2019. Liu-type multinomial logistic estimator. Sankhya B 81 (2):203–25. doi: 10.1007/s13571-018-0171-4.
Google Scholar
Agresti, A. 2013. Categorical data analysis. 3rd ed. Hoboken, NJ: John Wiley& Sons, Inc.
Google Scholar
Akram, M. N., M. Amin, and M. Qasim. 2020. A new Liu-type estimator for the inverse Gaussian regression model. Journal of Statistical Computation and Simulation 90 (7):1153–72. doi: 10.1080/00949655.2020.1718150.
Web of Science ®Google Scholar
Amin, M., M. Qasim, A. Yasin, and M. Amanullah. 2020a. Almost unbiased ridge estimator in the gamma regression model. Communications in Statistics-Simulation and Computation. Advance online publication. doi:10.1080/03610918.2020.1722837.
PubMed Web of Science ®Google Scholar
Amin, M., M. Qasim, and M. Amanullah. 2019. Performance of Asar and Genç and Huang and Yang’s two-parameter estimation methods for the Gamma regression model. Iranian Journal of Science and Technology, Transactions A: Science 43 (6):2951–63. doi: 10.1007/s40995-019-00777-3.
Web of Science ®Google Scholar
Amin, M., M. Qasim, M. Amanullah, and S. Afzal. 2020b. Performance of some ridge estimators for the gamma regression model. Statistical Papers 61 (3):997–1026. doi: 10.1007/s00362-017-0971-z.
Web of Science ®Google Scholar
Asar, Y., and A. Genç. 2018. A new two-parameter estimator for the Poisson regression model. Iranian Journal of Science and Technology, Transactions A: Science 42 (2):793–803. doi: 10.1007/s40995-017-0174-4.
Web of Science ®Google Scholar
Asl, M. N., H. Bevrani, R. A. Belaghi, and K. Mansson. 2021. Ridge-type shrinkage estimators in generalized linear models with an application to prostate cancer data. Statistical Papers 62 (2):1043–85. doi: 10.1007/s00362-019-01123-w.
Web of Science ®Google Scholar
El-Dash, A., A. El-Hefnawy, and R. Farghali. 2011. Goal programming technique for correcting multicollinearity problem in multinomial logistic regression. The 46th Annual Conference on Statistics, Computer Sciences, and Operation Research, 26–29 Dec., 72–87.
Google Scholar
Farebrother, R. W. 1976. Further results on the mean square error of ridge regression. Journal of the Royal Statistical Society: Series B (Methodological) 38 (3):248–50. doi: 10.1111/j.2517-6161.1976.tb01588.x.
Web of Science ®Google Scholar
Farghali, R. 2014. A suggested biased estimator for correcting multicollinearity in multinomial logistic regression. Egyptian Statistical Journal 58:183–97.
Google Scholar
Farghali, R. 2019. Generalized Liu-Type estimator for linear regression. International Journal of Research and Reviews and in Applied Sciences 38:52–63.
Google Scholar
Hoerl, A. E., and R. W. Kennard. 1970. Ridge regression: Biased estimation for nonorthogonal problems. Technometrics 12 (1):55–67. doi: 10.1080/00401706.1970.10488634.
Web of Science ®Google Scholar
Hosmer, D., S. Lemeshow, and R. Sturdivant. 2013. Applied logistic regression. 3rd ed. New York: John Wiley& Sons, Inc.
Google Scholar
Huang, J., and H. Yang. 2014. A two-parameter estimator in the negative binomial regression model. Journal of Statistical Computation and Simulation 84 (1):124–34. doi: 10.1080/00949655.2012.696648.
Web of Science ®Google Scholar
Karlsson, P., K. Månsson, and B. G. Kibria. 2020. A Liu estimator for the beta regression model and its application to chemical data. Journal of Chemometrics 34 (10):e3300. doi: 10.1002/cem.3300.
Web of Science ®Google Scholar
Kejian, L. 1993. A new class of blased estimate in linear regression. Communications in Statistics - Theory and Methods 22 (2):393–402. doi: 10.1080/03610929308831027.
Web of Science ®Google Scholar
Kibria, B. M. G. 2003. Performance of some new ridge regression estimators. Communications in Statistics - Simulation and Computation 32 (2):419–35. doi: 10.1081/SAC-120017499.
Web of Science ®Google Scholar
Kurtoğlu, F., and M. R. Özkale. 2016. Liu estimation in generalized linear models: Application on gamma distributed response variable. Statistical Papers 57 (4):911–28. doi: 10.1007/s00362-016-0814-3.
Web of Science ®Google Scholar
Luce, R. D. 1959. Individual choice behaviour: A theoretical analysis. New York: Wiley.
Google Scholar
Lukman, A. F., B. Aladeitan, K. Ayinde, and M. R. Abonazel. 2021. Modified ridge-type for the Poisson regression model: Simulation and application. Journal of Applied Statistics. Advance online publication. doi:10.1080/02664763.2021.1889998.
PubMed Web of Science ®Google Scholar
Lukman, A. F., K. Ayinde, B. G. Kibria, and E. T. Adewuyi. 2020. Modified ridge-type estimator for the gamma regression model. Communications in Statistics-Simulation and Computation. Advance online publication. doi:10.1080/03610918.2020.1752720.
Web of Science ®Google Scholar
Mandal, S., R. Arabi Belaghi, A. Mahmoudi, and M. Aminnejad. 2019. Stein‐type shrinkage estimators in gamma regression model with application to prostate cancer data. Statistics in Medicine 38 (22):4310–22. doi: 10.1002/sim.8297.
PubMed Web of Science ®Google Scholar
Månsson, K. 2012. On ridge estimators for the negative binomial regression model. Economic Modelling 29 (2):178–84. doi: 10.1016/j.econmod.2011.09.009.
Web of Science ®Google Scholar
Månsson, K. 2013. Developing a Liu estimator for the negative binomial regression model: Method and application. Journal of Statistical Computation and Simulation 83 (9):1773–80. doi: 10.1080/00949655.2012.673127.
Web of Science ®Google Scholar
Månsson, K., and G. Shukur. 2011a. On ridge parameters in logistic regression. Communications in Statistics – Theory and Methods 40 (18):3366–81. doi: 10.1080/03610926.2010.500111.
Web of Science ®Google Scholar
Månsson, K., and G. Shukur. 2011b. A Poisson ridge regression estimator. Economic Modelling 28 (4):1475–81. doi: 10.1016/j.econmod.2011.02.030.
Web of Science ®Google Scholar
Månsson, K., B. M. G. Kibria, and G. Shukur. 2012. On Liu estimators for the logit regression model. Economic Modelling 29 (4):1483–8. doi: 10.1016/j.econmod.2011.11.015.
Web of Science ®Google Scholar
Månsson, K., B. M. G. Shukur, and B. G. Kibria. 2018. Performance of some ridge regression estimators for the multinomial logit model. Communications in Statistics - Theory and Methods 47 (12):2795–804. doi: 10.1080/03610926.2013.784996.
Web of Science ®Google Scholar
Månsson, K., G. Shukur, and B. M. G. Kibria. 2018. Performance of some ridge regression estimators for the multinomial logit model. Communications in Statistics - Theory and Methods 47 (12):2795–804. doi: 10.1080/03610926.2013.784996.
Web of Science ®Google Scholar
Naveed, K., M. Amin, S. Afzal, and M. Qasim. 2020. New shrinkage parameters for the inverse Gaussian Liu regression. Communications in Statistics – Theory and Methods. Advance online publication. doi:10.1080/03610926.2020.1791339.
Web of Science ®Google Scholar
Noori Asl, M., H. Bevrani, and R. Arabi Belaghi. 2020. Penalized and ridge-type shrinkage estimators in Poisson regression model. Communications in Statistics – Simulation and Computation. Advance online publication. doi:10.1080/03610918.2020.1730402.
Web of Science ®Google Scholar
Özkale, M. R., and S. Kaciranlar. 2007. The restricted and unrestricted two-parameter estimators. Communications in Statistics – Theory and Methods —Theory and Methods, 36 (15):2707–25. doi: 10.1080/03610920701386877.
Web of Science ®Google Scholar
Qasim, M., B. M. G. Kibria, K. Månsson, and P. Sjölander. 2020a. A new Poisson Liu regression estimator: Method and application. Journal of Applied Statistics 47 (12):2258–71. doi: 10.1080/02664763.2019.1707485.
PubMed Web of Science ®Google Scholar
Qasim, M., K. Månsson, and B. M. Golam Kibria. 2021. On some beta ridge regression estimators: Method, simulation and application. Journal of Statistical Computation and Simulation. Advance online publication. doi:10.1080/00949655.2020.1867549.
Web of Science ®Google Scholar
Qasim, M., K. Månsson, M. Amin, B. M. G. Kibria, and P. Sjölander. 2020b. Biased adjusted Poisson ridge estimators-method and application. Iranian Journal of Science and Technology, Transactions A: Science 44 (6):1775–89. doi: 10.1007/s40995-020-00974-5.
PubMed Web of Science ®Google Scholar
Qasim, M., M. Amin, and M. Amanullah. 2018. On the performance of some new Liu parameters for the gamma regression model. Journal of Statistical Computation and Simulation 88 (16):3065–80. doi: 10.1080/00949655.2018.1498502.
Web of Science ®Google Scholar
Rady, E. A., M. R. Abonazel, and I. M. Taha. 2019. New shrinkage parameters for liu-type zero inflated negative binomial estimator. The 54th Annual Conference on Statistics, Computer Science, and Operation Research 3-5 Dec, 2019. FGSSR, Cairo University.
Google Scholar
Schaefer, R. 1986. Alternative estimators in logistic regression when the data are collinear. Journal of Statistical Computation and Simulation 25 (1/2):75–91. doi: 10.1080/00949658608810925.
Google Scholar
Schaefer, R., L. Roi, and R. Wolfe. 1984. A ridge logistic estimator. Communications in Statistics - Theory and Methods 13 (1):99–113. doi: 10.1080/03610928408828664.
Web of Science ®Google Scholar
Segerstedt, B. 1992. On ordinary ridge regression in generalized linear models. Communications in Statistics - Theory and Methods 21 (8):2227–46. doi: 10.1080/03610929208830909.
Web of Science ®Google Scholar
Toker, S., G. Üstündağ Şiray, and M. Qasim. 2019. Developing a first order two parameter estimator for generalized linear model. Paper presented at the 11th International Statistics Congress; Muğla, Turkey.
Google Scholar
Trenkler, G., and H. Toutenburg. 1990. Mean squared error matrix comparisons between biased estimators—an overview of recent results. Statistical Papers 31 (1):165–79. doi: 10.1007/BF02924687.
Google Scholar
Yang, H., and X. Chang. 2010. A new two-parameter estimator in linear regression. Communications in Statistics – Theory and Methods 39 (6):923–34. doi: 10.1080/03610920902807911.
Web of Science ®Google Scholar

Generalized two-parameter estimators in the multinomial logit regression model: methods, simulation and application

Abstract

1. Introduction

2. Model and proposed GTP estimators

2.1. Multinomial logit regression

2.2. New generalized two-parameter estimators

2.2.1. Generalized Liu-type multinomial logistic estimator

2.2.2. Generalized Huang and Yang multinomial logistic estimator

2.3. Matrix MSE properties

2.3.1. Comparison between ${\hat{β}}_{GHY - M}$ and ${\hat{β}}_{MLE}$

2.3.2. Comparison between ${\hat{β}}_{GLT-M}$ and ${\hat{β}}_{MLE}$

3. An algorithm for selection of shrinkage parameters

4. Monte Carlo simulation

Table 1. MSE values of the estimators when $p = 3$ and $C = 5 .$

Table 2. MSE values of the estimators when $p = 5$ and $C = 5 .$

Table 3. MSE values of the estimators when $p = 7$ and $C = 5 .$

5. Application

Table 4. Person correlation matrix of explanatory variables and VIF.

Table 5. Parameter estimates and standard errors of MNLR model.

6. Some concluding remarks

References

Information for

Open access

Opportunities

Help and information

Generalized two-parameter estimators in the multinomial logit regression model: methods, simulation and application

Abstract

1. Introduction

2. Model and proposed GTP estimators

2.1. Multinomial logit regression

2.2. New generalized two-parameter estimators

2.2.1. Generalized Liu-type multinomial logistic estimator

2.2.2. Generalized Huang and Yang multinomial logistic estimator

2.3. Matrix MSE properties

2.3.1. Comparison between β̂GHY−M and β̂MLE

2.3.2. Comparison between β̂GLT-M and β̂MLE

3. An algorithm for selection of shrinkage parameters

4. Monte Carlo simulation

Table 1. MSE values of the estimators when p =3 and C =5.

Table 2. MSE values of the estimators when p =5 and C =5.

Table 3. MSE values of the estimators when p =7 and C =5.

5. Application

Table 4. Person correlation matrix of explanatory variables and VIF.

Table 5. Parameter estimates and standard errors of MNLR model.

6. Some concluding remarks

Notes

References

Related research

To cite this article:

Download citation

Your download is now in progress and you may close this window

Login or register to access this feature

Information for

Open access

Opportunities

Help and information

Keep up to date

2.3.1. Comparison between ${\hat{β}}_{GHY - M}$ and ${\hat{β}}_{MLE}$

2.3.2. Comparison between ${\hat{β}}_{GLT-M}$ and ${\hat{β}}_{MLE}$

Table 1. MSE values of the estimators when $p = 3$ and $C = 5 .$

Table 2. MSE values of the estimators when $p = 5$ and $C = 5 .$

Table 3. MSE values of the estimators when $p = 7$ and $C = 5 .$