
New ridge estimators in the inverse Gaussian regression: Monte Carlo simulation and application to chemical data

Pages 6170-6187 | Received 05 Oct 2019, Accepted 14 Jul 2020, Published online: 04 Aug 2020

Abstract

In numerous application areas where the response variable is continuous, positively skewed, and well fitted by the inverse Gaussian distribution, the inverse Gaussian regression model (IGRM) is an effective approach. The problem of multicollinearity is very common in application areas such as chemometrics, biology, and finance. The effects of multicollinearity can be reduced using the ridge estimator. This research proposes new ridge estimators to address the issue of multicollinearity in the IGRM. The performance of the new estimators is compared with the maximum likelihood estimator and some other existing estimators, using the mean square error as the performance evaluation criterion. A Monte Carlo simulation study is conducted to assess the performance of the new ridge estimators based on the minimum mean square error criterion. The simulation results show that the proposed estimators perform better than the available methods. The proposed ridge estimators are also evaluated using two real chemometric applications. The results of the Monte Carlo simulation and the real applications confirm the superiority of the proposed ridge estimators over competing methods.

MATHEMATICS SUBJECT CLASSIFICATION:

1. Introduction

In real life we often deal with datasets in which the dependent variable is continuous and positively skewed. In such scenarios, the inverse Gaussian regression model (IGRM) is more suitable than the linear regression model (LRM). Maximum likelihood estimation (MLE) is most commonly used to estimate the regression coefficients of the IGRM. Applications of the IGRM are mostly observed in the physical sciences, health sciences, chemical sciences, and engineering (Amin, Amanullah, and Aslam Citation2016; Kinat, Amin, and Mahmood Citation2020; Akram, Amin, and Qasim Citation2020; Naveed et al. Citation2020; Amin, Amanullah, and Qasim Citation2020). Multicollinearity is a pressing problem, particularly in chemometrics, the health sciences, and biostatistics. It occurs when the regressors are linearly correlated. In the presence of multicollinearity, it is not suitable to estimate the regression coefficients of the IGRM by MLE. The MLE has many drawbacks when the regressors are linearly correlated: the estimates may be unstable, with wrong signs of the coefficients, and the variances of the coefficients become inflated. In addition, interpretation of the estimated coefficients becomes difficult (Qasim et al. Citation2019; Amin, Akram, and Amanullah Citation2020). To resolve the issue of multicollinearity, many alternative biased estimation methods have been proposed in the literature. One way to address multicollinearity is ridge regression (RR), first introduced by Hoerl and Kennard (Citation1970) for the LRM. One benefit of RR is that the mean squared error (MSE) can be reduced using an optimal value of the shrinkage parameter k, trading a decrease in variance against an increase in bias. Many studies have been conducted for the LRM estimated by ordinary least squares, and various shrinkage parameters k have been suggested.
For the LRM, some popular studies have been done by Hoerl and Kennard (Citation1970), Hoerl, Kennard, and Baldwin (Citation1975), McDonald and Galarneau (Citation1975), Kibria (Citation2003), Khalaf and Shukur (Citation2005), Muniz and Kibria (Citation2009), and many others.

The literature on ridge estimators for the generalized linear model (GLM) is very limited; however, a few studies have suggested ridge estimators for specific cases of the GLM. For instance, Månsson and Shukur (Citation2011) proposed an RR approach for the Poisson regression model. Kibria, Månsson, and Shukur (Citation2015) evaluated the performance of some biasing parameters for ridge-type estimation of the Poisson regression model. Furthermore, Månsson (Citation2012) suggested ridge estimators for the negative binomial regression model. Recently, Amin et al. (Citation2020a) and Algamal (Citation2018a) proposed some ridge estimators and recommended the best ridge estimator for gamma RR. Algamal (Citation2018b) presented an efficient algorithm for estimating the biasing parameter of the ridge estimator, inspired by the working process of the kidney in the human body. Some new shrinkage estimators for the gamma regression model were suggested by Algamal (Citation2018c). Amin et al. (Citation2020b) proposed an adjusted biased estimator to overcome the problem of inflated bias in the gamma RR estimator. Algamal (Citation2019) proposed a ridge estimator for the IGRM. Shamany, Alobaidi, and Algamal (Citation2019) introduced a two-parameter estimator for the IGRM, which was applied to a chemometric dataset. Amin, Qasim, and Amanullah (Citation2019) and Akram, Amin, and Amanullah (Citation2020) proposed two-parameter estimators for the gamma regression model and the IGRM, respectively. Different studies have thus proposed ridge estimators and recommended a best ridge parameter for different forms of the GLM. From the available literature, we conclude that a study is needed to choose a suitable ridge parameter for inverse Gaussian ridge regression (IGRR).

This article introduces new estimation methods for choosing the best ridge parameter for the IGRR in order to decrease the MSE and obtain efficient results. In addition, we derive the MSE properties of the IGRR. The performance of the proposed ridge estimators is compared with the best existing ridge estimators for different types of models. A Monte Carlo simulation is used to evaluate the performance of the proposed and existing estimators in terms of MSE. The advantage of the proposed estimators is shown using two different applications, where one can see that the newly proposed estimators are an effective alternative to the MLE and other existing biased estimators in the presence of multicollinearity.

The rest of the article is structured as follows. In Sec. 2, we discuss the estimation methods of the IGRM and the IGRR and derive their MSE properties. In Sec. 3, we define the new ridge estimators and modify the existing estimators for the IGRR. Details of the Monte Carlo simulation and the simulated results are presented in Sec. 4. The advantage of the proposed estimators is shown with real-life applications in Sec. 5. Finally, concluding remarks are given in Sec. 6.

2. Theory and method

2.1. The IGRM

The IGRM is very common in applied research work when the dependent variable $y_i$ is continuous, heavily positively skewed, and well fitted by the inverse Gaussian (IG) distribution with location parameter $\mu$ and scale parameter $\sigma^2$, denoted IG$(\mu,\sigma^2)$. The probability density function is

(1) $f(y;\mu,\sigma^2)=\dfrac{1}{\sqrt{2\pi y^3\sigma^2}}\exp\left\{-\dfrac{(y-\mu)^2}{2\mu^2 y\sigma^2}\right\},\quad y>0,$

with mean $\mu$ and variance $\sigma^2\mu^3$. Suppose the dependent variable $y$ is distributed as IG$(\mu,\sigma^2)$. In the IGRM, $g(\mu)=1/\mu^2=\eta$ is the link function, where $\eta=X^t\beta$ is the linear predictor, $X$ is the $n\times(p+1)$ data matrix of the explanatory variables with rows $x_i^t=(1,x_{i1},\dots,x_{ip})$, and $\beta$ is the $(p+1)\times 1$ vector of regression parameters. Within the GLM framework, the IG distribution is a member of the exponential family, whose general density is

(2) $f(y;\theta,\phi)=\exp\left[\dfrac{y\theta-b(\theta)}{\phi}+c(y,\phi)\right],$

where $\theta$ is the location parameter (but not necessarily the mean), $b(\theta)$ is the cumulant function, and $\phi$ is the dispersion parameter. The IG distribution in exponential-family form is written as

(3) $f(y;\theta,\phi)=\exp\left[\dfrac{-\frac{y}{2\mu^2}+\frac{1}{\mu}}{\phi}-\dfrac{1}{2y\phi}-\dfrac{1}{2}\ln(2\pi y^3\phi)\right].$

By equating Eqs. (2) and (3), we have $\theta=-\frac{1}{2\mu^2}$, $b(\theta)=-\sqrt{-2\theta}$, and $\phi=\sigma^2$. Thus, the mean and variance of the IG density are given in Eqs. (4) and (5), respectively:

(4) $E(y)=b'(\theta)=\left(\dfrac{1}{\mu^2}\right)\left(\mu^3\right)=\mu=\eta^{-1/2},$

(5) $V(y)=\phi\,b''(\theta)=\phi\,V(\mu)=\phi\,\mathrm{diag}(\mu_1^3,\dots,\mu_n^3),$

where the prime denotes differentiation with respect to $\theta$. The log-likelihood function corresponding to Eq. (3) is

$l(y_i;\mu_i,\phi)=\sum_{i=1}^{n}\left\{\dfrac{-\frac{y_i}{2\mu_i^2}+\frac{1}{\mu_i}}{\phi}-\dfrac{1}{2y_i\phi}-\dfrac{1}{2}\ln(2\pi y_i^3\phi)\right\},$

or equivalently, with $\mu_i=(x_i^t\beta)^{-1/2}$,

(6) $l(y_i;x_i^t\beta,\phi)=\sum_{i=1}^{n}\left\{\dfrac{-\frac{y_i x_i^t\beta}{2}+\sqrt{x_i^t\beta}}{\phi}-\dfrac{1}{2y_i\phi}-\dfrac{1}{2}\ln(2\pi y_i^3\phi)\right\}.$

The MLE of $\beta$ can be obtained by solving the score equations

(7) $U(\beta)=\dfrac{\partial l}{\partial\beta}=-\dfrac{1}{2\phi}\sum_{i=1}^{n}\left(y_i-\dfrac{1}{\sqrt{x_i^t\beta}}\right)x_i=0.$

Since Eq. (7) is non-linear in $\beta$, the Newton–Raphson iterative procedure is used to estimate the unknown parameters. Initial values and the full estimation algorithm for the IGRM can be found in Hardin and Hilbe (Citation2012). Let $\beta^{(m)}$ be the approximate maximum likelihood value of $\beta$ at the $m$th iteration; with convergence in deviance, the iterative method gives the relation

(8) $\beta^{(m+1)}=\beta^{(m)}+\{I(\beta^{(m)})\}^{-1}U(\beta^{(m)}),$

where $I(\beta^{(m)})$ is the Fisher information matrix and $U(\beta^{(m)})$ is the $(p+1)\times 1$ score vector, both evaluated at $\beta^{(m)}$. At convergence in deviance, the unknown regression parameters can be calculated as

(9) $\hat\beta_{ML}=S^{-1}X^t\hat W\hat z,$

where $S=X^t\hat WX$, $\hat z_i=\hat\eta_i+(y_i-\hat\mu_i)/\hat\mu_i^3$ is the adjusted response variable, and $\hat W=\mathrm{diag}(\hat\mu_1^3,\hat\mu_2^3,\dots,\hat\mu_n^3)$. Here, $\hat\mu_i=(x_i^t\hat\beta_{ML})^{-1/2}$ and $\hat\eta_i=x_i^t\hat\beta_{ML}$, $i=1,2,\dots,n$. Both $\hat z$ and $\hat W$ are found by the iterative method; for detailed derivations, readers are referred to Hardin and Hilbe (Citation2012). Moreover, the MSE of the MLE is given as

(10) $E(L_{ML}^2)=E\left[(\hat\beta_{ML}-\beta)^t(\hat\beta_{ML}-\beta)\right]=\hat\phi\,\mathrm{tr}(S^{-1})=\hat\phi\sum_{j=1}^{p+1}\dfrac{1}{\lambda_j},$

where $\lambda_j$ is the $j$th eigenvalue of the $X^t\hat WX$ matrix and $\hat\phi$ is the estimated dispersion parameter, computed as

$\hat\phi=\dfrac{1}{n-(p+1)}\sum_{i=1}^{n}\dfrac{(y_i-\hat\mu_i)^2}{V(\hat\mu_i)}=\dfrac{1}{n-(p+1)}\sum_{i=1}^{n}\dfrac{\left(y_i-(x_i^t\hat\beta_{ML})^{-1/2}\right)^2}{\left(x_i^t\hat\beta_{ML}\right)^{-3/2}}.$

One disadvantage of the MLE is that its MSE becomes large when the explanatory variables are linearly correlated, since some of the eigenvalues $\lambda_j$ become small. Eq. (10) clearly shows that the MSE of the MLE is inflated by the problem of multicollinearity.
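The scoring iteration of Eqs. (8)–(10) can be sketched in a few lines. The following is an illustrative implementation of ours (not the authors' code), assuming the link $\mu_i=(x_i^t\beta)^{-1/2}$; for this link the step $I^{-1}U$ reduces to $-2(X^t\hat WX)^{-1}X^t(y-\hat\mu)$ with $\hat W=\mathrm{diag}(\hat\mu_i^3)$, and the dispersion is estimated as below Eq. (10).

```python
import numpy as np

def fit_igrm(X, y, tol=1e-8, max_iter=200):
    """Fisher scoring (Eq. (8)) for the IGRM with mu_i = (x_i' beta)^(-1/2).
    For this link the step I^{-1} U reduces to
    -2 (X' W X)^{-1} X' (y - mu), with W = diag(mu^3)."""
    eta = 1.0 / y ** 2                        # crude start: g(y) as working predictor
    beta = np.linalg.lstsq(X, eta, rcond=None)[0]
    for _ in range(max_iter):
        eta = np.clip(X @ beta, 1e-10, None)  # keep the linear predictor positive
        mu = eta ** -0.5
        S = X.T @ (X * (mu ** 3)[:, None])    # S = X' W X
        step = -2.0 * np.linalg.solve(S, X.T @ (y - mu))
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    # moment-type dispersion estimate, as below Eq. (10)
    mu = np.clip(X @ beta, 1e-10, None) ** -0.5
    phi = np.sum((y - mu) ** 2 / mu ** 3) / (len(y) - X.shape[1])
    return beta, phi
```

On data simulated from the model the iteration typically converges in a handful of steps; the clipping of the linear predictor is our own safeguard, not part of the paper's algorithm.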

2.2. The IGRR

The IGRR is an effective alternative to the MLE in the presence of multicollinearity, where the MSE of the MLE is inflated and its results are inefficient. To solve the problem of collinearity among the explanatory variables, Segerstedt (Citation1992) recommended the RR method for the GLM, based on the work of Hoerl and Kennard (Citation1970). The regression coefficients of the IGRM are estimated by MLE using the iterative weighted least squares approach. Let $\beta^*$ be an estimate of the vector $\beta$. Then, the weighted sum of squared errors (WSSE) can be decomposed as

(11) $\mathrm{WSSE}=(\hat z-X\beta^*)^t\hat W(\hat z-X\beta^*)=(\hat z-X\hat\beta_{ML})^t\hat W(\hat z-X\hat\beta_{ML})+(\beta^*-\hat\beta_{ML})^tS(\beta^*-\hat\beta_{ML}).$

Here $(\beta^*-\hat\beta_{ML})^tS(\beta^*-\hat\beta_{ML})$ is the increase in the WSSE when $\hat\beta_{ML}$ is replaced by $\beta^*$. To obtain the IGRR estimator, the squared length $\beta^{*t}\beta^*$ is minimized subject to $(\beta^*-\hat\beta_{ML})^tS(\beta^*-\hat\beta_{ML})=c$ for a fixed constant $c$, which can be described as the Lagrangian problem

(12) Minimize $F=\beta^{*t}\beta^*+\dfrac{1}{k}\left[(\beta^*-\hat\beta_{ML})^tS(\beta^*-\hat\beta_{ML})-c\right],$

where $1/k$ represents the Lagrange multiplier. Differentiating with respect to $\beta^*$ and setting the result to zero gives

(13) $\dfrac{\partial F}{\partial\beta^*}=2\beta^*+\dfrac{1}{k}\left(2S\beta^*-2S\hat\beta_{ML}\right)=0.$

Thus, after simplifying Eq. (13), the final form of the IGRR estimator may be written as

(14) $\hat\beta_{RR}=(S+kI)^{-1}S\hat\beta_{ML},$

where $k>0$ represents the ridge parameter of the IGRR and $I$ is the $(p+1)\times(p+1)$ identity matrix.
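Given $S$ and $\hat\beta_{ML}$ from the ML fit, Eq. (14) is a one-line computation. The helper below is a minimal sketch of ours (function and argument names are not from the paper):

```python
import numpy as np

def igrr_estimator(X, W, beta_ml, k):
    """Ridge estimate of Eq. (14): beta_RR = (S + k I)^(-1) S beta_ML,
    with S = X' W X evaluated at the ML fit.  W is the vector of
    fitted IRLS weights mu_hat^3; k > 0 is the ridge parameter."""
    S = X.T @ (X * W[:, None])
    p1 = S.shape[0]
    return np.linalg.solve(S + k * np.eye(p1), S @ beta_ml)
```

At $k=0$ the estimator reduces to $\hat\beta_{ML}$, and it shrinks toward the zero vector as $k$ grows.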

2.3. Bias and MSE of the IGRR

The bias and MSE of the IGRR estimator are derived as follows:

(15) $\mathrm{Bias}(\hat\beta_{RR})=E(\hat\beta_{RR})-\beta=E\left[(S+kI)^{-1}S\hat\beta_{ML}\right]-\beta=(S+kI)^{-1}S\beta-\beta,$

(16) $\mathrm{Bias}(\hat\beta_{RR})=(S+kI)^{-1}(S+kI-kI)\beta-\beta=\left[(S+kI)^{-1}(S+kI)-k(S+kI)^{-1}\right]\beta-\beta=\left[I-k(S+kI)^{-1}\right]\beta-\beta.$

Hence the bias of the IGRR estimator is

(17) $\mathrm{Bias}(\hat\beta_{RR})=-k(S+kI)^{-1}\beta.$

The matrix MSE (MMSE) of $\hat\beta_{RR}$ can be written as

$\mathrm{MMSE}(\hat\beta_{RR})=E\left[(\hat\beta_{RR}-\beta)(\hat\beta_{RR}-\beta)^t\right]=\mathrm{Var}(\hat\beta_{RR})+\mathrm{Bias}(\hat\beta_{RR})\,\mathrm{Bias}(\hat\beta_{RR})^t,$

(18) $\mathrm{MMSE}(\hat\beta_{RR})=\hat\phi\,(S+kI)^{-1}S(S+kI)^{-1}+k^2(S+kI)^{-1}\beta\beta^t(S+kI)^{-1}.$

The scalar MSE of $\hat\beta_{RR}$ can be found by applying the trace operator to Eq. (18):

(19) $\mathrm{MSE}(\hat\beta_{RR})=\mathrm{tr}\left[\mathrm{MMSE}(\hat\beta_{RR})\right]=\mathrm{tr}\left[\mathrm{Var}(\hat\beta_{RR})\right]+\mathrm{Bias}(\hat\beta_{RR})^t\,\mathrm{Bias}(\hat\beta_{RR}),$

(20) $\mathrm{MSE}(\hat\beta_{RR})=\hat\phi\sum_{j=1}^{p+1}\dfrac{\lambda_j}{(\lambda_j+k)^2}+k^2\sum_{j=1}^{p+1}\dfrac{\alpha_j^2}{(\lambda_j+k)^2}=\gamma_1(k)+\gamma_2(k),$

where $\lambda_j$ is the $j$th eigenvalue of the matrix $S$, $\alpha=Q^t\beta$, and $Q$ is the matrix of eigenvectors in the decomposition $S=Q\Lambda Q^t$, with $\Lambda=\mathrm{diag}(\lambda_1,\lambda_2,\dots,\lambda_{p+1})$.
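Eq. (20) makes the variance–bias trade-off easy to evaluate numerically. The sketch below (our own illustration, not the authors' code) computes $\gamma_1(k)+\gamma_2(k)$ from the eigenvalues $\lambda_j$ and the rotated coefficients $\alpha$:

```python
import numpy as np

def mse_igrr(eigvals, alpha, phi, k):
    """Scalar MSE of Eq. (20): variance part gamma1(k) plus
    squared-bias part gamma2(k), computed from the eigenvalues
    lambda_j of S and the rotated coefficients alpha = Q' beta."""
    lam = np.asarray(eigvals, dtype=float)
    a2 = np.asarray(alpha, dtype=float) ** 2
    gamma1 = phi * np.sum(lam / (lam + k) ** 2)        # decreasing in k
    gamma2 = k ** 2 * np.sum(a2 / (lam + k) ** 2)      # increasing in k
    return gamma1 + gamma2
```

At $k=0$ this returns $\hat\phi\sum_j 1/\lambda_j$, the MSE of the MLE in Eq. (10); for suitable $k>0$ it falls below that value, which is the content of Theorem 2.4.3 below.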

2.4. The superiority of the IGRR to the MLE

Hoerl and Kennard (Citation1970) derived three theorems about the MSE properties of RR estimator in the LRM. Here, we show that these theorems also hold for the IGRR estimator. In addition, we also show the superiority of the IGRR estimator to the MLE.

Theorem 2.4.1: The total variance γ1(k) is a continuous, monotonically decreasing function of k.

Proof.

The first derivative of $\gamma_1(k)$ with respect to $k$, using Eq. (20), is given by

(21) $\gamma_1'(k)=\dfrac{d\gamma_1(k)}{dk}=-2\hat\phi\sum_{j=1}^{p+1}\dfrac{\lambda_j}{(\lambda_j+k)^3}.$

Eq. (21) shows that $\gamma_1'(k)<0$ for all $k>0$, since $\lambda_j>0$; hence $\gamma_1(k)$ is a continuous, monotonically decreasing function of $k$.

Theorem 2.4.2: The squared bias γ2(k) is a continuous, monotonically increasing function of k.

Proof.

Taking the first derivative of $\gamma_2(k)$ with respect to $k$ using Eq. (20), we have

(22) $\gamma_2'(k)=\dfrac{d\gamma_2(k)}{dk}=2k\sum_{j=1}^{p+1}\dfrac{\alpha_j^2}{(\lambda_j+k)^2}-2k^2\sum_{j=1}^{p+1}\dfrac{\alpha_j^2}{(\lambda_j+k)^3}=2k\sum_{j=1}^{p+1}\dfrac{\lambda_j\alpha_j^2}{(\lambda_j+k)^3}.$

Eq. (22) clearly shows that $\gamma_2'(k)>0$ for all $k>0$, since $\lambda_j>0$; hence $\gamma_2(k)$ is a continuous, monotonically increasing function of $k$.

Theorem 2.4.3: There always exists a $k>0$ such that $E(L_{RR}^2)<E(L_{ML}^2)=\hat\phi\sum_{j=1}^{p+1}\frac{1}{\lambda_j}$.

Proof.

The first derivative of Eq. (20) with respect to $k$ equals

(23) $\dfrac{dE(L_{RR}^2)}{dk}=\gamma_1'(k)+\gamma_2'(k)=-2\hat\phi\sum_{j=1}^{p+1}\dfrac{\lambda_j}{(\lambda_j+k)^3}+2k\sum_{j=1}^{p+1}\dfrac{\lambda_j\alpha_j^2}{(\lambda_j+k)^3}.$

First note from Eq. (20) that if $k=0$, then $\gamma_1(0)=\hat\phi\sum_{j=1}^{p+1}1/\lambda_j$ and $\gamma_2(0)=0$. Theorems 2.4.1 and 2.4.2 confirmed that $\gamma_1(k)$ and $\gamma_2(k)$ are monotonically decreasing and increasing functions, respectively; correspondingly, $\gamma_1'(k)$ is always non-positive and $\gamma_2'(k)$ always non-negative. Hence, to verify the theorem, it is only necessary to show that there always exists a $k>0$ such that $dE(L_{RR}^2)/dk<0$. From Eq. (23), the derivative can be written as $2\sum_{j=1}^{p+1}\lambda_j(k\alpha_j^2-\hat\phi)/(\lambda_j+k)^3$, so it is negative whenever $k<\hat\phi/\alpha_{\max}^2$.

3. Optimal choice of ridge parameters

In this section, we modify the existing ridge parameters for the IGRR. In addition, we propose new ridge parameters for the IGRR.

3.1. Existing best ridge parameters in the GLM

Various methods have been recommended for the LRM, and such methods have been generalized to the GLM. First, we consider the most popular one, used in various GLMs and based on the work of Hoerl, Kennard, and Baldwin (Citation1975), which we modify for the IGRR as

$k_1=k_{HKB}=\dfrac{p\hat\phi}{\hat\alpha^t\hat\alpha}.$

Månsson and Shukur (Citation2011) recommended the best ridge parameter for the Poisson RR, which we modify for the IGRR as

$k_2=\max\left(\dfrac{\hat\alpha_j^2}{\hat\phi}\right).$

Månsson (Citation2012) suggested the similar ridge parameter ($k_2$) together with some other best ridge parameters for the negative binomial RR, which we modify for the IGRR as

$k_3=\left(\prod_{j=1}^{p+1}\dfrac{\hat\alpha_j^2}{\hat\phi}\right)^{\frac{1}{p+1}},\qquad k_4=\mathrm{median}\left(\dfrac{\hat\alpha_j^2}{\hat\phi}\right).$

Kibria, Månsson, and Shukur (Citation2012) recommended the best ridge parameter for the logistic RR, which we modify for the IGRR as

$k_5=\left(\prod_{j=1}^{p+1}\dfrac{1}{q_j}\right)^{\frac{1}{p+1}},\qquad k_6=\mathrm{median}(q_j),$

where $q_j=\dfrac{\lambda_{\max}\hat\phi}{(n-p)\hat\phi+\lambda_{\max}\hat\alpha_j^2}$. Amin et al. (Citation2020a) found the following best ridge parameters for the gamma RR model:

$k_7=\mathrm{diag}\left(\dfrac{\hat\phi}{\hat\alpha_j^2}\left[1+\sqrt{1+\left(\dfrac{\hat\phi}{\hat\alpha_j^2}\right)^2}\right]\right),\qquad k_8=\dfrac{\hat\phi}{\left(\prod_{j=1}^{p+1}\hat\alpha_j^2\right)^{\frac{1}{p+1}}}.$

3.2. Proposed ridge parameters for the IGRR

To find an optimal value of $k$, we differentiate Eq. (20) with respect to $k$, equate the $j$th term to zero, and solve for the ridge parameter. After simplification, we can define the general optimal ridge parameter as

(24) $k_j=\dfrac{\hat\phi}{\hat\alpha_j^2}.$

Using Eq. (24), we propose the following new ridge parameters:

$k_9=\mathrm{diag}(k_j),\quad k_{10}=\max(k_j),\quad k_{11}=\min(k_j),\quad k_{12}=\dfrac{\sum_{j=1}^{p+1}k_j}{p+1},\quad k_{13}=\mathrm{median}(k_j),$

$k_{14}=\left(\prod_{j=1}^{p+1}k_j\right)^{\frac{1}{p+1}},\quad k_{15}=\dfrac{\hat\phi\sum_{j=1}^{p+1}\hat\lambda_j}{\sum_{j=1}^{p+1}\hat\alpha_j^2\hat\lambda_j},$

where $\hat\lambda_j$ is the $j$th estimated eigenvalue of $X^t\hat WX$, ordered so that $\hat\lambda_1\ge\hat\lambda_2\ge\dots\ge\hat\lambda_{p+1}$.
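The proposed rules $k_9$–$k_{15}$ are simple aggregates of $k_j=\hat\phi/\hat\alpha_j^2$. A sketch of their computation (our own helper, assuming $S$, $\hat\beta_{ML}$, and $\hat\phi$ are available from the ML fit):

```python
import numpy as np

def proposed_ridge_parameters(S, beta_ml, phi):
    """Sketch of the proposed rules k9-k15 built from k_j = phi / alpha_j^2
    (Eq. (24)), where alpha = Q' beta_ML and lambda_j are the eigenvalues
    of S = X' W X.  Returns a dict; k9 is the vector of individual k_j."""
    lam, Q = np.linalg.eigh(S)               # S symmetric: S = Q diag(lam) Q'
    alpha = Q.T @ beta_ml
    kj = phi / alpha ** 2
    return {
        "k9":  kj,                           # one parameter per coefficient
        "k10": np.max(kj),
        "k11": np.min(kj),
        "k12": np.mean(kj),                  # arithmetic mean
        "k13": np.median(kj),
        "k14": np.exp(np.mean(np.log(kj))),  # geometric mean
        "k15": phi * np.sum(lam) / np.sum(alpha ** 2 * lam),
    }
```

Each scalar rule can then be plugged into Eq. (14) to produce a single ridge fit.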

Now, we compare the performance of proposed ridge parameters with the available best ridge parameters for the IGRR model to determine which one performs better in terms of minimum MSE.

4. The Monte Carlo simulation

The main aim of this study is to propose ridge estimators for the IGRM and, in addition, new estimation methods for the ridge parameter in the IGRR. The performance of the proposed estimators is compared with the existing ridge estimators via Monte Carlo simulation under different factors, in terms of MSE.

4.1. The design of an experiment

We use the MSE as the performance criterion to gauge the estimators, defined as

(25) $\mathrm{MSE}=\dfrac{\sum_{i=1}^{M}(\hat\beta_i-\beta)^t(\hat\beta_i-\beta)}{M},$

where $M=5000$ is the number of replications and $\hat\beta_i$ is the estimate of $\beta$ in the $i$th replication. The dependent variable of the IGRM is generated from the IG distribution with mean

(26) $\mu_i=E(y_i)=\left(\beta_0+\beta_1x_{i1}+\beta_2x_{i2}+\dots+\beta_px_{ip}\right)^{-1/2},\quad i=1,2,\dots,n.$

The IGRM in Eq. (26) is generated for $p=3$, 6, and 9 explanatory variables, respectively. The regression parameter values are selected so that $\sum_{j=1}^{p+1}\beta_j^2=1$, which is a common restriction in simulation studies. The correlated explanatory variables are generated as

(27) $x_{ij}=(1-\alpha_0^2)^{1/2}z_{ij}+\alpha_0 z_{i(p+1)},\quad i=1,2,\dots,n,\ j=1,2,\dots,p,$

where $\alpha_0^2$ is the correlation between any two explanatory variables and the $z_{ij}$ are independent standard normal pseudorandom numbers. The Monte Carlo simulation is designed under different factors, which are presented in Table 1.
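The data-generating scheme of Eqs. (26) and (27) can be sketched as follows. This is our own illustrative code, not the authors': numpy's Wald generator is IG$(\mu,\lambda)$ with variance $\mu^3/\lambda$, so we set $\lambda=1/\phi$, and we clip the linear predictor to keep it positive, a safeguard the paper does not spell out.

```python
import numpy as np

def generate_data(n, p, alpha0, phi, beta, rng):
    """One simulation replicate: regressors with pairwise correlation
    alpha0**2 via Eq. (27), and IG responses with mean
    mu_i = (x_i' beta)^(-1/2) from Eq. (26)."""
    Z = rng.standard_normal((n, p + 1))
    X = np.sqrt(1 - alpha0 ** 2) * Z[:, :p] + alpha0 * Z[:, [p]]
    X = np.column_stack([np.ones(n), X])      # add the intercept column
    eta = X @ beta
    mu = np.clip(eta, 1e-10, None) ** -0.5    # mean of Eq. (26)
    y = rng.wald(mu, 1.0 / phi)               # IG response, Var = phi * mu^3
    return X, y
```

Repeating this $M$ times and averaging $(\hat\beta_i-\beta)^t(\hat\beta_i-\beta)$ gives the simulated MSE of Eq. (25).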

Table 1. Assumed values of different factors for Monte Carlo simulation.

4.2. Results and discussion

The MSEs of the ML and IGRR estimators under different factors are obtained from the simulation experiments, and the results are given in Tables 2–4. Table 2 presents the estimated MSEs of the ridge parameters for p = 3. It can be observed from the simulation results that the MSE is inflated under different values of collinearity, dispersion, and sample size: the MSE of the ML estimator is large, while the MSE of the IGRR estimator is small. We considered different ridge parameters for the IGRR model, and different patterns in their MSEs can be observed under different conditions. The MSEs of the ridge parameters k1–k15 increase as the degree of collinearity rises from a lower to a higher level, holding n and ϕ constant; this holds for all p, n, and ϕ. The MSEs of all ridge parameters also increase as the dispersion increases. Moreover, the MSE of the ridge parameters is affected inversely by the sample size: all ridge parameters attain smaller MSE as the sample size increases, for all levels of collinearity, numbers of explanatory variables, and dispersion. Another factor affecting the MSE is the number of explanatory variables, which has a direct effect: the MSE of the ridge parameters increases with the number of explanatory variables in the fitted IGRR model. On comparing the ridge parameters, we find that for p = 3, n = 25, ϕ = 0.10, and all collinearity levels, the ridge parameter k10 performs best, attaining the smallest MSE among all ridge parameters, while for n ≥ 50, p = 3, ϕ = 0.10, and all collinearity levels, the ridge parameter k6 is the best choice. The latter also performs well for ϕ ≥ 2 under the same conditions.
Moreover, the ridge parameter k10 performs well when 0.10 < ϕ < 2. From Tables 3 and 4, we observe that for p = 6 and p = 9 with moderate multicollinearity and small sample size, the ridge parameter k7 attains the minimum MSE, while for the same numbers of explanatory variables and sample sizes with high multicollinearity and ϕ = 0.10, k10 gives a smaller MSE than the other ridge parameters. On the whole, for these numbers of explanatory variables and the other parametric conditions, the two ridge parameters k7 and k10 perform better than the rest. We can therefore say that k6, k7, and k10 are the best ridge parameters for the IGRR model to overcome the issue of multicollinearity.

Table 2. Simulated MSE when p=3.

Table 3. Simulated MSE when p=6.

Table 4. Simulated MSE when p=9.

5. Applications

In this section, we evaluate the performance of the proposed methods using two real-life data sets: the stack loss data and the nitrogen dioxide data.

5.1. Application 1: stack loss data

Brownlee (Citation1965) first used the stack loss data for model inference. This data set consists of n = 21 operations of a plant for the oxidation of ammonia to nitric acid, with one dependent variable (stack loss) and p = 3 explanatory variables (air flow, cooling water inlet temperature, and acid concentration). The variables are as follows: y = the percent of the ingoing ammonia that is lost by escaping in the unabsorbed nitric oxides; x1 = air flow (which reflects the rate of operation of the plant); x2 = temperature of the cooling water in the coils of the absorbing tower for the nitric oxides; x3 = concentration of nitric acid in the absorbing liquid. We use this data set to compare the applicability of the proposed ridge parameters for the IGRR because the stack loss data are well fitted by the IG distribution. The distribution-fitting results using goodness-of-fit tests are given in Table 5.

Table 5. Goodness of fit tests.

The problem of multicollinearity is tested using the condition index (CI). We observe that CI = λmax/λmin = 145.16, which indicates the existence of severe multicollinearity among the explanatory variables. So we use the IGRR to overcome the effect of multicollinearity in the IGRM. After fitting the IGRM, the MSEs of the MLE and of the IGRR with different ridge parameters are computed using Eqs. (10) and (20), respectively. The estimated regression coefficients and MSEs of the different ridge parameters of the IGRR and of the MLE are given in Table 6. On comparing the ridge parameters, we observe that our proposed ridge parameters perform better than those found to be best for other GLMs. For this data set, k10 and k15 are the best ridge parameters because they attain the minimum MSE compared with the other ridge parameters.
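The CI used here is the eigenvalue ratio $\lambda_{\max}/\lambda_{\min}$ of $S=X^t\hat WX$ (some authors use its square root instead). A minimal sketch of ours:

```python
import numpy as np

def condition_index(X, W):
    """Condition index in the ratio form used above, CI = lambda_max/lambda_min,
    from the eigenvalues of S = X' W X (W = vector of fitted IRLS weights)."""
    lam = np.linalg.eigvalsh(X.T @ (X * W[:, None]))
    return lam.max() / lam.min()
```

Values far above those of a well-conditioned design (such as the 145.16 reported for the stack loss data) signal severe multicollinearity.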

Table 6. The estimated regression coefficients and MSE for stack loss data.

Table 7. Goodness of fit tests.

5.2. Application 2: nitrogen dioxides data

For the evaluation of the proposed estimators, we consider another example: the nitrogen dioxide data. The main aim of this application is to assess the effect of three explanatory variables (p = 3) on the response variable (nitrogen dioxide, measured in ppm). The explanatory variables are: x1, the humidity in the air; x2, the temperature; and x3, the barometric pressure. First, it is necessary to identify the distribution of the response variable in order to select the appropriate regression model. Table 7 shows the three tests used to test the distribution of the response variable. The results of the Cramér–von Mises test indicate that the most suitable distribution is the IG, with test statistic (p-value) equal to 0.0518 (0.6266). Moreover, the estimated dispersion parameter is 0.0042.

Second, multicollinearity is tested via the CI. The estimated CI equals 58.0951, indicating severe multicollinearity among the explanatory variables. The nitrogen dioxide data are well fitted by the IG distribution and the explanatory variables are collinear, so we apply the IGRR to measure the effect of humidity, temperature, and barometric pressure on nitrogen dioxide and to check the superiority of the proposed ridge parameters.

The estimated regression coefficients under the different estimators, with their respective MSEs, are given in Table 8. From Table 8, we observe that the IGRR with the new shrinkage parameters attains the minimum MSE compared with the best ridge parameters proposed for other GLMs. Comparing all the ridge parameters of the IGRR, we observe that the performance of k10, k12, and k15 is better than that of the other shrinkage parameters.

Table 8. The estimated regression coefficients and MSE for nitrogen dioxides data.

6. Conclusion

The choice of the ridge parameter in regression modeling with correlated explanatory variables is important for sound decision making. The available literature shows that GLMs based on different exponential-family distributions have different best ridge parameters. In this article, some new ridge parameters for the IGRR model are therefore proposed to overcome the problem of multicollinearity. The proposed ridge parameters are evaluated by a simulation study and by chemometric data sets, and their performance is assessed in terms of MSE. The simulation results demonstrate that the MSE properties of the MLE and IGRR estimators are affected by factors such as multicollinearity, sample size, number of explanatory variables, and dispersion. It is shown that, under multicollinearity, the IGRR estimators outperform the ML method. On comparing the performance of our proposed ridge parameters with the best ridge parameters from other GLMs applied to the IGRR, we observe that the proposed ridge parameter k10 mostly has the minimum MSE. The chemometric data set results also support the simulation results. Based on the Monte Carlo simulation and the chemometric examples, the ridge parameter k10 is suggested as the best option when applying the IGRM with correlated explanatory variables.

Acknowledgments

The authors are grateful to anonymous referees for their valuable comments that led to improvements in the paper.

References

  • Akram, M. N., M. Amin, and M. Amanullah. 2020. Two-parameter estimator for the inverse Gaussian regression model. Communications in Statistics-Simulation and Computation 1:31. doi:10.1080/03610918.2020.1797797.
  • Akram, M. N., M. Amin, and M. Qasim. 2020. A new Liu-type estimator for the inverse Gaussian regression model. Journal of Statistical Computation and Simulation 90 (7):1153–72. doi:10.1080/00949655.2020.1718150.
  • Algamal, Z. Y. 2018a. Developing a ridge estimator for the gamma regression model. Journal of Chemometrics 32 (10):e3054. doi:10.1002/cem.3054.
  • Algamal, Z. Y. 2018b. A new method for choosing the biasing parameter in ridge estimator for generalized linear model. Chemometrics and Intelligent Laboratory Systems 183:96–101. doi:10.1016/j.chemolab.2018.10.014.
  • Algamal, Z. Y. 2018c. Shrinkage estimators for gamma regression model. Electronic Journal of Applied Statistical Analysis 11 (01):253–68.
  • Algamal, Z. Y. 2019. Performance of ridge estimator in inverse Gaussian regression model. Communications in Statistics-Theory and Methods 48 (15):3836–49.
  • Amin, M., M. Amanullah, and M. Aslam. 2016. Empirical evaluation of the inverse Gaussian regression residuals for the assessment of influential points. Journal of Chemometrics 30 (7):394–404. doi:10.1002/cem.2805.
  • Amin, M., M. Amanullah, and M. Qasim. 2020. Diagnostic techniques for the inverse Gaussian regression model. Communications in Statistics - Theory and Methods. Advance online publication. doi:10.1080/03610926.2020.1777308.
  • Amin, M., M. N. Akram, and M. Amanullah. 2020. On the James-Stein estimator for the Poisson regression model. Communications in Statistics - Simulation and Computation. Advance online publication. doi:10.1080/03610918.2020.1775851.
  • Amin, M., M. Qasim, A. Yasin, and M. Amanullah. 2020a. Almost unbiased ridge estimator in the gamma regression model. Communications in Statistics-Simulation and Computation. Advance online publication. doi:10.1080/03610918.2020.1722837.
  • Amin, M., M. Qasim, and M. Amanullah. 2019. Performance of Asar and Genç and Huang and Yang’s two-parameter estimation methods for the gamma regression model. Iranian Journal of Science and Technology, Transactions A: Science 43 (6):2951–63. doi:10.1007/s40995-019-00777-3.
  • Amin, M., M. Qasim, M. Amanullah, and S. Afzal. 2020b. Performance of some ridge estimators for the gamma regression model. Statistical Papers 61 (3):997–1026. doi:10.1007/s00362-017-0971-z.
  • Brownlee, K. A. 1965. Statistical theory and methodology in science and engineering, vol. 150, 120–31. New York: Wiley.
  • Hardin, J. W., and J. M. Hilbe. 2012. Generalized estimating equations. USA: Chapman and Hall/CRC.
  • Hoerl, A. E., and R. W. Kennard. 1970. Ridge regression: Biased estimation for nonorthogonal problems. Technometrics 12 (1):55–67. doi:10.1080/00401706.1970.10488634.
  • Hoerl, A., R. Kennard, and K. Baldwin. 1975. Ridge regression: Some simulations. Communications in Statistics - Simulation and Computation 4 (2):105–23. doi:10.1080/03610917508548342.
  • Khalaf, G., and G. Shukur. 2005. Choosing ridge parameter for regression problems. Communications in Statistics - Theory and Methods 34 (5):1177–82. doi:10.1081/STA-200056836.
  • Kibria, B. M. G. 2003. Performance of some new ridge regression estimators. Communications in Statistics - Simulation and Computation 32 (2):419–35. doi:10.1081/SAC-120017499.
  • Kibria, B. M. G., K. Månsson, and G. Shukur. 2012. Performance of some logistic ridge regression estimators. Computational Economics 40 (4):401–14. doi:10.1007/s10614-011-9275-x.
  • Kibria, B. M. G., K. Månsson, and G. Shukur. 2015. A simulation study of some biasing parameters for the ridge type estimation of Poisson regression. Communications in Statistics - Simulation and Computation 44 (4):943–57. doi:10.1080/03610918.2013.796981.
  • Kinat, S., M. Amin, and T. Mahmood. 2020. GLM-based control charts for the inverse Gaussian response variable. Quality and Reliability Engineering International 36 (2):765–83. doi:10.1002/qre.2603.
  • Månsson, K. 2012. On ridge estimators for the negative binomial regression model. Economic Modelling 29 (2):178–84. doi:10.1016/j.econmod.2011.09.009.
  • Månsson, K., and G. Shukur. 2011. A Poisson ridge regression estimator. Economic Modelling 28 (4):1475–81. doi:10.1016/j.econmod.2011.02.030.
  • McDonald, G. C., and D. I. Galarneau. 1975. A Monte Carlo evaluation of some ridge-type estimators. Journal of the American Statistical Association 70 (350):407–16. doi:10.1080/01621459.1975.10479882.
  • Muniz, G., and B. M. G. Kibria. 2009. On some ridge regression estimators: An empirical comparisons. Communications in Statistics - Simulation and Computation 38 (3):621–30. doi:10.1080/03610910802592838.
  • Naveed, K., M. Amin, S. Afzal, and M. Qasim. 2020. New shrinkage parameters for the inverse Gaussian liu regression. Communications in Statistics-Theory and Methods. Advance online publication. doi:10.1080/03610926.2020.1791339.
  • Qasim, M., B. M. G. Kibria, K. Månsson, and P. Sjölander. 2019. A new Poisson Liu regression estimator: Method and application. Journal of Applied Statistics. Advance online publication. doi:10.1080/02664763.2019.1707485.
  • Segerstedt, B. 1992. On ordinary ridge regression in generalized linear models. Communications in Statistics - Theory and Methods 21 (8):2227–46. doi:10.1080/03610929208830909.
  • Shamany, R. E., N. Z. Alobaidi, and Z. Y. Algamal. 2019. A new two-parameter estimator for the inverse Gaussian regression model with application in chemometrics. Electronic Journal of Applied Statistical Analysis 12 (02):453–64.