Abstract
The ordinary least squares (OLS) estimator suffers a breakdown in the presence of multicollinearity: the estimator remains unbiased, but its variance is inflated. In this study, we propose an unbiased modified ridge-type estimator as an alternative to the OLS estimator and to the biased estimators for handling multicollinearity in linear regression models, and we derive its properties. The new estimator is unbiased with minimum variance. A real-life application to the higher heating value of poultry waste from proximate analysis and a simulation study generally support the findings.
1. Introduction
Consider the linear regression model
(1) y = Xβ + ε, ε ∼ N(0, σ²I),
where y is an n × 1 vector of the dependent variable, X is a known n × p full rank matrix of explanatory variables, β is a p × 1 vector of regression coefficients, ε is the error vector and I is an identity matrix. The ordinary least squares (OLS) estimator of β in model (1) is defined as:
(2) β̂ = (X′X)⁻¹X′y
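As a minimal illustration (not part of the original study), the OLS estimator of Eq. (2) can be computed directly with numpy; the data here are simulated.

```python
import numpy as np

# Simulate a small dataset and compute the OLS estimator (X'X)^{-1} X'y.
rng = np.random.default_rng(0)
n, p = 30, 3
X = rng.normal(size=(n, p))                    # full-rank design matrix
beta = np.array([1.0, 2.0, -1.0])              # true coefficients
y = X @ beta + rng.normal(scale=0.5, size=n)   # y = X beta + error

beta_ols = np.linalg.solve(X.T @ X, X.T @ y)   # Eq. (2)
print(beta_ols)
```

Solving the normal equations with `solve` avoids forming the explicit inverse, which is numerically preferable when X′X is nearly singular.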
This estimator is the most widely used method for estimating the parameters of a linear regression model. It performs best when certain assumptions are satisfied, one of which is that the independent variables are uncorrelated. In practice, however, there often exist strong or perfect linear relationships among the independent variables. This situation is called multicollinearity. The OLS estimator suffers a breakdown in the presence of multicollinearity: the estimator is still unbiased but its variance is inflated (Ayinde, Lukman, Samuel, & Attah, 2018). Different approaches are available in the literature to handle this problem. These include Hoerl and Kennard (1970), Swindel (1976), Farebrother (1976), Liu (1993), Sakallioglu and Akdeniz (2003), Ozkale and Kaciranlar (2007), Yang and Chang (2010), Li and Yang (2012), Wu and Yang (2013), Wu (2014) and, more recently, Arumairajan and Wijekoon (2017), Ayinde et al. (2018) and Lukman, Ayinde, Binuomote, and Onate (2019). The estimators proposed by these authors are biased. Crouse, Jin, and Hanumara (1995) and Sakallioglu and Akdeniz (2003) proposed unbiased versions of the ridge estimator and the Liu estimator, respectively, with the addition of prior information. These methods effectively handle the problem of multicollinearity and eliminate bias.
In this article, we propose an unbiased modified ridge-type estimator (UMRT) with prior information and derive its properties. Furthermore, we discuss the performance of the proposed estimator relative to the OLS estimator, the ridge estimator (RE) and the modified ridge-type estimator (MRT) using the mean square error matrix (MSEM) criterion.
The rest of this article is organized as follows. In Section 2, we propose the unbiased modified ridge-type estimator; in Section 3, we compare its performance with that of some existing estimators using the mean square error matrix (MSEM) criterion. We estimate the biasing parameters k and d in Section 4. We conduct a simulation study and a real-life data application in Section 5. Finally, we provide some concluding remarks in Section 6.
2. Unbiased modified ridge-type estimator with prior information
Hoerl and Kennard (1970) defined the ridge estimator of β as:
(3) β̂(k) = (X′X + kI)⁻¹X′y,
where k > 0 is the biasing parameter.
Swindel (1976) defined the ridge estimator with prior information b as:
(4) β̂(k, b) = (X′X + kI)⁻¹(X′y + kb)
Crouse et al. (1995) introduced the unbiased ridge estimator based on the ridge estimator and prior information J. This is defined as
(5) β̂(k, J) = (X′X + kI)⁻¹(X′y + kJ),
where J and β̂ are uncorrelated and J ∼ N(β, V) such that V = (σ²/k)I_p and I_p is the p × p identity matrix. J is estimated from the available prior information.
Lukman et al. (2019) proposed the modified ridge-type estimator, which is defined as follows:
(6) β̂_MRT = (X′X + k(1 + d)I)⁻¹X′y,
where k > 0 and 0 < d < 1.
Consider the following convex estimator:
(7) β̂_c = Cβ̂ + (I − C)J,
where C is a p × p matrix and I is a p × p identity matrix. Consequently, the mean square error of β̂_c is
(8) MSE(β̂_c) = σ²C(X′X)⁻¹C′ + (I − C)V(I − C)′
Then,
(9) ∂MSE(β̂_c)/∂C = 2σ²C(X′X)⁻¹ − 2(I − C)V = 0
From (9), C is obtained to be C = V[σ²(X′X)⁻¹ + V]⁻¹. Accordingly, the convex estimator has minimum MSE for this optimal value of C, and it is an unbiased estimator of β. Therefore, the new estimator in this study is defined as
(10) β̂_UMRT = (X′X + k(1 + d)I)⁻¹(X′y + k(1 + d)J),
where V = (σ²/(k(1 + d)))I_p; then the value of C is C = X′X(X′X + k(1 + d)I)⁻¹. Consequently, (10) is defined for k > 0, 0 < d < 1.
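A sketch (my own illustration, not code from the paper) of the estimators in Eqs. (2), (3), (6) and (10); the prior vector J is an assumption of the example. Note that when J is taken to be the OLS estimate itself, the UMRT estimate collapses back to OLS.

```python
import numpy as np

def ols(X, y):
    # Eq. (2): ordinary least squares
    return np.linalg.solve(X.T @ X, X.T @ y)

def ridge(X, y, k):
    # Eq. (3): ridge estimator with biasing parameter k
    return np.linalg.solve(X.T @ X + k * np.eye(X.shape[1]), X.T @ y)

def mrt(X, y, k, d):
    # Eq. (6): modified ridge-type estimator
    return np.linalg.solve(X.T @ X + k * (1 + d) * np.eye(X.shape[1]), X.T @ y)

def umrt(X, y, k, d, J):
    # Eq. (10): unbiased modified ridge-type estimator with prior vector J
    c = k * (1 + d)
    return np.linalg.solve(X.T @ X + c * np.eye(X.shape[1]), X.T @ y + c * J)

rng = np.random.default_rng(1)
X = rng.normal(size=(25, 3))
y = X @ np.array([0.6, 0.6, 0.5]) + rng.normal(size=25)
b_ols = ols(X, y)
# With J set to the OLS estimate, UMRT reproduces OLS exactly, since
# (X'X + cI)^{-1}(X'y + c b_ols) = (X'X + cI)^{-1}(X'X + cI) b_ols.
print(np.allclose(umrt(X, y, 0.5, 0.5, b_ols), b_ols))
```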
It is easy to show that β̂_UMRT is an unbiased estimator of β. The expectation vector, bias vector and dispersion matrix of the proposed estimator are:
(11) E(β̂_UMRT) = β
(12) Bias(β̂_UMRT) = E(β̂_UMRT) − β = 0
(13) D(β̂_UMRT) = σ²(X′X + k(1 + d)I)⁻¹
Since Bias(β̂_UMRT) = 0, then
(14) MSEM(β̂_UMRT) = D(β̂_UMRT) = σ²(X′X + k(1 + d)I)⁻¹
Consequently, the estimator is an unbiased estimator of β.
Suppose there exists an orthogonal matrix Q such that Q′X′XQ = Λ = diag(λ₁, …, λ_p), where λᵢ is the ith eigenvalue of X′X and Λ and Q are the matrices of eigenvalues and eigenvectors of X′X, respectively. Model (1) can be written in canonical form as:
(15) y = Zα + ε,
where Z = XQ and α = Q′β. For model (15), we get the following representations:
(16) α̂ = Λ⁻¹Z′y
(17) α̂(k) = (Λ + kI)⁻¹Z′y
(18) α̂_MRT = (Λ + k(1 + d)I)⁻¹Z′y
(19) α̂_UMRT = (Λ + k(1 + d)I)⁻¹(Z′y + k(1 + d)J)
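The canonical reduction can be verified numerically; this sketch (illustrative, simulated data) uses `numpy.linalg.eigh` to obtain Q and Λ.

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(20, 3))
lam, Q = np.linalg.eigh(X.T @ X)   # eigenvalues λ_i and orthogonal Q
Z = X @ Q                          # canonical regressors
# Z'Z equals the diagonal matrix Λ, so the transformed model y = Zα + ε
# has orthogonal regressor columns.
print(np.allclose(Z.T @ Z, np.diag(lam)))
```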
Lemma 2.1.
Let M be an n × n positive definite matrix, that is M > 0, and b some vector; then M − bb′ ≥ 0 if and only if b′M⁻¹b ≤ 1 (Farebrother, 1976).
Lemma 2.2.
Let β̂₁ = A₁y and β̂₂ = A₂y be two linear estimators of β. Suppose that D = Cov(β̂₁) − Cov(β̂₂) > 0, where Cov(β̂ⱼ) denotes the covariance matrix of β̂ⱼ and bⱼ = Bias(β̂ⱼ), j = 1, 2. Consequently,
(20) Δ(β̂₁, β̂₂) = MSEM(β̂₁) − MSEM(β̂₂) ≥ 0
if and only if b₂′[D + b₁b₁′]⁻¹b₂ ≤ 1, where MSEM(β̂ⱼ) = Cov(β̂ⱼ) + bⱼbⱼ′ (Trenkler & Toutenburg, 1990).
3. Theoretical Comparisons
3.1. Comparison of the OLS estimator and the unbiased modified ridge-type estimator
Theorem 3.1.
The unbiased modified ridge-type estimator is superior to the OLS estimator in the mean square error sense for k > 0 and 0 < d < 1.
Proof.
By definition,
(21) MSEM(α̂) = σ²Λ⁻¹
The MSEM difference between Eqs. (14) and (21) is
(22) MSEM(α̂) − MSEM(α̂_UMRT) = σ²[Λ⁻¹ − (Λ + k(1 + d)I)⁻¹] = σ² diag{k(1 + d)/(λᵢ(λᵢ + k(1 + d)))}
The difference in (22) will be positive definite if and only if k(1 + d) > 0. However, for k > 0 and 0 < d < 1, k(1 + d) > 0 always holds, so the difference is positive definite. By Lemma 2.2, the proof is completed.
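Theorem 3.1 can be checked numerically: the MSEM difference σ²[Λ⁻¹ − (Λ + k(1+d)I)⁻¹] of Eq. (22) is a diagonal matrix with entries σ²k(1+d)/(λᵢ(λᵢ + k(1+d))), all positive for k > 0 and 0 < d < 1. A quick sketch with assumed eigenvalues:

```python
import numpy as np

sigma2, k, d = 1.0, 0.8, 0.4
lam = np.array([0.02, 1.5, 60.0])              # assumed eigenvalues of X'X
c = k * (1 + d)
diff = sigma2 * (1.0 / lam - 1.0 / (lam + c))  # diagonal of Eq. (22)
print(diff, (diff > 0).all())
```

The small eigenvalue (0.02, the collinear direction) contributes by far the largest entry, which is exactly where the OLS variance is inflated.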
3.2. Comparison of ridge estimator and the unbiased modified ridge-type estimator
From the representation in (17), the mean square error matrix of the ridge estimator is
(23) MSEM(α̂(k)) = σ²(Λ + kI)⁻¹Λ(Λ + kI)⁻¹ + k²(Λ + kI)⁻¹αα′(Λ + kI)⁻¹
(24) b₁ = Bias(α̂(k)) = −k(Λ + kI)⁻¹α
where b₁ is the bias vector of the ridge estimator. The difference between α̂(k) and α̂_UMRT in terms of the MSEM is
(25) Δ₁ = MSEM(α̂(k)) − MSEM(α̂_UMRT)
Let k > 0, 0 < d < 1; thus, we have the following theorem.
Theorem 3.2.
Let us consider the two estimators α̂(k) and α̂_UMRT. If k > 0 and 0 < d < 1, the estimator α̂_UMRT is superior to the estimator α̂(k) in the MSEM sense if and only if k²(Λ + kI)⁻¹αα′(Λ + kI)⁻¹ ≥ σ²[(Λ + k(1 + d)I)⁻¹ − (Λ + kI)⁻¹Λ(Λ + kI)⁻¹].
Proof:
The difference between Eqs. (14) and (23) is
(26) Δ₁ = σ²[(Λ + kI)⁻¹Λ(Λ + kI)⁻¹ − (Λ + k(1 + d)I)⁻¹] + k²(Λ + kI)⁻¹αα′(Λ + kI)⁻¹
We observe that Δ₁ will be positive definite if and only if k²(Λ + kI)⁻¹αα′(Λ + kI)⁻¹ > σ²[(Λ + k(1 + d)I)⁻¹ − (Λ + kI)⁻¹Λ(Λ + kI)⁻¹], where k > 0 and 0 < d < 1.
3.3. Comparison of modified ridge-type estimator and unbiased modified ridge-type estimator
From the representation in (18), the dispersion matrix and MSEM of the modified ridge-type estimator are defined as follows:
(27) D(α̂_MRT) = σ²(Λ + k(1 + d)I)⁻¹Λ(Λ + k(1 + d)I)⁻¹
(28) MSEM(α̂_MRT) = σ²(Λ + k(1 + d)I)⁻¹Λ(Λ + k(1 + d)I)⁻¹ + k²(1 + d)²(Λ + k(1 + d)I)⁻¹αα′(Λ + k(1 + d)I)⁻¹
where b₂ = Bias(α̂_MRT) = −k(1 + d)(Λ + k(1 + d)I)⁻¹α.
Theorem 3.3.
The unbiased modified ridge type estimator always dominates the modified ridge type estimator in the MSEM sense for k > 0 and 0 < d < 1.
Proof. The difference between Eqs. (14) and (28) is
(29) Δ₂ = MSEM(α̂_MRT) − MSEM(α̂_UMRT)
Therefore, Δ₂ is a non-negative definite matrix for k > 0 and 0 < d < 1. The proof of Theorem 3.3 is completed.
4. Estimation of the biasing parameters k and d
In this section, we discuss the estimation of the biasing parameters k and d.
4.1. The estimation of parameter d
In the definition of the new estimator, J and β̂ are uncorrelated. Therefore,
(30)
From (30), if σ² is known for a fixed k, we can get an unbiased estimator of d as follows:
(31)
When σ² is unknown, s² is used as an estimate of σ²:
(32)
Consequently,
(33)
where λᵢ is the ith eigenvalue of X′X. It was observed that the estimator of d in (33) can return a negative value. To eliminate the negative value, Wu (2014) suggests replacing it with one (1) when its estimate is negative. In this study, when d in Eq. (33) is negative, we adopt the estimator of d suggested by Ozkale and Kaciranlar (2007) as follows:
(34)
4.2. Estimating the biasing parameter k
From Eq. (30), if σ² is known and d is assumed to be fixed, an unbiased estimate of k is defined as follows:
(35)
When the estimate in (35) is negative, k is estimated as follows:
(36)
5. Numerical example and Monte–Carlo simulation
5.1. Application to poultry waste data
The theoretical results are illustrated with real-life data analyzed in the study of Qian, Lee, Soto, and Chen (2018). A total of 48 samples of poultry waste were collected from different published open literature reviews to form a database for the derivation, evaluation and validation of proximate-based higher heating value (HHV) models. Six samples (#43, 44, 45, 46, 47 and 48) were deleted due to incomplete information. The linear regression model is:
(37) HHV = β₀ + β₁FC + β₂VM + β₃A + ε,
where HHV denotes the higher heating value, FC the fixed carbon, VM the volatile matter, A the ash content, and ε is the random error term, which is expected to be normally distributed. The relationships among the variables were obtained from the correlation matrix as follows.
From the correlation matrix, there is a strong positive relationship between the higher heating value and fixed carbon, while negative relationships exist between HHV and VM and between HHV and Ash. To identify the distribution of the error term, we used the Jarque-Bera (JB) test. The test statistic and the corresponding p value are JB = 0.6409 and p value = .7258, respectively. Since this p value is larger than conventional significance levels, we conclude that the error term follows the normal distribution. We diagnosed the model for the possible presence of multicollinearity. The variance inflation factor (VIF) values are VIF_FC = 997.819, VIF_VM = 2163.504, VIF_ASH = 1533.782. The literature shows that a model suffers from multicollinearity when VIFᵢ > 10. Since the VIF values in the above model are all higher than 10, we conclude that the model suffers from severe multicollinearity. Alternatively, we can use the condition number (CN) to examine whether the explanatory variables are related, where CN = λ_max/λ_min. If CN is between 100 and 1000 there is moderate to strong multicollinearity, and if it exceeds 1000 there is severe multicollinearity (Arumairajan & Wijekoon, 2017; Gujarati, 1995). The condition number is 581291.39, which indicates the presence of severe multicollinearity. Therefore, it is appropriate to predict the higher heating value with an alternative unbiased estimator possessing minimum variance. We adopt K-fold cross-validation to validate the performances of the estimators. The data are partitioned into K equal-sized folds (K = 10 in this study). Of these K folds, one fold is treated as the test set and the remaining K − 1 (9) folds as the training set. The MSE is computed on the observations in the held-out fold. The process is repeated ten times, holding out a different fold each time. The validation test error is obtained by averaging the K estimates of the test error, giving an estimated validation (test) error rate for new observations.
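The VIF and condition-number diagnostics applied above can be sketched with numpy on simulated collinear data (the poultry-waste values themselves are not reproduced here); the VIFs are computed as the diagonal of the inverse correlation matrix.

```python
import numpy as np

rng = np.random.default_rng(3)
common = rng.normal(size=(42, 1))                # shared factor -> collinearity
X = 0.1 * rng.normal(size=(42, 3)) + common

R = np.corrcoef(X, rowvar=False)
vif = np.diag(np.linalg.inv(R))                  # VIF_i = [R^{-1}]_{ii}
eigvals = np.linalg.eigvalsh(X.T @ X)
cn = eigvals.max() / eigvals.min()               # CN = lambda_max / lambda_min
print(vif, cn)
```

On this simulated design all three VIFs exceed 10 and the condition number is large, mirroring the severe multicollinearity diagnosed in the poultry-waste model.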
The estimator with the lowest validation MSE is the best. The average MSE of the validation error in this study is defined as:
(38) CV(K) = (1/K) Σₖ MSEₖ, with MSEₖ = (1/nₖ) Σ_{i∈fold k} (yᵢ − ŷᵢ)²,
where nₖ is the number of subsamples in each fold and ŷᵢ is the fitted value for observation i, obtained from the data with fold k removed. The results are presented in the accompanying table.
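A plain-numpy sketch of the 10-fold scheme in Eq. (38), using OLS as the example estimator (illustrative code, not from the paper):

```python
import numpy as np

def cv_mse(X, y, fit, K=10, seed=0):
    # Partition the indices into K folds, hold each fold out once, and
    # average the held-out mean squared errors (Eq. (38)).
    idx = np.random.default_rng(seed).permutation(len(y))
    folds = np.array_split(idx, K)
    errs = []
    for k in range(K):
        test = folds[k]
        train = np.concatenate([folds[j] for j in range(K) if j != k])
        beta = fit(X[train], y[train])
        errs.append(np.mean((y[test] - X[test] @ beta) ** 2))
    return float(np.mean(errs))

ols = lambda X, y: np.linalg.solve(X.T @ X, X.T @ y)

rng = np.random.default_rng(4)
X = rng.normal(size=(42, 3))
y = X @ np.array([1.0, -1.0, 0.5]) + rng.normal(scale=0.5, size=42)
print(cv_mse(X, y, ols))
```

Swapping `fit` for the ridge, MRT or UMRT fitting function gives the validation errors compared in the study.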
The results show that the unbiased modified ridge-type estimator (UMRT) produces the same estimates as the OLS estimator, while circumventing the problem of large variance that is peculiar to the OLS estimator. The proposed estimator has the smallest mean square error and the smallest prediction error.
5.2. Monte–Carlo simulation
We carried out a Monte-Carlo simulation to investigate the performances of these estimators. The explanatory variables were generated in line with the studies of McDonald and Galarneau (1975), Liu (1993) and Lukman and Ayinde (2017). This is defined as:
(39) x_ij = (1 − ρ²)^(1/2) z_ij + ρ z_i(p+1), i = 1, 2, …, n, j = 1, 2, …, p,
where z_ij are independent standard normal pseudo-random numbers with mean zero and unit variance, ρ² is the correlation between any two explanatory variables and p is the number of explanatory variables. The values of ρ² were taken as 0.85, 0.95 and 0.99, respectively. In this study, the number of explanatory variables (p) was taken to be three and six.
The response variable is defined as:
(40) yᵢ = β₁x_i1 + β₂x_i2 + … + β_p x_ip + eᵢ,
where eᵢ ∼ N(0, σ²). The values of β were chosen such that β′β = 1 (Newhouse & Oman, 1971). The sample sizes used are 30 and 50, with two different values of σ: 1 and 5. The experiment is repeated 1000 times. The estimated MSE is calculated as
(41) MSE(β̂) = (1/1000) Σ_{j=1}^{1000} Σ_{i=1}^{p} (β̂_ij − βᵢ)²,
where β̂_ij denotes the estimate of the ith parameter in the jth replication and βᵢ is the true parameter value. The estimated MSEs of the estimators were examined for different values of n, k, d, σ and ρ². The following observations were made:
The unbiased estimator is superior to OLS in all cases; the OLS estimator performs worst when there is multicollinearity.
Also, the unbiased estimator consistently outperforms the ridge and modified ridge-type estimators, even though the ridge and modified ridge-type estimators themselves dominate OLS in all cases.
When the sample size increases, the MSE decreases, even when the correlation between the explanatory variables increases.
As sample sizes remain constant, increasing the value of σ increases the mean square errors of each of the estimators.
As the number of explanatory variables increases, the mean squared error of all the estimators increases for a given level of multicollinearity and σ.
Generally, we confirm the superiority of the unbiased estimator over the other estimators at different levels of multicollinearity and error variance. The performance of the modified ridge-type estimator dominates that of the ridge estimator and OLS.
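The simulation design above can be sketched compactly; this is an illustrative re-implementation under the stated design (collinear regressors as in Eq. (39), β′β = 1, fewer replications than the study's 1000 for brevity), shown here for OLS versus ridge.

```python
import numpy as np

def make_X(n, p, rho, rng):
    # Eq. (39): any two columns have correlation rho**2.
    z = rng.normal(size=(n, p + 1))
    return np.sqrt(1 - rho**2) * z[:, :p] + rho * z[:, [p]]

def mc_mse(estimator, n=30, p=3, rho=0.99, sigma=1.0, reps=200, seed=5):
    rng = np.random.default_rng(seed)
    beta = np.ones(p) / np.sqrt(p)         # chosen so that beta'beta = 1
    total = 0.0
    for _ in range(reps):
        X = make_X(n, p, rho, rng)
        y = X @ beta + rng.normal(scale=sigma, size=n)   # Eq. (40)
        b = estimator(X, y)
        total += np.sum((b - beta) ** 2)
    return total / reps                    # averaged squared error, Eq. (41)

ols = lambda X, y: np.linalg.solve(X.T @ X, X.T @ y)
ridge = lambda X, y: np.linalg.solve(X.T @ X + 0.5 * np.eye(X.shape[1]), X.T @ y)
# Under strong collinearity the ridge MSE falls below the OLS MSE.
print(mc_mse(ols), mc_mse(ridge))
```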
6. Conclusion
The OLS estimator suffers a breakdown in the presence of multicollinearity: it remains unbiased but its variance is inflated. An alternative estimator, called the unbiased modified ridge-type estimator with prior information, was proposed in this study. This estimator was proved theoretically to be unbiased and to possess minimum variance. A simulation study and a real-life application were also conducted to establish the superiority of this estimator over the existing estimators in terms of the MSEM criterion and the cross-validation prediction error. The performance of this new estimator is better than that of the OLS and ridge estimators for all degrees of multicollinearity. The estimator circumvents the problem of inflated variance that afflicts the OLS estimator. Finally, this estimator should be adopted as a replacement for the OLS estimator and the biased estimators when there is multicollinearity in a linear model.
Acknowledgements
The authors are grateful to the anonymous reviewers for their valuable comments and suggestions, which certainly improved the quality and presentation of this article.
Disclosure statement
No potential conflict of interest was reported by the authors.
References
- Arumairajan, S., & Wijekoon, P. (2017). Modified almost unbiased Liu estimator in linear regression model. Communications in Mathematics and Statistics, 5, 261–276. doi:10.1007/s40304-017-0111-z
- Ayinde, K., Lukman, A. F., Samuel, O. O., & Attah, O. M. (2018). Some new adjusted ridge estimators of linear regression model. International Journal of Civil Engineering and Technology, 11, 2838–2852.
- Crouse, R. H., Jin, C., & Hanumara, R. C. (1995). Unbiased ridge estimation with prior information and ridge trace. Communications in Statistics—Theory and Methods, 24, 2341–2354. doi:10.1080/03610929508831620
- Farebrother, R. W. (1976). Further results on the mean square error of ridge regression. Journal of the Royal Statistical Society: Series B (Methodological), 38, 248–250. doi:10.1111/j.2517-6161.1976.tb01588.x
- Gujarati, D. N. (1995). Basic econometrics. New York, NY: McGraw-Hill.
- Hoerl, A. E., & Kennard, R. W. (1970). Ridge regression: Biased estimation for nonorthogonal problems. Technometrics, 12(1), 55–67. doi:10.1080/00401706.1970.10488634
- Li, Y., & Yang, H. (2012). A new Liu-type estimator in linear regression model. Statistical Papers, 53, 427–437. doi:10.1007/s00362-010-0349-y
- Liu, K. (1993). A new class of biased estimate in linear regression. Communications in Statistics - Theory and Methods, 22, 393–402.
- Lukman, A. F., & Ayinde, K. (2017). Review and classifications of the ridge parameter estimation techniques. Hacettepe Journal of Mathematics and Statistics, 46, 953–967. doi:10.15672/HJMS.201815671
- Lukman, A. F., Ayinde, K., Binuomote, S., & Onate, A. C. (2019). Modified ridge-type estimator to combat multicollinearity: Application to chemical data. Journal of Chemometrics, e3125. doi:10.1002/cem.3125
- McDonald, M. C., & Galarneau, D. I. (1975). A Monte Carlo evaluation of some ridge-type estimators. Journal of the American Statistical Association, 70, 407–416. doi:10.2307/2285832
- Newhouse, J. P., & Oman, S. D. (1971). An evaluation of ridge estimators. Rand Report, 1–28. R-716-PR.
- Ozkale, M. R., & Kaciranlar, S. (2007). The restricted and unrestricted two-parameter estimators. Communications in Statistics - Theory and Methods, 36, 2707–2725.
- Qian, X., Lee, S., Soto, A., & Chen, G. (2018). Regression model to predict the higher heating value of poultry waste from proximate analysis. Resources, 7, 39. doi:10.3390/resources7030039
- Sakallioglu, S., & Akdeniz, F. (2003). Unbiased Liu estimation with prior information. International Journal of Mathematical Sciences, 2(1), 205–217.
- Swindel, F. F. (1976). Good ridge estimators based on prior information. Communications in Statistics - Theory and Methods, 11, 1065–1075. doi:10.1080/03610927608827423
- Trenkler, G., & Toutenburg, H. (1990). Mean squared error matrix comparisons between biased estimators an overview of recent results. Statistical Papers, 31(1), 165–179. doi:10.1007/BF02924687
- Wu, J. (2014). An unbiased two-parameter estimation with prior information in linear regression model. The Scientific World Journal, 2014, 1–8. doi:10.1155/2014/206943
- Wu, J., & Yang, H. (2013). Efficiency of an almost unbiased two-parameter estimator in linear regression model. Statistics, 47, 535–545. doi:10.1080/02331888.2011.605891
- Yang, H., & Chang, X. (2010). A new two-parameter estimator in linear regression. Communications in Statistics - Theory and Methods, 39, 923–934. doi:10.1080/03610920902807911