Abstract.
While many frequentist model averaging methods have been proposed, existing weight selection criteria for generalized linear models (GLMs) are usually based on a Kullback-Leibler (KL) loss penalized by model size, or simply on cross-validation. In this article, when the data are generated from an exponential family distribution, we propose a novel model averaging approach for GLMs motivated by an asymptotically unbiased estimator of the KL loss penalized by an "effective model size" that accounts for model misspecification. When all the candidate models are misspecified, the proposed method achieves asymptotic optimality while allowing both the number of candidate models and the dimension of the covariates to diverge. Furthermore, when correct models are included in the candidate model set, we prove that the weights of the wrong candidate models converge to zero, and hence the weighted regression coefficient estimator is consistent. Simulation studies and two real-data examples demonstrate the advantage of our new method over existing frequentist model averaging methods.
Acknowledgments
We would like to thank the Editor (Esfandiar Maasoumi), the Associate Editor, and two anonymous referees for their constructive comments. We also thank the participants of the "2022 Conference on the Frontiers of Model Averaging and Prediction Theory" for their discussions and suggestions.