Abstract
Shrinkage estimates of small domain parameters typically combine a noisy “direct” estimate, which uses only data from a specific small domain, with a more stable regression estimate. When the regression model is misspecified, estimation performance for the noisier domains can suffer due to substantial shrinkage toward a poorly estimated regression surface. In this article, we introduce a new class of robust, empirically driven regression weights that target estimation of the small domain means under potential misspecification of the global regression model. Our regression weights are a convex combination of the model-based weights associated with the best linear unbiased predictor (BLUP) and those associated with the observed best predictor (OBP). The mixing parameter in this convex combination is found by minimizing a novel, unbiased estimate of the mean-squared prediction error (MSPE) for the small domain means, and we label the associated small domain estimates the “compromise best predictor,” or CBP. Using a data-adaptive mixture for the regression weights enables the CBP to preserve the robustness of the OBP while retaining the main advantages of the empirical BLUP (EBLUP) whenever the regression model is correct. We demonstrate the use of the CBP in an application estimating gait speed in older adults. Supplementary materials for this article are available online.
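The idea of mixing two sets of regression weights and choosing the mixture by minimizing an unbiased risk estimate can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the article or from the shrinkcbp package: the sketch assumes an area-level model y_i = theta_i + e_i with known sampling variances D_i and a known model variance A, takes the BLUP-type weights to be proportional to 1/(A + D_i) and the OBP-type weights to be proportional to (D_i/(A + D_i))^2, normalizes both to sum to one, and uses a generic unbiased risk estimate for linear predictors in place of the article's MSPE estimator. The function name `compromise_fit` is hypothetical.

```python
import numpy as np

def compromise_fit(y, X, D, A, alphas=None):
    """Grid-search a mixing parameter alpha in [0, 1] for compromise weights.

    Illustrative sketch only: A (model variance) and D (sampling variances)
    are treated as known, and the risk criterion is a generic unbiased risk
    estimate for linear predictors, not the article's estimator.
    """
    y, X, D = np.asarray(y, float), np.asarray(X, float), np.asarray(D, float)
    if alphas is None:
        alphas = np.linspace(0.0, 1.0, 101)

    B = D / (A + D)                    # shrinkage factor for each domain
    w_blup = 1.0 / (A + D)             # assumed BLUP-type regression weights
    w_obp = B ** 2                     # assumed OBP-type regression weights
    w_blup = w_blup / w_blup.sum()     # put both weight sets on the same scale
    w_obp = w_obp / w_obp.sum()

    best = None
    for alpha in alphas:
        w = alpha * w_blup + (1.0 - alpha) * w_obp   # convex combination
        XtW = X.T * w                                # p x n, columns scaled by w
        P = X @ np.linalg.solve(XtW @ X, XtW)        # fitted values = P @ y
        H = np.diag(1.0 - B) + B[:, None] * P        # predictor: theta_hat = H @ y
        theta_hat = H @ y
        # Unbiased estimate of sum_i E(theta_hat_i - theta_i)^2 when H is fixed
        # and the errors e_i are independent with mean 0 and variance D_i.
        risk = np.sum((y - theta_hat) ** 2) + 2.0 * np.sum(D * np.diag(H)) - np.sum(D)
        if best is None or risk < best[2]:
            best = (float(alpha), theta_hat, float(risk))
    return best

# Usage on simulated data with a deliberately misspecified linear model.
rng = np.random.default_rng(0)
n = 40
x = rng.uniform(-1.0, 1.0, n)
X = np.column_stack([np.ones(n), x])       # working model is linear in x
D = rng.uniform(0.2, 2.0, n)               # known sampling variances
theta = 1.0 + x ** 2 + rng.normal(0.0, 0.3, n)   # true means are not linear in x
y = theta + rng.normal(0.0, np.sqrt(D))
alpha_hat, theta_hat, risk_hat = compromise_fit(y, X, D, A=0.1)
print(alpha_hat, risk_hat)
```

In this sketch alpha = 1 corresponds to using only the BLUP-type weights and alpha = 0 to using only the OBP-type weights, so the selected value indicates which end of the mixture the data favor under the stated assumptions.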
Supplementary Materials
Appendix: Section A contains proofs of Proposition 1 and Theorems 1–2. Section B contains proofs of Theorems 3–4. Section C shows a derivation for the general unbiased MSPE estimator of Section 2.4 and a derivation for the unbiased estimator of the population mean-MSPE of Section 3. Section D contains a few additional derivations, and Section E contains values of the stratum-specific estimates from the gait speed application. Section F describes an unbiased estimator of the MSPE that can be used to find compromise regression weights in the context of the nested-error regression model. (PDF file)
Replication files: This zip file contains the R code needed to reproduce the simulation results described in Section 5 and the R code used for the application described in Section 6. (Zip file)
R package: The R package shrinkcbp implements the methods discussed in this article. (GNU zipped tar file) This R package may also be retrieved from https://github.com/nchenderson/shrinkcbp.