ABSTRACT
In quantile linear regression with ultrahigh-dimensional data, we propose an algorithm for screening all candidate variables and subsequently selecting relevant predictors. Specifically, we first employ quantile partial correlation for screening, and then we apply the extended Bayesian information criterion (EBIC) for best subset selection. Our proposed method can successfully select predictors when the variables are highly correlated, and it can also identify variables that contribute to the conditional quantiles but are marginally uncorrelated or weakly correlated with the response. Theoretical results show that the proposed algorithm can yield the sure screening set. By controlling the false selection rate, model selection consistency can be achieved theoretically. In practice, we propose using EBIC for best subset selection so that the resulting model is screening consistent. Simulation studies demonstrate that the proposed algorithm performs well, and an empirical example is presented. Supplementary materials for this article are available online.
Supplementary Materials
In the supplementary materials, we provide the proofs for Lemmas A.1–A.5, and present additional simulation results for Examples 1–3.
Acknowledgments
We are grateful to the editor, the associate editor, and three anonymous reviewers for their constructive comments that helped us improve the article substantially.
Funding
Ma’s research was supported by NSF grant DMS 1306972 and a Hellman Fellowship. Li’s research was supported by NSF grant DMS 1512422 and NIDA, NIH grants P50 DA036107 and P50 DA039838. The content is solely the responsibility of the authors and does not necessarily represent the official views of NSF, NIDA, or NIH.