Abstract
Sufficient dimension reduction (SDR) methods, which produce informative combinations of predictors, or indices, are particularly useful for high-dimensional regression analysis. In many such analyses, a priori subject knowledge of the predictors is increasingly available; for example, the predictors may belong to different groups. While many recent SDR proposals have greatly expanded the scope of the methods’ applicability, how to effectively incorporate prior information on predictor structure remains a challenge. In this article, we aim at dimension reduction that recovers full regression information while preserving the predictor group structure. Building upon a new concept, the direct sum envelope, we introduce a systematic way to incorporate group information into most existing SDR estimators. As a result, the reduction outcomes are much easier to interpret. Moreover, the envelope method provides a principled way to build a variety of prior structures into dimension reduction analysis. Both simulations and real data analysis demonstrate the competent numerical performance of the new method.
Additional information
Notes on contributors
Zifang Guo
Zifang Guo, Merck & Co., Inc, 351 N Sumneytown Pike, North Wales, PA, 19454 (E-mail: [email protected]). Lexin Li, Division of Biostatistics, University of California, Berkeley, 344B Li Ka Shing Center, MC 3370, Berkeley, CA 94720 (E-mail: [email protected]). Wenbin Lu, Department of Statistics, North Carolina State University, 5214 SAS Hall, Raleigh, NC 27695 (E-mail: [email protected]). Bing Li, Department of Statistics, Pennsylvania State University, 326 Thomas Building, University Park, PA 16802 (E-mail: [email protected]). The authors are grateful to three referees and an Associate Editor for their many useful comments and suggestions, which have helped to greatly improve on an earlier manuscript. We also thank Dr. Brian J. Reich for helpful discussions. Lexin Li's research was partially supported by NSF grants DMS-1106668 and DMS-1310319. Wenbin Lu's research was supported by NIH/NCI grant R01 CA140632. Bing Li's research was supported in part by NSF grants DMS-1106815 and DMS-1407537.
Lexin Li
Wenbin Lu
Bing Li