Theory and Methods

A Model-free Variable Screening Method Based on Leverage Score

Pages 135-146 | Received 25 Dec 2019, Accepted 05 Apr 2021, Published online: 21 Jun 2021

References

  • Adams, B. D., Claffey, K. P., and White, B. A. (2009), “Argonaute-2 Expression is Regulated by Epidermal Growth Factor Receptor and Mitogen-Activated Protein Kinase Signaling and Correlates With a Transformed Phenotype in Breast Cancer Cells,” Endocrinology, 150, 14–23. DOI: 10.1210/en.2008-0984.
  • Allen-Zhu, Z., and Li, Y. (2016), “LazySVD: Even Faster SVD Decomposition Yet Without Agonizing Pain,” in Advances in Neural Information Processing Systems, eds. D. Lee, M. Sugiyama, U. Luxburg, I. Guyon, and R. Garnett, Curran Associates, Inc., pp. 974–982. https://proceedings.neurips.cc/paper/2016/file/c6e19e830859f2cb9f7c8f8cacb8d2a6-Paper.pdf
  • Antonyak, M. A., Miller, A. M., Jansen, J. M., Boehm, J. E., Balkman, C. E., Wakshlag, J. J., Page, R. L., and Cerione, R. A. (2004), “Augmentation of Tissue Transglutaminase Expression and Activation by Epidermal Growth Factor Inhibit Doxorubicin-induced Apoptosis in Human Breast Cancer Cells,” Journal of Biological Chemistry, 279, 41461–41467. DOI: 10.1074/jbc.M404976200.
  • Breiman, L. (1995), “Better Subset Regression Using the Nonnegative Garrote,” Technometrics, 37, 373–384. DOI: 10.1080/00401706.1995.10484371.
  • Chen, J., and Chen, Z. (2008), “Extended Bayesian Information Criteria for Model Selection With Large Model Spaces,” Biometrika, 95, 759–771. DOI: 10.1093/biomet/asn034.
  • Cook, D. (1995), “An Introduction to Regression Graphics,” Journal of the American Statistical Association, 90, 1126–1128. DOI: 10.2307/2291355.
  • Cook, R. D. (1996), “Graphics for Regressions With a Binary Response,” Journal of the American Statistical Association, 91, 983–992. DOI: 10.1080/01621459.1996.10476968.
  • ——— (1998), “Principal Hessian Directions Revisited,” Journal of the American Statistical Association, 93, 84–94.
  • ——— (2004), “Testing Predictor Contributions in Sufficient Dimension Reduction,” The Annals of Statistics, 32, 1062–1092.
  • Dasgupta, A., Drineas, P., Harb, B., Josifovski, V., and Mahoney, M. W. (2007), “Feature Selection Methods for Text Classification,” in Proceedings of the 13th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 230–239. New York: ACM. DOI: 10.1145/1281192.1281220.
  • Drineas, P., Kannan, R., and Mahoney, M. W. (2006), “Fast Monte Carlo Algorithms for Matrices III: Computing a Compressed Approximate Matrix Decomposition,” SIAM Journal on Computing, 36, 184–206. DOI: 10.1137/S0097539704442702.
  • Drineas, P., Mahoney, M. W., and Muthukrishnan, S. (2008), “Relative-error CUR Matrix Decompositions,” SIAM Journal on Matrix Analysis and Applications, 30, 844–881. DOI: 10.1137/07070471X.
  • Duan, N., and Li, K.-C. (1991), “Slicing Regression: A Link-free Regression Method,” The Annals of Statistics, 19, 505–530. DOI: 10.1214/aos/1176348109.
  • Efroymson, M. (1960), “Multiple Regression Analysis,” Mathematical Methods for Digital Computers, 1, 191–203.
  • Fan, J., Feng, Y., and Song, R. (2011), “Nonparametric Independence Screening in Sparse Ultra-high-dimensional Additive Models,” Journal of the American Statistical Association, 106, 544–557. DOI: 10.1198/jasa.2011.tm09779.
  • Fan, J., and Li, R. (2001), “Variable Selection Via Nonconcave Penalized Likelihood and Its Oracle Properties,” Journal of the American Statistical Association, 96, 1348–1360. DOI: 10.1198/016214501753382273.
  • Fan, J., and Lv, J. (2008), “Sure Independence Screening for Ultrahigh Dimensional Feature Space,” Journal of the Royal Statistical Society, Series B, 70, 849–911. DOI: 10.1111/j.1467-9868.2008.00674.x.
  • ——— (2010), “A Selective Overview of Variable Selection in High Dimensional Feature Space,” Statistica Sinica, 20, 101–148.
  • Fan, J., Samworth, R., and Wu, R. (2009), “Ultrahigh Dimensional Feature Selection: Beyond the Linear Model,” The Journal of Machine Learning Research, 10, 2013–2038.
  • Fan, J., and Wang, W. (2015), “Asymptotics of Empirical Eigen-structure for Ultra-high Dimensional Spiked Covariance Model,” arXiv:1502.04733.
  • Gallant, A. R., Rossi, P. E., and Tauchen, G. (1993), “Nonlinear Dynamic Structures,” Econometrica: Journal of the Econometric Society, 61, 871–907. DOI: 10.2307/2951766.
  • Hall, P., and Miller, H. (2009), “Using Generalized Correlation to Effect Variable Selection in Very High Dimensional Problems,” Journal of Computational and Graphical Statistics, 18, 533–550. DOI: 10.1198/jcgs.2009.08041.
  • Hall, P., Titterington, D., and Xue, J.-H. (2009), “Tilting Methods for Assessing the Influence of Components in a Classifier,” Journal of the Royal Statistical Society, Series B, 71, 783–803. DOI: 10.1111/j.1467-9868.2009.00701.x.
  • Huang, J., Horowitz, J. L., and Ma, S. (2008), “Asymptotic Properties of Bridge Estimators in Sparse High-dimensional Regression Models,” Annals of Statistics, 36, 587–613.
  • Johnstone, I. M. (2001), “On the Distribution of the Largest Eigenvalue in Principal Components Analysis,” Annals of Statistics, 29, 295–327.
  • Kondo, N., Toyama, T., Sugiura, H., Fujii, Y., and Yamashita, H. (2008), “miR-206 Expression is Down-regulated in Estrogen Receptor α–positive Human Breast Cancer,” Cancer Research, 68, 5004–5008. DOI: 10.1158/0008-5472.CAN-08-0180.
  • Li, K.-C. (1991), “Sliced Inverse Regression for Dimension Reduction,” Journal of the American Statistical Association, 86, 316–327. DOI: 10.1080/01621459.1991.10475035.
  • Li, R., Zhong, W., and Zhu, L. (2012), “Feature Screening Via Distance Correlation Learning,” Journal of the American Statistical Association, 107, 1129–1139. DOI: 10.1080/01621459.2012.695654.
  • Liu, X., Ma, Y., Yang, W., Wu, X., Jiang, L., and Chen, X. (2015), “Identification of Therapeutic Targets for Breast Cancer Using Biological Informatics Methods,” Molecular Medicine Reports, 12, 1789–1795. DOI: 10.3892/mmr.2015.3565.
  • Love, M. I., Huber, W., and Anders, S. (2014), “Moderated Estimation of Fold Change and Dispersion for RNA-seq Data With DESeq2,” Genome Biology, 15, 550. DOI: 10.1186/s13059-014-0550-8.
  • Ma, P., Mahoney, M., and Yu, B. (2014), “A Statistical Perspective on Algorithmic Leveraging,” in International Conference on Machine Learning, Beijing, China, eds. E. P. Xing and T. Jebara, PMLR, pp. 91–99. http://proceedings.mlr.press/v32/ma14.pdf
  • Ma, P., and Sun, X. (2015), “Leveraging for Big Data Regression,” Wiley Interdisciplinary Reviews: Computational Statistics, 7, 70–76. DOI: 10.1002/wics.1324.
  • Mahoney, M. W., and Drineas, P. (2009), “CUR Matrix Decompositions for Improved Data Analysis,” Proceedings of the National Academy of Sciences, 106, 697–702. DOI: 10.1073/pnas.0803205106.
  • Mahoney, M. W., Maggioni, M., and Drineas, P. (2008), “Tensor-CUR Decompositions for Tensor-based Data,” SIAM Journal on Matrix Analysis and Applications, 30, 957–987. DOI: 10.1137/060665336.
  • Musco, C., and Musco, C. (2015), “Randomized Block Krylov Methods for Stronger and Faster Approximate Singular Value Decomposition,” in Advances in Neural Information Processing Systems, eds. C. Cortes, N. Lawrence, D. Lee, M. Sugiyama, and R. Garnett, Curran Associates, Inc., pp. 1396–1404. https://proceedings.neurips.cc/paper/2015/file/1efa39bcaec6f3900149160693694536-Paper.pdf
  • Mutlu, P., Ural, A. U., and Gündüz, U. (2012), “Differential Gene Expression Analysis Related to Extracellular Matrix Components in Drug-resistant RPMI-8226 Cell Line,” Biomedicine & Pharmacotherapy, 66, 228–231.
  • Pandolfi, P. P. (2004), “Aberrant mRNA Translation in Cancer Pathogenesis: An Old Concept Revisited Comes Finally of Age,” Oncogene, 23, 3134–3137. DOI: 10.1038/sj.onc.1207618.
  • Ravikumar, P., Lafferty, J., Liu, H., and Wasserman, L. (2009), “Sparse Additive Models,” Journal of the Royal Statistical Society, Series B, 71, 1009–1030. DOI: 10.1111/j.1467-9868.2009.00718.x.
  • Shamir, O. (2016), “Fast Stochastic Algorithms for SVD and PCA: Convergence Properties and Convexity,” in International Conference on Machine Learning, eds. M. F. Balcan and K. Q. Weinberger, PMLR, pp. 248–256. http://proceedings.mlr.press/v48/shamira16.pdf
  • Shen, D., Shen, H., and Marron, J. (2014), “A General Framework for Consistency of Principal Component Analysis,” Journal of Machine Learning Research, 17, 1–34.
  • Shen, D., Shen, H., Zhu, H., and Marron, J. (2016), “The Statistics and Mathematics of High Dimension Low Sample Size Asymptotics,” Statistica Sinica, 26, 1747.
  • Siddiqui, N., and Borden, K. L. (2012), “mRNA Export and Cancer,” Wiley Interdisciplinary Reviews: RNA, 3, 13–25. DOI: 10.1002/wrna.101.
  • Ståhl, P. L., Salmén, F., Vickovic, S., Lundmark, A., Navarro, J. F., Magnusson, J., Giacomello, S., Asp, M., Westholm, J. O., Huss, M., Mollbrink, A., Linnarsson, S., Codeluppi, S., Borg, Å., Pontén, F., Costea, P. I., Sahlén, P., Mulder, J., Bergmann, O., Lundeberg, J., and Frisén, J. (2016), “Visualization and Analysis of Gene Expression in Tissue Sections by Spatial Transcriptomics,” Science, 353, 78–82. DOI: 10.1126/science.aaf2403.
  • Stewart, G. (1998), “Four Algorithms for the Efficient Computation of Truncated Pivoted QR Approximation to a Sparse Matrix,” CS Technical Report TR-98-12, College Park, MD: University of Maryland.
  • Tibshirani, R. (1996), “Regression Shrinkage and Selection Via the Lasso,” Journal of the Royal Statistical Society, Series B, 58, 267–288. DOI: 10.1111/j.2517-6161.1996.tb02080.x.
  • Wang, H. (2009), “Forward Regression for Ultra-high Dimensional Variable Screening,” Journal of the American Statistical Association, 104, 1512–1524. DOI: 10.1198/jasa.2008.tm08516.
  • Wu, L., and Qu, X. (2015), “Cancer Biomarker Detection: Recent Achievements and Challenges,” Chemical Society Reviews, 44, 2963–2997. DOI: 10.1039/c4cs00370e.
  • Yuan, M., and Lin, Y. (2007), “On the Non-negative Garrotte Estimator,” Journal of the Royal Statistical Society, Series B, 69, 143–161. DOI: 10.1111/j.1467-9868.2007.00581.x.
  • Zeng, P., and Zhu, Y. (2010), “An Integral Transform Method for Estimating the Central Mean and Central Subspaces,” Journal of Multivariate Analysis, 101, 271–290. DOI: 10.1016/j.jmva.2009.08.004.
  • Zhong, W., Zhang, T., Zhu, Y., and Liu, J. S. (2012), “Correlation Pursuit: Forward Stepwise Variable Selection for Index Models,” Journal of the Royal Statistical Society, Series B, 74, 849–870. DOI: 10.1111/j.1467-9868.2011.01026.x.
  • Zhou, T., Zhu, L., Xu, C., and Li, R. (2020), “Model-free Forward Screening Via Cumulative Divergence,” Journal of the American Statistical Association, 115, 1393–1405. DOI: 10.1080/01621459.2019.1632078.
  • Zhu, L., Miao, B., and Peng, H. (2006), “On Sliced Inverse Regression With High-dimensional Covariates,” Journal of the American Statistical Association, 101, 630–643. DOI: 10.1198/016214505000001285.
  • Zhu, L.-P., Li, L., Li, R., and Zhu, L.-X. (2011), “Model-free Feature Screening for Ultrahigh-dimensional Data,” Journal of the American Statistical Association, 106, 1464–1475. DOI: 10.1198/jasa.2011.tm10563.
  • Zou, H., and Hastie, T. (2005), “Regularization and Variable Selection Via the Elastic Net,” Journal of the Royal Statistical Society, Series B, 67, 301–320. DOI: 10.1111/j.1467-9868.2005.00503.x.
  • Zou, H., and Li, R. (2008), “One-step Sparse Estimates in Nonconcave Penalized Likelihood Models,” Annals of Statistics, 36, 1509.
