Feature Screening for Massive Data Analysis by Subsampling
