References
- Ai, M., J. Yu, H. Zhang, and H. Wang. 2021. Optimal subsampling algorithms for big data regressions. Statistica Sinica 31:749–72. doi:10.5705/ss.202018.0439.
- Cheung, C., H. Peng, and L. Rubchinsky. 2019. A-optimal subsampling for big data general estimating equations. Manuscript. https://scholarworks.iupui.edu/handle/1805/20022.
- Dey, A., and R. Mukerjee. 1999. Fractional factorial plans. New York: John Wiley and Sons.
- Dhillon, P. S., Y. Lu, D. Foster, and L. Ungar. 2013. New subsampling algorithms for fast least squares regression. Advances in Neural Information Processing Systems 26:360–8.
- Drovandi, C. C., C. C. Holmes, J. M. McGree, K. Mengersen, S. Richardson, and E. G. Ryan. 2017. Principles of experimental design for big data analysis. Statistical Science 32 (3):385–404. doi:10.1214/16-STS604.
- Fan, J., F. Han, and H. Liu. 2014. Challenges of big data analysis. National Science Review 1 (2):293–314. doi:10.1093/nsr/nwt032.
- Kiefer, J. C. 1959. Optimum experimental designs. Journal of the Royal Statistical Society: Series B (Methodological) 21 (2):272–319. doi:10.1111/j.2517-6161.1959.tb00338.x.
- Ma, P., M. Mahoney, and B. Yu. 2015. A statistical perspective on algorithmic leveraging. Journal of Machine Learning Research 16:861–911.
- Meeker, W. Q., and Y. Hong. 2014. Reliability meets big data: opportunities and challenges. Quality Engineering 26 (1):102–16. doi:10.1080/08982112.2014.846119.
- Pinar, T. 2014. Prediction of full load electrical power output of a base load operated combined cycle power plant using machine learning methods. International Journal of Electrical Power and Energy Systems 60:126–40.
- Schifano, E. D., J. Wu, C. Wang, J. Yan, and M.-H. Chen. 2016. Online updating of statistical inference in the big data setting. Technometrics: A Journal of Statistics for the Physical, Chemical, and Engineering Sciences 58 (3):393–403. doi:10.1080/00401706.2016.1142900.
- Wang, H. 2019. Divide-and-conquer information-based optimal subdata selection algorithm. Journal of Statistical Theory and Practice 13 (3):114. doi:10.1007/s42519-019-0048-5.
- Wang, L., J. Elmstedt, W. K. Wong, and H. Xu. 2021. Orthogonal subsampling for big data linear regression. Annals of Applied Statistics 15 (3):1273–90. doi:10.1214/21-AOAS1462.
- Wang, H., M. Yang, and J. Stufken. 2019. Information-based optimal subdata selection for big data linear regression. Journal of the American Statistical Association 114 (525):393–405. doi:10.1080/01621459.2017.1408468.
- Wang, H., R. Zhu, and P. Ma. 2018. Optimal subsampling for large sample logistic regression. Journal of the American Statistical Association 113 (522):829–44. doi:10.1080/01621459.2017.1292914.
- Xi, B., H. Chen, W. S. Cleveland, and T. Telkamp. 2010. Statistical analysis and modelling of Internet VoIP traffic for network engineering. Electronic Journal of Statistics 4:58–116. doi:10.1214/09-EJS473.
- Yamada, S., and D. K. Lin. 1999. Three-level supersaturated designs. Statistics & Probability Letters 45 (1):31–9. doi:10.1016/S0167-7152(99)00038-3.
- Yu, J., H. Wang, M. Ai, and H. Zhang. 2020. Optimal distributed subsampling for maximum quasi-likelihood estimators with massive data. Journal of the American Statistical Association. Published online. doi:10.1080/01621459.2020.1773832.