
Optimal Sampling for Generalized Linear Models Under Measurement Constraints

Pages 106-114 | Received 18 Jul 2019, Accepted 01 Jun 2020, Published online: 20 Jul 2020

References

  • Ai, M., Yu, J., Zhang, H., and Wang, H. (2018), “Optimal Subsampling Algorithms for Big Data Regressions,” arXiv no. 1806.06761.
  • Aitkin, M. A., Aitkin, M., Francis, B., and Hinde, J. (2005), Statistical Modelling in GLIM 4 (Vol. 32), Oxford: OUP.
  • Banerji, M., Lahav, O., Lintott, C. J., Abdalla, F. B., Schawinski, K., Bamford, S. P., Andreescu, D., Murray, P., Raddick, M. J., Slosar, A., and Szalay, A. (2010), “Galaxy Zoo: Reproducing Galaxy Morphologies via Machine Learning,” Monthly Notices of the Royal Astronomical Society, 406, 342–353. DOI: 10.1111/j.1365-2966.2010.16713.x.
  • Cai, T. T., and Guo, Z. (2018), “Semi-Supervised Inference for Explained Variance in High-Dimensional Linear Regression and Its Applications,” arXiv no. 1806.06179.
  • Chakrabortty, A., and Cai, T. (2018), “Efficient and Adaptive Linear Regression in Semi-Supervised Settings,” The Annals of Statistics, 46, 1541–1572. DOI: 10.1214/17-AOS1594.
  • Chapelle, O., Schölkopf, B., and Zien, A. (2010), Semi-Supervised Learning (1st ed.), Cambridge, MA: The MIT Press.
  • Drineas, P., Magdon-Ismail, M., Mahoney, M. W., and Woodruff, D. P. (2012), “Fast Approximation of Matrix Coherence and Statistical Leverage,” Journal of Machine Learning Research, 13, 3475–3506.
  • Drineas, P., and Mahoney, M. W. (2016), “RandNLA: Randomized Numerical Linear Algebra,” Communications of the ACM, 59, 80–90. DOI: 10.1145/2842602.
  • Drineas, P., Mahoney, M. W., and Muthukrishnan, S. (2006), “Sampling Algorithms for ℓ2 Regression and Applications,” in Proceedings of the Seventeenth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA ’06, Society for Industrial and Applied Mathematics, Philadelphia, PA, USA, pp. 1127–1136.
  • Drineas, P., Mahoney, M. W., Muthukrishnan, S., and Sarlós, T. (2011), “Faster Least Squares Approximation,” Numerische Mathematik, 117, 219–249. DOI: 10.1007/s00211-010-0331-6.
  • Hall, P., and Heyde, C. C. (2014), Martingale Limit Theory and Its Application, New York: Academic Press.
  • Hamidieh, K. (2018), “A Data-Driven Statistical Model for Predicting the Critical Temperature of a Superconductor,” Computational Materials Science, 154, 346–354. DOI: 10.1016/j.commatsci.2018.07.052.
  • Huber, P. (2004), Robust Statistics, Wiley Series in Probability and Statistics—Applied Probability and Statistics Section Series, Chichester: Wiley.
  • Khuri, A. I., Mukherjee, B., Sinha, B. K., and Ghosh, M. (2006), “Design Issues for Generalized Linear Models: A Review,” Statistical Science, 21, 376–399. DOI: 10.1214/088342306000000105.
  • Kiefer, J. (1959), “Optimum Experimental Designs,” Journal of the Royal Statistical Society, Series B, 21, 272–304. DOI: 10.1111/j.2517-6161.1959.tb00338.x.
  • Ma, P., Mahoney, M. W., and Yu, B. (2015), “A Statistical Perspective on Algorithmic Leveraging,” The Journal of Machine Learning Research, 16, 861–911.
  • McCullagh, P., and Nelder, J. (1989), Generalized Linear Models, Chapman and Hall/CRC Monographs on Statistics and Applied Probability Series (2nd ed.), London: Chapman & Hall.
  • Pukelsheim, F. (2006), Optimal Design of Experiments, Philadelphia, PA: SIAM.
  • Raskutti, G., and Mahoney, M. W. (2016), “A Statistical Perspective on Randomized Sketching for Ordinary Least-Squares,” The Journal of Machine Learning Research, 17, 7508–7538.
  • Reiman, D. M., and Göhre, B. E. (2019), “Deblending Galaxy Superpositions With Branched Generative Adversarial Networks,” Monthly Notices of the Royal Astronomical Society, 485, 2617–2627. DOI: 10.1093/mnras/stz575.
  • Rousseeuw, P. J., and Hubert, M. (2011), “Robust Statistics for Outlier Detection,” Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, 1, 73–79. DOI: 10.1002/widm.2.
  • Ting, D., and Brochu, E. (2018), “Optimal Subsampling With Influence Functions,” in Advances in Neural Information Processing Systems, pp. 3650–3659.
  • van der Vaart, A. W. (2000), Asymptotic Statistics (Vol. 3), Cambridge: Cambridge University Press.
  • Wang, H., Yang, M., and Stufken, J. (2019), “Information-Based Optimal Subdata Selection for Big Data Linear Regression,” Journal of the American Statistical Association, 114, 393–405. DOI: 10.1080/01621459.2017.1408468.
  • Wang, H., Zhu, R., and Ma, P. (2018), “Optimal Subsampling for Large Sample Logistic Regression,” Journal of the American Statistical Association, 113, 829–844. DOI: 10.1080/01621459.2017.1292914.
  • Wang, Y., Yu, A. W., and Singh, A. (2017), “On Computationally Tractable Selection of Experiments in Measurement-Constrained Regression Models,” The Journal of Machine Learning Research, 18, 5238–5278.
  • Xu, P., Yang, J., Roosta, F., Ré, C., and Mahoney, M. W. (2016), “Sub-Sampled Newton Methods With Non-Uniform Sampling,” in Advances in Neural Information Processing Systems, pp. 3000–3008.
  • Zhang, A., Brown, L. D., and Cai, T. T. (2016), “Semi-Supervised Inference: General Theory and Estimation of Means,” arXiv no. 1606.07268.
  • Zhu, X. J. (2005), “Semi-Supervised Learning Literature Survey,” Technical Report, University of Wisconsin-Madison Department of Computer Sciences.
