Theory and Methods

A General M-estimation Theory in Semi-Supervised Framework

Pages 1065-1075 | Received 11 Oct 2021, Accepted 04 Jan 2023, Published online: 28 Feb 2023

References

  • Ando, R. K., and Zhang, T. (2005), “A Framework for Learning Predictive Structures from Multiple Tasks and Unlabeled Data,” Journal of Machine Learning Research, 6, 1817–1853.
  • Ando, R. K., and Zhang, T. (2007), “Two-View Feature Generation Model for Semi-supervised Learning,” in Proceedings of the 24th International Conference on Machine Learning, ed. Z. Ghahramani, pp. 25–32, New York: Association for Computing Machinery. DOI: 10.1145/1273496.1273500.
  • Angrist, J., Chernozhukov, V., and Fernández-Val, I. (2006), “Quantile Regression under Misspecification, with an Application to the US Wage Structure,” Econometrica, 74, 539–563. DOI: 10.1111/j.1468-0262.2006.00671.x.
  • Azriel, D., Brown, L. D., Sklar, M., Berk, R., Buja, A., and Zhao, L. (2021), “Semi-supervised Linear Regression,” Journal of the American Statistical Association, 117, 2238–2251. DOI: 10.1080/01621459.2021.1915320.
  • Bai, Z., Rao, C. R., and Wu, Y. (1992), “M-estimation of Multivariate Linear Regression Parameters under a Convex Discrepancy Function,” Statistica Sinica, 2, 237–254.
  • Begg, M. D., and Lagakos, S. (1990), “On the Consequences of Model Misspecification in Logistic Regression,” Environmental Health Perspectives, 87, 69–75. DOI: 10.1289/ehp.908769.
  • Belkin, M., Niyogi, P., and Sindhwani, V. (2006), “Manifold Regularization: A Geometric Framework for Learning from Labeled and Unlabeled Examples,” Journal of Machine Learning Research, 7, 2399–2434.
  • Buja, A., Berk, R., Brown, L. D., George, E., Kuchibhotla, A. K., and Zhao, L. (2016), “Models as Approximations Part II: A General Theory of Model-Robust Regression,” arXiv:1612.03257.
  • Buja, A., Brown, L. D., Berk, R., George, E., Pitkin, E., Traskin, M., Zhang, K., and Zhao, L. (2019), “Models as Approximations I: Consequences Illustrated with Linear Regression,” Statistical Science, 34, 523–544. DOI: 10.1214/18-STS693.
  • Cai, T. T., and Guo, Z. (2020), “Semisupervised Inference for Explained Variance in High Dimensional Linear Regression and its Applications,” Journal of the Royal Statistical Society, Series B, 82, 391–419. DOI: 10.1111/rssb.12357.
  • Chakrabortty, A., and Cai, T. (2018), “Efficient and Adaptive Linear Regression in Semi-supervised Settings,” The Annals of Statistics, 46, 1541–1572. DOI: 10.1214/17-AOS1594.
  • Chakrabortty, A., Dai, G., and Carroll, R. J. (2022), “Semi-supervised Quantile Estimation: Robust and Efficient Inference in High Dimensional Settings,” arXiv:2201.10208.
  • Chapelle, O., Schölkopf, B., and Zien, A. (2009), Semi-supervised Learning, Cambridge, MA: MIT Press.
  • Cheng, D., Ananthakrishnan, A., and Cai, T. (2018), “Efficient and Robust Semi-supervised Estimation of Average Treatment Effects in Electronic Medical Records Data,” arXiv:1804.00195.
  • Cheng, G., and Huang, J. Z. (2010), “Bootstrap Consistency for General Semiparametric M-estimation,” The Annals of Statistics, 38, 2884–2915. DOI: 10.1214/10-AOS809.
  • Cozman, F. G., Cohen, I., and Cirelo, M. C. (2003), “Semi-supervised Learning of Mixture Models,” in Proceedings of the 20th International Conference on Machine Learning, eds. T. Fawcett and N. Mishra, pp. 99–106, Menlo Park, CA: AAAI Press.
  • Deng, S., Ning, Y., Zhao, J., and Zhang, H. (2020), “Optimal Semi-supervised Estimation and Inference for High-dimensional Linear Regression,” arXiv:2011.14185.
  • Dong, C., Li, G., and Feng, X. (2019), “Lack-of-Fit Tests for Quantile Regression Models,” Journal of the Royal Statistical Society, Series B, 81, 629–648. DOI: 10.1111/rssb.12321.
  • Grandvalet, Y., and Bengio, Y. (2004), “Semi-supervised Learning by Entropy Minimization,” in Advances in Neural Information Processing Systems, eds. Y. Weiss, B. Schölkopf, and J. Platt, pp. 529–536, Cambridge, MA: MIT Press.
  • Gronsbell, J. L., and Cai, T. (2018), “Semi-supervised Approaches to Efficient Evaluation of Model Prediction Performance,” Journal of the Royal Statistical Society, Series B, 80, 579–594. DOI: 10.1111/rssb.12264.
  • He, X., and Zhu, L.-X. (2003), “A Lack-of-Fit Test for Quantile Regression,” Journal of the American Statistical Association, 98, 1013–1022. DOI: 10.1198/016214503000000963.
  • Henmi, M., Yoshida, R., and Eguchi, S. (2007), “Importance Sampling via the Estimated Sampler,” Biometrika, 94, 985–991. DOI: 10.1093/biomet/asm076.
  • Johnson, R., and Zhang, T. (2008), “Graph-based Semi-supervised Learning and Spectral Kernel Design,” IEEE Transactions on Information Theory, 54, 275–288. DOI: 10.1109/TIT.2007.911294.
  • Kawakita, M., and Kanamori, T. (2013), “Semi-supervised Learning with Density-Ratio Estimation,” Machine Learning, 91, 189–209. DOI: 10.1007/s10994-013-5329-8.
  • Kim, T.-H., and White, H. (2003), “Estimation, Inference, and Specification Testing for Possibly Misspecified Quantile Regression,” Advances in Econometrics, 17, 107–132.
  • Koenker, R., and Bassett, G. (1978), “Regression Quantiles,” Econometrica, 46, 33–50. DOI: 10.2307/1913643.
  • Koenker, R., and Portnoy, S. (1990), “M-estimation of Multivariate Regressions,” Journal of the American Statistical Association, 85, 1060–1068. DOI: 10.2307/2289602.
  • Kriegler, B., and Berk, R. (2010), “Small Area Estimation of the Homeless in Los Angeles: An Application of Cost-Sensitive Stochastic Gradient Boosting,” The Annals of Applied Statistics, 4, 1234–1255. DOI: 10.1214/10-AOAS328.
  • Lv, J., and Liu, J. S. (2014), “Model Selection Principles in Misspecified Models,” Journal of the Royal Statistical Society, Series B, 76, 141–167. DOI: 10.1111/rssb.12023.
  • Nigam, K., McCallum, A. K., Thrun, S., and Mitchell, T. (2000), “Text Classification from Labeled and Unlabeled Documents using EM,” Machine Learning, 39, 103–134. DOI: 10.1023/A:1007692713085.
  • Sokolovska, N., Cappé, O., and Yvon, F. (2008), “The Asymptotics of Semi-supervised Learning in Discriminative Probabilistic Models,” in Proceedings of the 25th International Conference on Machine Learning, eds. W. Cohen, pp. 984–991, New York: Association for Computing Machinery. DOI: 10.1145/1390156.1390280.
  • van der Vaart, A. W. (2000), Asymptotic Statistics, Cambridge, UK: Cambridge University Press.
  • Wang, H. J., McKeague, I. W., and Qian, M. (2018), “Testing for Marginal Linear Effects in Quantile Regression,” Journal of the Royal Statistical Society, Series B, 80, 433–452. DOI: 10.1111/rssb.12258.
  • Wang, J., and Shen, X. (2007), “Large Margin Semi-Supervised Learning,” Journal of Machine Learning Research, 8, 1867–1891.
  • Wang, J., Shen, X., and Liu, Y. (2008), “Probability Estimation for Large-Margin Classifiers,” Biometrika, 95, 149–167. DOI: 10.1093/biomet/asm077.
  • Wasserman, L., and Lafferty, J. (2007), “Statistical Analysis of Semi-supervised Regression,” in Advances in Neural Information Processing Systems (Vol. 20), eds. J. C. Platt, D. Koller, Y. Singer, and S. T. Roweis, pp. 801–808, Cambridge, MA: MIT Press.
  • White, H. (1981), “Consequences and Detection of Misspecified Nonlinear Regression Models,” Journal of the American Statistical Association, 76, 419–433. DOI: 10.1080/01621459.1981.10477663.
  • White, H. (1982), “Maximum Likelihood Estimation of Misspecified Models,” Econometrica, 50, 1–25. DOI: 10.2307/1912526.
  • Zeng, D., and Lin, D. Y. (2008), “Efficient Resampling Methods for Nonsmooth Estimating Functions,” Biostatistics, 9, 355–363. DOI: 10.1093/biostatistics/kxm034.
  • Zhang, A., Brown, L. D., and Cai, T. T. (2019), “Semi-supervised Inference: General Theory and Estimation of Means,” The Annals of Statistics, 47, 2538–2566. DOI: 10.1214/18-AOS1756.
  • Zhang, T., and Oles, F. (2000), “The Value of Unlabeled Data for Classification Problems,” in Proceedings of the 17th International Conference on Machine Learning (Vol. 20), ed. P. Langley, pp. 1191–1198, San Francisco, CA: Morgan Kaufmann.
  • Zhou, Q. M., Song, P. X.-K., and Thompson, M. E. (2012), “Information Ratio Test for Model Misspecification in Quasi-Likelihood Inference,” Journal of the American Statistical Association, 107, 205–213. DOI: 10.1080/01621459.2011.645785.
  • Zhu, X., and Goldberg, A. B. (2009), “Introduction to Semi-supervised Learning,” Synthesis Lectures on Artificial Intelligence and Machine Learning, 3, 1–130. DOI: 10.2200/S00196ED1V01Y200906AIM006.
