Journal of the American Statistical Association Volume 119, 2024 - Issue 546

Theory and Methods

A General M-estimation Theory in Semi-Supervised Framework

Shanshan Song, Department of Statistics, The Chinese University of Hong Kong, Hong Kong, China

Yuanyuan Lin (corresponding author), Department of Statistics, The Chinese University of Hong Kong, Hong Kong, China. ORCID: https://orcid.org/0000-0003-1293-1040

Yong Zhou (corresponding author), KLATASDS-MOE, School of Statistics and Academy of Statistics and Interdisciplinary Sciences, East China Normal University, Shanghai, China

Pages 1065-1075 | Received 11 Oct 2021, Accepted 04 Jan 2023, Published online: 28 Feb 2023

https://doi.org/10.1080/01621459.2023.2169699

References

Ando, R. K., and Zhang, T. (2005), “A Framework for Learning Predictive Structures from Multiple Tasks and Unlabeled Data,” Journal of Machine Learning Research, 6, 1817–1853.
Ando, R. K., and Zhang, T. (2007), “Two-View Feature Generation Model for Semi-supervised Learning,” in Proceedings of the 24th International Conference on Machine Learning, eds. Z. Ghahramani, pp. 25–32, New York: Association for Computing Machinery. DOI: 10.1145/1273496.1273500.
Angrist, J., Chernozhukov, V., and Fernández-Val, I. (2006), “Quantile Regression under Misspecification, with an Application to the US Wage Structure,” Econometrica, 74, 539–563. DOI: 10.1111/j.1468-0262.2006.00671.x.
Azriel, D., Brown, L. D., Sklar, M., Berk, R., Buja, A., and Zhao, L. (2021), “Semi-supervised Linear Regression,” Journal of the American Statistical Association, 117, 2238–2251. DOI: 10.1080/01621459.2021.1915320.
Bai, Z., Rao, C. R., and Wu, Y. (1992), “M-estimation of Multivariate Linear Regression Parameters under a Convex Discrepancy Function,” Statistica Sinica, 2, 237–254.
Begg, M. D., and Lagakos, S. (1990), “On the Consequences of Model Misspecification in Logistic Regression,” Environmental Health Perspectives, 87, 69–75. DOI: 10.1289/ehp.908769.
Belkin, M., Niyogi, P., and Sindhwani, V. (2006), “Manifold Regularization: A Geometric Framework for Learning from Labeled and Unlabeled Examples,” Journal of Machine Learning Research, 7, 2399–2434.
Buja, A., Berk, R., Brown, L. D., George, E., Kuchibhotla, A. K., and Zhao, L. (2016), “Models as Approximations Part II: A General Theory of Model-Robust Regression,” arXiv:1612.03257.
Buja, A., Brown, L. D., Berk, R., George, E., Pitkin, E., Traskin, M., Zhang, K., and Zhao, L. (2019), “Models as Approximations I: Consequences Illustrated with Linear Regression,” Statistical Science, 34, 523–544. DOI: 10.1214/18-STS693.
Cai, T. T., and Guo, Z. (2020), “Semisupervised Inference for Explained Variance in High Dimensional Linear Regression and its Applications,” Journal of the Royal Statistical Society, Series B, 82, 391–419. DOI: 10.1111/rssb.12357.
Chakrabortty, A., and Cai, T. (2018), “Efficient and Adaptive Linear Regression in Semi-supervised Settings,” The Annals of Statistics, 46, 1541–1572. DOI: 10.1214/17-AOS1594.
Chakrabortty, A., Dai, G., and Carroll, R. J. (2022), “Semi-supervised Quantile Estimation: Robust and Efficient Inference in High Dimensional Settings,” arXiv:2201.10208.
Chapelle, O., Schölkopf, B., and Zien, A. (2009), Semi-supervised Learning, Cambridge, MA: MIT Press.
Cheng, D., Ananthakrishnan, A., and Cai, T. (2018), “Efficient and Robust Semi-supervised Estimation of Average Treatment Effects in Electronic Medical Records Data,” arXiv:1804.00195.
Cheng, G., and Huang, J. Z. (2010), “Bootstrap Consistency for General Semiparametric M-estimation,” The Annals of Statistics, 38, 2884–2915. DOI: 10.1214/10-AOS809.
Cozman, F. G., Cohen, I., and Cirelo, M. C. (2003), “Semi-supervised Learning of Mixture Models,” in Proceedings of the 20th International Conference on Machine Learning, eds. T. Fawcett and N. Mishra, pp. 99–106, Menlo Park, CA: AAAI Press.
Deng, S., Ning, Y., Zhao, J., and Zhang, H. (2020), “Optimal Semi-supervised Estimation and Inference for High-dimensional Linear Regression,” arXiv:2011.14185.
Dong, C., Li, G., and Feng, X. (2019), “Lack-of-Fit Tests for Quantile Regression Models,” Journal of the Royal Statistical Society, Series B, 81, 629–648. DOI: 10.1111/rssb.12321.
Grandvalet, Y., and Bengio, Y. (2004), “Semi-supervised Learning by Entropy Minimization,” in Advances in Neural Information Processing Systems, eds. Y. Weiss, B. Schölkopf, and J. Platt, pp. 529–536, Cambridge, MA: MIT Press.
Gronsbell, J. L., and Cai, T. (2018), “Semi-supervised Approaches to Efficient Evaluation of Model Prediction Performance,” Journal of the Royal Statistical Society, Series B, 80, 579–594. DOI: 10.1111/rssb.12264.
He, X., and Zhu, L.-X. (2003), “A Lack-of-Fit Test for Quantile Regression,” Journal of the American Statistical Association, 98, 1013–1022. DOI: 10.1198/016214503000000963.
Henmi, M., Yoshida, R., and Eguchi, S. (2007), “Importance Sampling via the Estimated Sampler,” Biometrika, 94, 985–991. DOI: 10.1093/biomet/asm076.
Johnson, R., and Zhang, T. (2008), “Graph-based Semi-supervised Learning and Spectral Kernel Design,” IEEE Transactions on Information Theory, 54, 275–288. DOI: 10.1109/TIT.2007.911294.
Kawakita, M., and Kanamori, T. (2013), “Semi-supervised Learning with Density-Ratio Estimation,” Machine Learning, 91, 189–209. DOI: 10.1007/s10994-013-5329-8.
Kim, T.-H., and White, H. (2003), “Estimation, Inference, and Specification Testing for Possibly Misspecified Quantile Regression,” Advances in Econometrics, 17, 107–132.
Koenker, R., and Bassett, G. (1978), “Regression Quantiles,” Econometrica, 46, 33–50. DOI: 10.2307/1913643.
Koenker, R., and Portnoy, S. (1990), “M-estimation of Multivariate Regressions,” Journal of the American Statistical Association, 85, 1060–1068. DOI: 10.2307/2289602.
Kriegler, B., and Berk, R. (2010), “Small Area Estimation of the Homeless in Los Angeles: An Application of Cost-Sensitive Stochastic Gradient Boosting,” The Annals of Applied Statistics, 4, 1234–1255. DOI: 10.1214/10-AOAS328.
Lv, J., and Liu, J. S. (2014), “Model Selection Principles in Misspecified Models,” Journal of the Royal Statistical Society, Series B, 76, 141–167. DOI: 10.1111/rssb.12023.
Nigam, K., McCallum, A. K., Thrun, S., and Mitchell, T. (2000), “Text Classification from Labeled and Unlabeled Documents using EM,” Machine Learning, 39, 103–134. DOI: 10.1023/A:1007692713085.
Sokolovska, N., Cappé, O., and Yvon, F. (2008), “The Asymptotics of Semi-supervised Learning in Discriminative Probabilistic Models,” in Proceedings of the 25th International Conference on Machine Learning, eds. W. Cohen, pp. 984–991, New York: Association for Computing Machinery. DOI: 10.1145/1390156.1390280.
van der Vaart, A. W. (2000), Asymptotic Statistics, Cambridge, UK: Cambridge University Press.
Wang, H. J., McKeague, I. W., and Qian, M. (2018), “Testing for Marginal Linear Effects in Quantile Regression,” Journal of the Royal Statistical Society, Series B, 80, 433–452. DOI: 10.1111/rssb.12258.
Wang, J., and Shen, X. (2007), “Large Margin Semi-Supervised Learning,” Journal of Machine Learning Research, 8, 1867–1891.
Wang, J., Shen, X., and Liu, Y. (2008), “Probability Estimation for Large-Margin Classifiers,” Biometrika, 95, 149–167. DOI: 10.1093/biomet/asm077.
Wasserman, L., and Lafferty, J. (2007), “Statistical Analysis of Semi-supervised Regression,” in Advances in Neural Information Processing Systems (Vol. 20), eds. J. C. Platt, D. Koller, Y. Singer, and S. T. Roweis, pp. 801–808, Cambridge, MA: MIT Press.
White, H. (1981), “Consequences and Detection of Misspecified Nonlinear Regression Models,” Journal of the American Statistical Association, 76, 419–433. DOI: 10.1080/01621459.1981.10477663.
White, H. (1982), “Maximum Likelihood Estimation of Misspecified Models,” Econometrica, 50, 1–25. DOI: 10.2307/1912526.
Zeng, D., and Lin, D. Y. (2008), “Efficient Resampling Methods for Nonsmooth Estimating Functions,” Biostatistics, 9, 355–363. DOI: 10.1093/biostatistics/kxm034.
Zhang, A., Brown, L. D., and Cai, T. T. (2019), “Semi-supervised Inference: General Theory and Estimation of Means,” The Annals of Statistics, 47, 2538–2566. DOI: 10.1214/18-AOS1756.
Zhang, T., and Oles, F. (2000), “The Value of Unlabeled Data for Classification Problems,” in Proceedings of the 17th International Conference on Machine Learning (Vol. 20), ed. P. Langley, pp. 1191–1198, San Francisco, CA: Morgan Kaufmann.
Zhou, Q. M., Song, P. X.-K., and Thompson, M. E. (2012), “Information Ratio Test for Model Misspecification in Quasi-Likelihood Inference,” Journal of the American Statistical Association, 107, 205–213. DOI: 10.1080/01621459.2011.645785.
Zhu, X., and Goldberg, A. B. (2009), “Introduction to Semi-supervised Learning,” Synthesis Lectures on Artificial Intelligence and Machine Learning, 3, 1–130. DOI: 10.2200/S00196ED1V01Y200906AIM006.
