Theory and Methods

Embedding Learning

Pages 307–319 | Received 15 Dec 2018, Accepted 25 May 2020, Published online: 20 Jul 2020

References

  • Bădoiu, M., Demaine, E. D., Hajiaghayi, M., Sidiropoulos, A., and Zadimoghaddam, M. (2008), “Ordinal Embedding: Approximation Algorithms and Dimensionality Reduction,” in Approximation, Randomization and Combinatorial Optimization. Algorithms and Techniques, eds. A. Goel, K. Jansen, J. D. P. Rolim, and R. Rubinfeld, Berlin, Heidelberg: Springer, pp. 21–34.
  • Belkin, M., and Niyogi, P. (2002), “Laplacian Eigenmaps and Spectral Techniques for Embedding and Clustering,” in Advances in Neural Information Processing Systems, pp. 585–591.
  • Berge, C. (1973), Graphs and Hypergraphs, Rome: Food and Agriculture Organization of the United Nations.
  • Casella, G., and Berger, R. L. (2002), Statistical Inference, Pacific Grove, CA: Duxbury.
  • Chen, B., He, S., Li, Z., and Zhang, S. (2012), “Maximum Block Improvement and Polynomial Optimization,” SIAM Journal on Optimization, 22, 87–107. DOI: https://doi.org/10.1137/110834524.
  • Cortes, C., and Vapnik, V. (1995), “Support-Vector Networks,” Machine Learning, 20, 273–297. DOI: https://doi.org/10.1007/BF00994018.
  • Covington, P., Adams, J., and Sargin, E. (2016), “Deep Neural Networks for YouTube Recommendations,” in Proceedings of the 10th ACM Conference on Recommender Systems, ACM, pp. 191–198.
  • de Bruin, T., Kober, J., Tuyls, K., and Babuška, R. (2018), “Integrating State Representation Learning Into Deep Reinforcement Learning,” IEEE Robotics and Automation Letters, 3, 1394–1401. DOI: https://doi.org/10.1109/LRA.2018.2800101.
  • Fan, R. E., Chang, K. W., Hsieh, C. J., Wang, X. R., and Lin, C. J. (2008), “LIBLINEAR: A Library for Large Linear Classification,” Journal of Machine Learning Research, 9, 1871–1874.
  • Fellbaum, C. (2010), “WordNet,” in Theory and Applications of Ontology: Computer Applications, eds. R. Poli, M. Healy, and A. Kameas, Dordrecht: Springer, pp. 231–243.
  • Gelfand, S. B., and Mitter, S. K. (1991), “Recursive Stochastic Algorithms for Global Optimization in Rd,” SIAM Journal on Control and Optimization, 29, 999–1018. DOI: https://doi.org/10.1137/0329055.
  • Genkin, A., Lewis, D. D., and Madigan, D. (2007), “Large-Scale Bayesian Logistic Regression for Text Categorization,” Technometrics, 49, 291–304. DOI: https://doi.org/10.1198/004017007000000245.
  • Glorot, X., Bordes, A., and Bengio, Y. (2011), “Deep Sparse Rectifier Neural Networks,” in International Conference on Artificial Intelligence and Statistics, pp. 315–323.
  • Grover, A., and Leskovec, J. (2016), “node2vec: Scalable Feature Learning for Networks,” in Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, ACM, pp. 855–864.
  • Guo, Y. (2009), “Supervised Exponential Family Principal Component Analysis via Convex Optimization,” in Advances in Neural Information Processing Systems, pp. 569–576.
  • Hastie, T., Friedman, J., and Tibshirani, R. (2009), The Elements of Statistical Learning: Data Mining, Inference, and Prediction, New York: Springer.
  • He, H., Balakrishnan, A., Eric, M., and Liang, P. (2017), “Learning Symmetric Collaborative Dialogue Agents With Dynamic Knowledge Graph Embeddings,” arXiv no. 1704.07130.
  • Hu, M., and Liu, B. (2004), “Mining and Summarizing Customer Reviews,” in Proceedings of the 10th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, ACM, pp. 168–177.
  • Le, Q., and Mikolov, T. (2014), “Distributed Representations of Sentences and Documents,” in International Conference on Machine Learning, pp. 1188–1196.
  • Lee, J. D., Simchowitz, M., Jordan, M. I., and Recht, B. (2016), “Gradient Descent Only Converges to Minimizers,” in Conference on Learning Theory, pp. 1246–1257.
  • Linnainmaa, S. (1976), “Taylor Expansion of the Accumulated Rounding Error,” BIT Numerical Mathematics, 16, 146–160. DOI: https://doi.org/10.1007/BF01931367.
  • Mazumder, R., Hastie, T., and Tibshirani, R. (2010), “Spectral Regularization Algorithms for Learning Large Incomplete Matrices,” Journal of Machine Learning Research, 11, 2287–2322.
  • Mikolov, T., Sutskever, I., Chen, K., Corrado, G. S., and Dean, J. (2013), “Distributed Representations of Words and Phrases and Their Compositionality,” in Advances in Neural Information Processing Systems, pp. 3111–3119.
  • Mikolov, T., Yih, W.-T., and Zweig, G. (2013), “Linguistic Regularities in Continuous Space Word Representations,” in Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp. 746–751.
  • Miratrix, L., Jia, J., Gawalt, B., Yu, B., and Ghaoui, L. E. (2011), “Summarizing Large-Scale, Multiple-Document News Data: Sparse Methods and Human Validation.”
  • Niyogi, P., and Girosi, F. (1996), “On the Relationship Between Generalization Error, Hypothesis Complexity, and Sample Complexity for Radial Basis Functions,” Neural Computation, 8, 819–842. DOI: https://doi.org/10.1162/neco.1996.8.4.819.
  • Pang, B., and Lee, L. (2005), “Seeing Stars: Exploiting Class Relationships for Sentiment Categorization With Respect to Rating Scales,” in Annual Meeting on Association for Computational Linguistics, Association for Computational Linguistics, pp. 115–124.
  • Pennington, J., Socher, R., and Manning, C. (2014), “GloVe: Global Vectors for Word Representation,” in Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing, pp. 1532–1543.
  • Pratt, L. Y. (1993), “Discriminability-Based Transfer Between Neural Networks,” in Advances in Neural Information Processing Systems, pp. 204–211.
  • Rish, I., Grabarnik, G., Cecchi, G. A., Pereira, F., and Gordon, G. J. (2008), “Closed-Form Supervised Dimensionality Reduction With Generalized Linear Models,” in ICML (Vol. 8), pp. 832–839. DOI: https://doi.org/10.1145/1390156.1390261.
  • Rockafellar, R., and Wets, R. (2011), Variational Analysis (Vol. 317), Berlin, Heidelberg: Springer.
  • Roweis, S. T., and Saul, L. K. (2000), “Nonlinear Dimensionality Reduction by Locally Linear Embedding,” Science, 290, 2323–2326. DOI: https://doi.org/10.1126/science.290.5500.2323.
  • Schmidt-Hieber, J. (2017), “Nonparametric Regression Using Deep Neural Networks With ReLU Activation Function,” arXiv no. 1708.06633.
  • Shen, X., Tseng, G. C., Zhang, X., and Wong, W. H. (2003), “On ψ-Learning,” Journal of the American Statistical Association, 98, 724–734. DOI: https://doi.org/10.1198/016214503000000639.
  • Shen, X., and Wang, L. (2007), “Generalization Error for Multi-Class Margin Classification,” Electronic Journal of Statistics, 1, 307–330. DOI: https://doi.org/10.1214/07-EJS069.
  • Shen, X., and Wong, W. H. (1994), “Convergence Rate of Sieve Estimates,” The Annals of Statistics, 22, 580–615. DOI: https://doi.org/10.1214/aos/1176325486.
  • Socher, R., Perelygin, A., Wu, J., Chuang, J., Manning, C. D., Ng, A., and Potts, C. (2013), “Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank,” in Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing, pp. 1631–1642.
  • Stark, C., Breitkreutz, B.-J., Chatr-Aryamontri, A., Boucher, L., Oughtred, R., Livstone, M. S., Nixon, J., Van Auken, K., Wang, X., Shi, X., and Reguly, T. (2010), “The BioGRID Interaction Database: 2011 Update,” Nucleic Acids Research, 39, D698–D704. DOI: https://doi.org/10.1093/nar/gkq1116.
  • Taddy, M. (2013), “Multinomial Inverse Regression for Text Analysis,” Journal of the American Statistical Association, 108, 755–770. DOI: https://doi.org/10.1080/01621459.2012.734168.
  • Tibshirani, R. (1996), “Regression Shrinkage and Selection via the Lasso,” Journal of the Royal Statistical Society, Series B, 58, 267–288. DOI: https://doi.org/10.1111/j.2517-6161.1996.tb02080.x.
  • Toutanova, K., Klein, D., Manning, C. D., and Singer, Y. (2003), “Feature-Rich Part-of-Speech Tagging With a Cyclic Dependency Network,” in Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology, Association for Computational Linguistics, pp. 173–180. DOI: https://doi.org/10.3115/1073445.1073478.
  • Tsybakov, A. B. (2004), “Optimal Aggregation of Classifiers in Statistical Learning,” The Annals of Statistics, 32, 135–166. DOI: https://doi.org/10.1214/aos/1079120131.
  • Udell, M., Horn, C., Zadeh, R., and Boyd, S. (2016), “Generalized Low Rank Models,” Foundations and Trends® in Machine Learning, 9, 1–118. DOI: https://doi.org/10.1561/2200000055.
  • Vapnik, V. (1998), Statistical Learning Theory (Vol. 3), New York: Wiley.
  • Wang, J., Shen, X., Sun, Y., and Qu, A. (2016), “Classification With Unstructured Predictors and an Application to Sentiment Analysis,” Journal of the American Statistical Association, 111, 1242–1253. DOI: https://doi.org/10.1080/01621459.2015.1089771.
  • Weinberger, K., Dasgupta, A., Attenberg, J., Langford, J., and Smola, A. (2009), “Feature Hashing for Large Scale Multitask Learning,” arXiv no. 0902.2206. DOI: https://doi.org/10.1145/1553374.1553516.
  • Xu, Y., and Wang, X. (2018), “Understanding Weight Normalized Deep Neural Networks With Rectified Linear Units,” in Advances in Neural Information Processing Systems, pp. 130–139.
  • Yarotsky, D. (2017), “Error Bounds for Approximations With Deep ReLU Networks,” Neural Networks, 94, 103–114. DOI: https://doi.org/10.1016/j.neunet.2017.07.002.
  • Zhu, J., and Hastie, T. (2002), “Kernel Logistic Regression and the Import Vector Machine,” in Advances in Neural Information Processing Systems, pp. 1081–1088.
