Theory and Methods

Embedding Learning

Pages 307–319 | Received 15 Dec 2018, Accepted 25 May 2020, Published online: 20 Jul 2020

References

  • Bădoiu, M., Demaine, E. D., Hajiaghayi, M., Sidiropoulos, A., and Zadimoghaddam, M. (2008), “Ordinal Embedding: Approximation Algorithms and Dimensionality Reduction,” in Approximation, Randomization and Combinatorial Optimization. Algorithms and Techniques, eds. A. Goel, K. Jansen, J. D. P. Rolim, and R. Rubinfeld, Berlin, Heidelberg: Springer, pp. 21–34.
  • Belkin, M., and Niyogi, P. (2002), “Laplacian Eigenmaps and Spectral Techniques for Embedding and Clustering,” in Advances in Neural Information Processing Systems, pp. 585–591.
  • Berge, C. (1973), Graphs and Hypergraphs, Rome: Food and Agriculture Organization of the United Nations.
  • Casella, G., and Berger, R. L. (2002), Statistical Inference, Pacific Grove, CA: Duxbury.
  • Chen, B., He, S., Li, Z., and Zhang, S. (2012), “Maximum Block Improvement and Polynomial Optimization,” SIAM Journal on Optimization, 22, 87–107. DOI: https://doi.org/10.1137/110834524.
  • Cortes, C., and Vapnik, V. (1995), “Support-Vector Networks,” Machine Learning, 20, 273–297. DOI: https://doi.org/10.1007/BF00994018.
  • Covington, P., Adams, J., and Sargin, E. (2016), “Deep Neural Networks for YouTube Recommendations,” in Proceedings of the 10th ACM Conference on Recommender Systems, ACM, pp. 191–198.
  • de Bruin, T., Kober, J., Tuyls, K., and Babuška, R. (2018), “Integrating State Representation Learning Into Deep Reinforcement Learning,” IEEE Robotics and Automation Letters, 3, 1394–1401. DOI: https://doi.org/10.1109/LRA.2018.2800101.
  • Fan, R. E., Chang, K. W., Hsieh, C. J., Wang, X. R., and Lin, C. J. (2008), “LIBLINEAR: A Library for Large Linear Classification,” Journal of Machine Learning Research, 9, 1871–1874.
  • Fellbaum, C. (2010), “WordNet,” in Theory and Applications of Ontology: Computer Applications, eds. R. Poli, M. Healy, and A. Kameas, Dordrecht: Springer, pp. 231–243.
  • Gelfand, S. B., and Mitter, S. K. (1991), “Recursive Stochastic Algorithms for Global Optimization in Rd,” SIAM Journal on Control and Optimization, 29, 999–1018. DOI: https://doi.org/10.1137/0329055.
  • Genkin, A., Lewis, D. D., and Madigan, D. (2007), “Large-Scale Bayesian Logistic Regression for Text Categorization,” Technometrics, 49, 291–304. DOI: https://doi.org/10.1198/004017007000000245.
  • Glorot, X., Bordes, A., and Bengio, Y. (2011), “Deep Sparse Rectifier Neural Networks,” in International Conference on Artificial Intelligence and Statistics, pp. 315–323.
  • Grover, A., and Leskovec, J. (2016), “node2vec: Scalable Feature Learning for Networks,” in Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, ACM, pp. 855–864.
  • Guo, Y. (2009), “Supervised Exponential Family Principal Component Analysis via Convex Optimization,” in Advances in Neural Information Processing Systems, pp. 569–576.
  • Hastie, T., Friedman, J., and Tibshirani, R. (2009), The Elements of Statistical Learning: Data Mining, Inference, and Prediction, New York: Springer.
  • He, H., Balakrishnan, A., Eric, M., and Liang, P. (2017), “Learning Symmetric Collaborative Dialogue Agents With Dynamic Knowledge Graph Embeddings,” arXiv no. 1704.07130.
  • Hu, M., and Liu, B. (2004), “Mining and Summarizing Customer Reviews,” in Proceedings of the 10th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, ACM, pp. 168–177.
  • Le, Q., and Mikolov, T. (2014), “Distributed Representations of Sentences and Documents,” in International Conference on Machine Learning, pp. 1188–1196.
  • Lee, J. D., Simchowitz, M., Jordan, M. I., and Recht, B. (2016), “Gradient Descent Only Converges to Minimizers,” in Conference on Learning Theory, pp. 1246–1257.
  • Linnainmaa, S. (1976), “Taylor Expansion of the Accumulated Rounding Error,” BIT Numerical Mathematics, 16, 146–160. DOI: https://doi.org/10.1007/BF01931367.
  • Mazumder, R., Hastie, T., and Tibshirani, R. (2010), “Spectral Regularization Algorithms for Learning Large Incomplete Matrices,” Journal of Machine Learning Research, 11, 2287–2322.
  • Mikolov, T., Sutskever, I., Chen, K., Corrado, G. S., and Dean, J. (2013), “Distributed Representations of Words and Phrases and Their Compositionality,” in Advances in Neural Information Processing Systems, pp. 3111–3119.
  • Mikolov, T., Yih, W.-T., and Zweig, G. (2013), “Linguistic Regularities in Continuous Space Word Representations,” in Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp. 746–751.
  • Miratrix, L., Jia, J., Gawalt, B., Yu, B., and Ghaoui, L. E. (2011), “Summarizing Large-Scale, Multiple-Document News Data: Sparse Methods and Human Validation.”
  • Niyogi, P., and Girosi, F. (1996), “On the Relationship Between Generalization Error, Hypothesis Complexity, and Sample Complexity for Radial Basis Functions,” Neural Computation, 8, 819–842. DOI: https://doi.org/10.1162/neco.1996.8.4.819.
  • Pang, B., and Lee, L. (2005), “Seeing Stars: Exploiting Class Relationships for Sentiment Categorization With Respect to Rating Scales,” in Annual Meeting on Association for Computational Linguistics, Association for Computational Linguistics, pp. 115–124.
  • Pennington, J., Socher, R., and Manning, C. (2014), “GloVe: Global Vectors for Word Representation,” in Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing, pp. 1532–1543.
  • Pratt, L. Y. (1993), “Discriminability-Based Transfer Between Neural Networks,” in Advances in Neural Information Processing Systems, pp. 204–211.
  • Rish, I., Grabarnik, G., Cecchi, G. A., Pereira, F., and Gordon, G. J. (2008), “Closed-Form Supervised Dimensionality Reduction With Generalized Linear Models,” in ICML (Vol. 8), pp. 832–839. DOI: https://doi.org/10.1145/1390156.1390261.
  • Rockafellar, R., and Wets, R. (2011), Variational Analysis (Vol. 317), Berlin, Heidelberg: Springer.
  • Roweis, S. T., and Saul, L. K. (2000), “Nonlinear Dimensionality Reduction by Locally Linear Embedding,” Science, 290, 2323–2326. DOI: https://doi.org/10.1126/science.290.5500.2323.
  • Schmidt-Hieber, J. (2017), “Nonparametric Regression Using Deep Neural Networks With ReLU Activation Function,” arXiv no. 1708.06633.
  • Shen, X., Tseng, G. C., Zhang, X., and Wong, W. H. (2003), “On ψ-Learning,” Journal of the American Statistical Association, 98, 724–734. DOI: https://doi.org/10.1198/016214503000000639.
  • Shen, X., and Wang, L. (2007), “Generalization Error for Multi-Class Margin Classification,” Electronic Journal of Statistics, 1, 307–330. DOI: https://doi.org/10.1214/07-EJS069.
  • Shen, X., and Wong, W. H. (1994), “Convergence Rate of Sieve Estimates,” The Annals of Statistics, 22, 580–615. DOI: https://doi.org/10.1214/aos/1176325486.
  • Socher, R., Perelygin, A., Wu, J., Chuang, J., Manning, C. D., Ng, A., and Potts, C. (2013), “Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank,” in Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing, pp. 1631–1642.
  • Stark, C., Breitkreutz, B.-J., Chatr-Aryamontri, A., Boucher, L., Oughtred, R., Livstone, M. S., Nixon, J., Van Auken, K., Wang, X., Shi, X., and Reguly, T. (2010), “The BioGRID Interaction Database: 2011 Update,” Nucleic Acids Research, 39, D698–D704. DOI: https://doi.org/10.1093/nar/gkq1116.
  • Taddy, M. (2013), “Multinomial Inverse Regression for Text Analysis,” Journal of the American Statistical Association, 108, 755–770. DOI: https://doi.org/10.1080/01621459.2012.734168.
  • Tibshirani, R. (1996), “Regression Shrinkage and Selection via the Lasso,” Journal of the Royal Statistical Society, Series B, 58, 267–288. DOI: https://doi.org/10.1111/j.2517-6161.1996.tb02080.x.
  • Toutanova, K., Klein, D., Manning, C. D., and Singer, Y. (2003), “Feature-Rich Part-of-Speech Tagging With a Cyclic Dependency Network,” in Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology, Association for Computational Linguistics, pp. 173–180. DOI: https://doi.org/10.3115/1073445.1073478.
  • Tsybakov, A. B. (2004), “Optimal Aggregation of Classifiers in Statistical Learning,” The Annals of Statistics, 32, 135–166. DOI: https://doi.org/10.1214/aos/1079120131.
  • Udell, M., Horn, C., Zadeh, R., and Boyd, S. (2016), “Generalized Low Rank Models,” Foundations and Trends® in Machine Learning, 9, 1–118. DOI: https://doi.org/10.1561/2200000055.
  • Vapnik, V. (1998), Statistical Learning Theory (Vol. 3), New York: Wiley.
  • Wang, J., Shen, X., Sun, Y., and Qu, A. (2016), “Classification With Unstructured Predictors and an Application to Sentiment Analysis,” Journal of the American Statistical Association, 111, 1242–1253. DOI: https://doi.org/10.1080/01621459.2015.1089771.
  • Weinberger, K., Dasgupta, A., Attenberg, J., Langford, J., and Smola, A. (2009), “Feature Hashing for Large Scale Multitask Learning,” arXiv no. 0902.2206. DOI: https://doi.org/10.1145/1553374.1553516.
  • Xu, Y., and Wang, X. (2018), “Understanding Weight Normalized Deep Neural Networks With Rectified Linear Units,” in Advances in Neural Information Processing Systems, pp. 130–139.
  • Yarotsky, D. (2017), “Error Bounds for Approximations With Deep ReLU Networks,” Neural Networks, 94, 103–114. DOI: https://doi.org/10.1016/j.neunet.2017.07.002.
  • Zhu, J., and Hastie, T. (2002), “Kernel Logistic Regression and the Import Vector Machine,” in Advances in Neural Information Processing Systems, pp. 1081–1088.
