Research Article

A Word Embedding Model for Analyzing Patterns and Their Distributional Semantics


References

  • Alstott, J., Bullmore, E., & Plenz, D. (2014). powerlaw: A Python package for analysis of heavy-tailed distributions. PloS One, 9(1), e85777. https://doi.org/10.1371/journal.pone.0085777
  • Bahdanau, D., Cho, K., & Bengio, Y. (2015). Neural machine translation by jointly learning to align and translate. International Conference on Learning Representations.
  • Bengio, Y., Ducharme, R., Vincent, P., & Jauvin, C. (2003). A neural probabilistic language model. Journal of Machine Learning Research, 3, 1137–1155.
  • Biber, D. (2009). A corpus-driven approach to formulaic language in English. International Journal of Corpus Linguistics, 14(3), 275–311. https://doi.org/10.1075/ijcl.14.3.08bib
  • Clauset, A., Shalizi, C. R., & Newman, M. E. (2009). Power-law distributions in empirical data. SIAM Review, 51(4), 661–703. https://doi.org/10.1137/070710111
  • Dai, X., Liang, Y., & Qu, Y. (2016). A synergetic approach to the relationship between the length and frequency of Chinese formulaic sequences. Journal of Zhejiang Education Institute, 6, 24–31. https://doi.org/10.3969/j.issn.2095-2074.2016.06.004
  • Dai, X. T., Qu, Y. H., & Feng, Z. W. (2018). A synergetic approach to the relationship between the length and frequency among English multiword formulaic sequences. Journal of Quantitative Linguistics, 25(1), 22–37. https://doi.org/10.1080/09296174.2017.1338119
  • Deerwester, S., Dumais, S. T., Furnas, G. W., Landauer, T. K., & Harshman, R. (1990). Indexing by latent semantic analysis. Journal of the American Society for Information Science, 41(6), 391–407. https://doi.org/10.1002/(SICI)1097-4571(199009)41:6<391::AID-ASI1>3.0.CO;2-9
  • Deng, Y., & Feng, Z. (2013). A quantitative linguistic study on the relationship between word length and word frequency. Journal of Foreign Languages, 36(3), 29–39.
  • Devlin, J., Chang, M.-W., Lee, K., & Toutanova, K. (2018). BERT: Pre-training of deep bidirectional transformers for language understanding. Retrieved from https://arxiv.org/abs/1810.04805
  • Hochreiter, S., & Schmidhuber, J. (1997). Long short-term memory. Neural Computation, 9(8), 1735–1780. https://doi.org/10.1162/neco.1997.9.8.1735
  • Hofmann, T. (1999). Probabilistic latent semantic analysis. In Proceedings of the Fifteenth Conference on Uncertainty in Artificial Intelligence, 289–296. https://arxiv.org/pdf/1301.6705.pdf
  • Huang, E. H., Socher, R., Manning, C. D., & Ng, A. Y. (2012). Improving word representations via global context and multiple word prototypes. Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics: Long Papers, 1, 873–882.
  • Hunston, S., & Francis, G. (2000). Pattern grammar: A corpus-driven approach to the lexical grammar of English. Amsterdam: Benjamins.
  • Landauer, T. K., Foltz, P. W., & Laham, D. (1998). An introduction to latent semantic analysis. Discourse Processes, 25(2–3), 259–284. https://doi.org/10.1080/01638539809545028
  • Le, Q. V., & Mikolov, T. (2014). Distributed representations of sentences and documents. Proceedings of the 31st International Conference on International Conference on Machine Learning, 32, II-1188–II-1196. Retrieved from https://arxiv.org/abs/1405.4053
  • Levy, O., & Goldberg, Y. (2014). Neural word embedding as implicit matrix factorization. Advances in Neural Information Processing Systems, 3, 2177–2185.
  • Li, W. (1992). Random texts exhibit Zipf’s-law-like word frequency distribution. IEEE Transactions on Information Theory, 38(6), 1842–1845. https://doi.org/10.1109/18.165464
  • Liu, Y., Liu, Z., Chua, T.-S., & Sun, M. (2015). Topical word embeddings. Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence, 2418–2424.
  • Luong, T., Pham, H., & Manning, C. D. (2015). Effective approaches to attention-based neural machine translation. Empirical Methods in Natural Language Processing, 1412–1421. https://doi.org/10.18653/v1/D15-1166
  • Mikolov, T., Chen, K., Corrado, G., & Dean, J. (2013). Efficient estimation of word representations in vector space. In Proceedings of the international conference on learning representations, Scottsdale, Arizona, USA.
  • Mikolov, T., Karafiat, M., Burget, L., Cernocky, J., & Khudanpur, S. (2010). Recurrent neural network based language model. In Eleventh annual conference of the international speech communication association, Makuhari, Chiba, Japan.
  • Mikolov, T., Sutskever, I., Chen, K., Corrado, G. S., & Dean, J. (2013). Distributed representations of words and phrases and their compositionality. Proceedings of the 26th International Conference on Neural Information Processing Systems, 2, 3111–3119.
  • Mikolov, T., Yih, W.-T., & Zweig, G. (2013). Linguistic regularities in continuous space word representations. In Proceedings of the 2013 conference of the North American chapter of the association for computational linguistics: Human language technologies, 746–751, Atlanta, Georgia. Retrieved from https://www.aclweb.org/anthology/N13-1090
  • Miller, G. A., Newman, E. B., & Friedman, E. A. (1958). Length-frequency statistics for written English. Information and Control, 1(4), 370–389. https://doi.org/10.1016/S0019-9958(58)90229-8
  • Newman, M. E. (2005). Power laws, Pareto distributions and Zipf’s law. Contemporary Physics, 46(5), 323–351. https://doi.org/10.1080/00107510500052444
  • Pennington, J., Socher, R., & Manning, C. (2014). GloVe: Global vectors for word representation. In Proceedings of the 2014 conference on empirical methods in natural language processing (EMNLP), Doha, Qatar.
  • Peters, M. E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., & Zettlemoyer, L. (2018). Deep contextualized word representations. In Proceedings of the conference of the North American chapter of the association for computational linguistics: Human language technologies, New Orleans, Louisiana, USA.
  • Reisinger, J., & Mooney, R. J. (2010). Multi-prototype vector-space models of word meaning. Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics, 109–117.
  • Ren, Y., Zhang, Y., Zhang, M., & Ji, D. (2016). Improving twitter sentiment classification using topic-enriched multi-prototype word embeddings. Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence, 3038–3044.
  • Schuster, M., & Paliwal, K. K. (1997). Bidirectional recurrent neural networks. IEEE Transactions on Signal Processing, 45(11), 2673–2681. https://doi.org/10.1109/78.650093
  • Tian, F., Bian, J., Gao, B., Zhang, R., Chen, E., & Liu, T.-Y. (2014). A probabilistic model for learning multi-prototype word embeddings. In Proceedings of COLING 2014, the 25th international conference on computational linguistics, Dublin, Ireland.
  • Vendler, Z. (1967). Verbs and times. In Linguistics in Philosophy (pp. 97–121). Ithaca, New York: Cornell University Press.
  • Verkuyl, H. J. (1972). On the compositional nature of the aspects. In Foundations of language (supplementary series), 15. Dordrecht: Reidel.
  • Wang, X. Y., Feng, Z. W., Zhang, D., & Qu, Y. H. (2018). Modern Chinese sentence extended pattern grammar model for natural language processing. Journal of Xiamen University (Natural Science), 57(6). https://doi.org/10.6043/j.issn.0438-0479.201805007
  • Wyllys, R. E. (1981). Empirical and theoretical bases of Zipf’s law. Graduate School of Library and Information Science. University of Illinois.
