Research Article

A Word Embedding Model for Analyzing Patterns and Their Distributional Semantics


References

  • Alstott, J., Bullmore, E., & Plenz, D. (2014). powerlaw: A Python package for analysis of heavy-tailed distributions. PloS One, 9(1), e85777. https://doi.org/10.1371/journal.pone.0085777
  • Bahdanau, D., Cho, K., & Bengio, Y. (2015). Neural machine translation by jointly learning to align and translate. International Conference on Learning Representations.
  • Bengio, Y., Ducharme, R., Vincent, P., & Jauvin, C. (2003). A neural probabilistic language model. Journal of Machine Learning Research, 3, 1137–1155.
  • Biber, D. (2009). A corpus-driven approach to formulaic language in English. International Journal of Corpus Linguistics, 14(3), 275–311. https://doi.org/10.1075/ijcl.14.3.08bib
  • Clauset, A., Shalizi, C. R., & Newman, M. E. (2009). Power-law distributions in empirical data. SIAM Review, 51(4), 661–703. https://doi.org/10.1137/070710111
  • Dai, X., Liang, Y., & Qu, Y. (2016). A synergetic approach to the relationship between the length and frequency of Chinese formulaic sequences. Journal of Zhejiang Education Institute, 6, 24–31. https://doi.org/10.3969/j.issn.2095-2074.2016.06.004
  • Dai, X. T., Qu, Y. H., & Feng, Z. W. (2018). A synergetic approach to the relationship between the length and frequency among English multiword formulaic sequences. Journal of Quantitative Linguistics, 25(1), 22–37. https://doi.org/10.1080/09296174.2017.1338119
  • Deerwester, S., Dumais, S. T., Furnas, G. W., Landauer, T. K., & Harshman, R. (1990). Indexing by latent semantic analysis. Journal of the American Society for Information Science, 41(6), 391–407. https://doi.org/10.1002/(SICI)1097-4571(199009)41:6<391::AID-ASI1>3.0.CO;2-9
  • Deng, Y., & Feng, Z. (2013). A quantitative linguistic study on the relationship between word length and word frequency. Journal of Foreign Languages, 36(3), 29–39.
  • Devlin, J., Chang, M.-W., Lee, K., & Toutanova, K. (2018). BERT: Pre-training of deep bidirectional transformers for language understanding. Retrieved from https://arxiv.org/abs/1810.04805
  • Hochreiter, S., & Schmidhuber, J. (1997). Long short-term memory. Neural Computation, 9(8), 1735–1780. https://doi.org/10.1162/neco.1997.9.8.1735
  • Hofmann, T. (1999). Probabilistic latent semantic analysis. In Proceedings of the Fifteenth Conference on Uncertainty in Artificial Intelligence, 289–296. https://arxiv.org/pdf/1301.6705.pdf
  • Huang, E. H., Socher, R., Manning, C. D., & Ng, A. Y. (2012). Improving word representations via global context and multiple word prototypes. Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics: Long Papers, 1, 873–882.
  • Hunston, S., & Francis, G. (2000). Pattern grammar: A corpus-driven approach to the lexical grammar of English. Amsterdam: Benjamins.
  • Landauer, T. K., Foltz, P. W., & Laham, D. (1998). An introduction to latent semantic analysis. Discourse Processes, 25(2–3), 259–284. https://doi.org/10.1080/01638539809545028
  • Le, Q. V., & Mikolov, T. (2014). Distributed representations of sentences and documents. Proceedings of the 31st International Conference on International Conference on Machine Learning, 32, II-1188–II-1196. Retrieved from https://arxiv.org/abs/1405.4053
  • Levy, O., & Goldberg, Y. (2014). Neural word embedding as implicit matrix factorization. Advances in Neural Information Processing Systems, 3, 2177–2185.
  • Li, W. (1992). Random texts exhibit Zipf’s-law-like word frequency distribution. IEEE Transactions on Information Theory, 38(6), 1842–1845. https://doi.org/10.1109/18.165464
  • Liu, Y., Liu, Z., Chua, T.-S., & Sun, M. (2015). Topical word embeddings. Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence, 2418–2424.
  • Luong, T., Pham, H., & Manning, C. D. (2015). Effective approaches to attention-based neural machine translation. Empirical Methods in Natural Language Processing, 1412–1421. https://doi.org/10.18653/v1/D15-1166
  • Mikolov, T., Chen, K., Corrado, G., & Dean, J. (2013). Efficient estimation of word representations in vector space. In Proceedings of the international conference on learning representations, Scottsdale, Arizona, USA.
  • Mikolov, T., Karafiat, M., Burget, L., Cernocky, J., & Khudanpur, S. (2010). Recurrent neural network based language model. In Eleventh annual conference of the international speech communication association, Makuhari, Chiba, Japan.
  • Mikolov, T., Sutskever, I., Chen, K., Corrado, G. S., & Dean, J. (2013). Distributed representations of words and phrases and their compositionality. Proceedings of the 26th International Conference on Neural Information Processing Systems, 2, 3111–3119.
  • Mikolov, T., Yih, W.-T., & Zweig, G. (2013). Linguistic regularities in continuous space word representations. In Proceedings of the 2013 conference of the North American chapter of the association for computational linguistics: Human language technologies, 746–751, Atlanta, Georgia. Retrieved from https://www.aclweb.org/anthology/N13-1090
  • Miller, G. A., Newman, E. B., & Friedman, E. A. (1958). Length-frequency statistics for written English. Information and Control, 1(4), 370–389. https://doi.org/10.1016/S0019-9958(58)90229-8
  • Newman, M. E. (2005). Power laws, Pareto distributions and Zipf’s law. Contemporary Physics, 46(5), 323–351. https://doi.org/10.1080/00107510500052444
  • Pennington, J., Socher, R., & Manning, C. (2014). GloVe: Global vectors for word representation. In Proceedings of the 2014 conference on empirical methods in natural language processing (EMNLP), Doha, Qatar.
  • Peters, M. E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., & Zettlemoyer, L. (2018). Deep contextualized word representations. In Proceedings of the conference of the North American chapter of the association for computational linguistics: Human language technologies, New Orleans, Louisiana, USA.
  • Reisinger, J., & Mooney, R. J. (2010). Multi-prototype vector-space models of word meaning. Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics, 109–117.
  • Ren, Y., Zhang, Y., Zhang, M., & Ji, D. (2016). Improving twitter sentiment classification using topic-enriched multi-prototype word embeddings. Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence, 3038–3044.
  • Schuster, M., & Paliwal, K. K. (1997). Bidirectional recurrent neural networks. IEEE Transactions on Signal Processing, 45(11), 2673–2681. https://doi.org/10.1109/78.650093
  • Tian, F., Bian, J., Gao, B., Zhang, R., Chen, E., & Liu, T.-Y. (2014). A probabilistic model for learning multi-prototype word embeddings. In Proceedings of COLING 2014, the 25th international conference on computational linguistics, Dublin, Ireland.
  • Vendler, Z. (1967). Verbs and times. In Linguistics in Philosophy (pp. 97–121). Ithaca, New York: Cornell University Press.
  • Verkuyl, H. J. (1972). On the compositional nature of the aspects. In Foundations of language (supplementary series), 15. Dordrecht: Reidel.
  • Wang, X. Y., Feng, Z. W., Zhang, D., & Qu, Y. H. (2018). Modern Chinese sentence extended pattern grammar model for natural language processing. Journal of Xiamen University (Natural Science), 57(6). https://doi.org/10.6043/j.issn.0438-0479.201805007
  • Wyllys, R. E. (1981). Empirical and theoretical bases of Zipf’s law. Graduate School of Library and Information Science. University of Illinois.
