Learning Bilingual Word Embedding Mappings with Similar Words in Related Languages Using GAN

Ghafour Alipour, Department of Computer Engineering, Urmia Branch, Islamic Azad University, Urmia, Iran

https://orcid.org/0000-0002-2070-4334

Jamshid Bagherzadeh Mohasefi, Department of Computer Engineering, Urmia University, Urmia, Iran (corresponding author)

Mohammad-Reza Feizi-Derakhshi, Department of Computer Engineering, Urmia Branch, Islamic Azad University, Urmia, Iran; Faculty of Electrical and Computer Engineering, University of Tabriz, Tabriz, Iran

Article: 2019885 | Received 14 Jan 2021, Accepted 09 Dec 2021, Published online: 08 Feb 2022

https://doi.org/10.1080/08839514.2021.2019885

References

Ammar, W., G. Mulcaire, Y. Tsvetkov, G. Lample, C. Dyer, and N. A. Smith. 2016. Massively multilingual word embeddings. CoRR abs/1602.01925. http://arxiv.org/abs/1602.01925
Arjovsky, M., S. Chintala, and L. Bottou. 2017. Wasserstein GAN. CoRR abs/1701.07875.
Artetxe, M., G. Labaka, and E. Agirre. 2016. Learning principled bilingual mappings of word embeddings while preserving monolingual invariance. Conference on Empirical Methods in Natural Language Processing, 2289–2294.
Artetxe, M., G. Labaka, and E. Agirre. 2017. Learning bilingual word embeddings with (almost) no bilingual data. Proceedings of ACL, ACL, 451–62.
Artetxe, M., G. Labaka, and E. Agirre. 2018. Generalizing and improving bilingual word embedding mappings with a multi-step framework of linear transformations. Proceedings of the AAAI Conference on Artificial Intelligence 32 (1). https://ojs.aaai.org/index.php/AAAI/article/view/11992
Bahdanau, D., K. Cho, and Y. Bengio. 2016. Neural Machine Translation by Jointly Learning to Align and Translate. https://arxiv.org/abs/1409.0473
Bojanowski, P., E. Grave, A. Joulin, and T. Mikolov. 2017. Enriching word vectors with subword information. Transactions of the Association for Computational Linguistics 5:135–46. doi:10.1162/tacl_a_00051.
Collobert, R., and J. Weston. 2008. A unified architecture for natural language processing. Proceedings of the 25th International Conference on Machine Learning - ICML ’08. 20 (1) 160–167.
Conneau, A., G. Lample, M. Ranzato, L. Denoyer, and H. Jégou. 2018. Word translation without parallel data. 6th International Conference on Learning Representations, Vancouver, BC, Canada (OpenReview.net).
Dinu, G., A. Lazaridou, and M. Baroni. 2015. Improving zero-shot learning by mitigating the hubness problem. In Proceedings of ICLR (Workshop Track).
Duong, L., H. Kanayama, T. Ma, S. Bird, and T. Cohn. 2016. Learning cross-lingual word embeddings without bilingual corpora, Proceedings of EMNLP, 1285–1295.
Faruqui, M., and C. Dyer. 2014. Improving vector space word representations using multilingual correlation. Proceedings of the 14th Conference of the European Chapter of the Association for Computational Linguistics, Gothenburg, Sweden, 462–71.
Goodfellow, I., J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, and S. Ozair. 2014. Generative adversarial nets. Neural Information Processing Systems 27 2672–2680.
Gouws, S., Y. Bengio, and G. Corrado. 2015. Fast bilingual distributed representations without word alignments. In Proceedings of the 32nd International Conference on Machine Learning (ICML-15), Red Hook, NY, 748–756.
Hammarström, H., R. Forkel, and M. Haspelmath. 2017. Turkic. In Glottolog 3.0. Jena, Germany: Max Planck Institute for the Science of Human History, vol. 3.
Hauer, B., G. Nicolai, and G. Kondrak. 2017. Bootstrapping unsupervised bilingual lexicon induction. In Proceedings of EACL, 619–624.
Hoshen, Y., and L. Wolf. 2018. Non-adversarial unsupervised word translation. Proc. of the Conference on Empirical Methods in Natural Language Processing, Association for Computational Linguistics, N. Eight Street, Stroudsburg, PA 18360, United States, 469–78.
Calixto, I., Q. Liu, and N. Campbell. 2017. Multilingual multi-modal embeddings for natural language processing. CoRR abs/1702.01101.
Jinsong, S., S. Zhenqiao, L. Yaojie, X. Mu, W. Changxing, and C. Yidong. 2018b. Exploring implicit semantic constraints for bilingual word embeddings. Neural Processing Letters 48:1073–1088. doi:10.1007/s11063-017-9762-8.
Jinsong, S., W. Shan, Z. Biao, W. Changxing, Q. Yue, and X. Deyi. 2018a. A neural generative autoencoder for bilingual word embeddings. Information Sciences 424:287–300. doi:10.1016/j.ins.2017.09.070.
Joulin, A., E. Grave, P. Bojanowski, and T. Mikolov. 2016. Bag of tricks for efficient text classification. https://arxiv.org/abs/1607.01759v1
Kondrak, G., B. Hauer, and G. Nicolai. 2017. Bootstrapping unsupervised bilingual lexicon induction. Proceedings of EACL 2:619–624. doi:10.18653/v1/E17-2098.
Lample, G., A. Conneau, L. Denoyer, and M. Ranzato. 2018. Unsupervised machine translation using monolingual corpora only. arXiv:1711.00043
Lazaridou, A., G. Dinu, and M. Baroni. 2015. Hubness and pollution: Delving into cross space mapping for zero-shot learning. Proceedings of ACL, Beijing, China.
Levy, O., A. Søgaard, and Y. Goldberg. 2017. A strong baseline for learning cross-lingual word embeddings from sentence alignments. Proceedings of EACL 1:765–774.
Levy, O., Y. Goldberg, and I. Dagan. 2015. Improving distributional similarity with lessons learned from word embeddings. Transactions of the Association for Computational Linguistics 3:211–25. doi:10.1162/tacl_a_00134.
Lu, A., W. Wang, M. Bansal, K. Gimpel, and K. Livescu. 2015. Deep multilingual correlation for improved word embeddings. Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 1:250–256.
Luong, M., and C. Manning. 2016. Achieving Open Vocabulary Neural Machine Translation with Hybrid Word-Character Models. https://arxiv.org/abs/1604.00788
Luong, T., H. Pham, and C. D. Manning. 2015. Bilingual word representations with monolingual quality in mind. Proceedings of the 1st Workshop on Vector Space Modeling for Natural Language Processing, 151–59. doi:10.3115/v1/W15-1521.
Makhzani, A., J. Shlens, N. Jaitly, I. Goodfellow, and B. Frey. 2016. Adversarial autoencoders. https://arxiv.org/abs/1511.05644
Mikolov, T., I. Sutskever, K. Chen, G. Corrado, and J. Dean. 2013b. Distributed representations of words and phrases and their compositionality. https://arxiv.org/abs/1310.4546
Mikolov, T., K. Chen, G. Corrado, and J. Dean. 2013a. Efficient estimation of word representations in vector space. arXiv:1301.3781 2:3111–19.
Mikolov, T., Q. V. Le, and I. Sutskever. 2013. Exploiting similarities among languages for machine translation. https://arxiv.org/abs/1309.4168
Mogadala, A., and A. Rettinger. 2016. Bilingual word embeddings from parallel and nonparallel corpora for cross-language text classification. Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 692–702.
Mrkšić, N., I. Vulić, D. Ó Séaghdha, I. Leviant, R. Reichart, M. Gašić, and A. Korhonen. 2017. Semantic specialization of distributional word vector spaces using monolingual and cross-lingual constraints. Transactions of the Association for Computational Linguistics 5:309–24. doi:10.1162/tacl_a_00063.
Pennington, J., R. Socher, and C. Manning. 2014. Glove: Global vectors for word representation. Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), Doha, Qatar (Association for Computational Linguistics), 1532–1543. https://aclanthology.org/D14-1162 doi:10.3115/v1/D14-1162.
Rajendran, J., M. M. Khapra, S. Chandar, and B. Ravindran. 2016. Bridge correlational neural networks for multilingual multimodal representation learning. Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 1:171–181.
Ruder, S., I. Vulić, and A. Søgaard. 2019. A survey of cross-lingual word embedding models. Journal of Artificial Intelligence Research 65:569–631. doi:10.1613/jair.1.11640.
Shigeto, Y., I. Suzuki, K. Hara, M. Shimbo, and Y. Matsumoto. 2015. Ridge Regression, Hubness, and Zero-Shot Learning. https://arxiv.org/abs/1507.00825
Smith, S. L., D. H. Turban, S. Hamblin, and N. Y. Hammerla. 2017. Offline bilingual word vectors, orthogonal transformations and the inverted softmax. 5th International Conference on Learning Representations (ICLR 2017), April 24-26, 2017, Toulon, France (OpenReview.net).
Upadhyay, S., M. Faruqui, C. Dyer, and D. Roth. 2016. Cross-lingual models of word embeddings: An empirical comparison. Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics, Vol. 1 (Long Papers).
Valerio, A., and M. Barone. 2016. Towards crosslingual distributed representations without parallel text trained with adversarial autoencoders. Proceedings of the 1st Workshop on Representation Learning for NLP, Berlin, Germany (Association for Computational Linguistics), 121–126. https://aclanthology.org/W16-16 doi:10.18653/v1/W16-16.
Vulić, I., and M.-F. Moens. 2015. Bilingual word embeddings from non-parallel document-aligned data applied to bilingual lexicon induction. Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing, Beijing, China (Association for Computational Linguistics), 719–725. https://aclanthology.org/P15-2 doi:10.3115/v1/P15-2.
Vulić, I., and A. Korhonen. 2016. On the role of seed lexicons in learning bilingual word embeddings. Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics, Berlin, Germany (Association for Computational Linguistics), 247–57. https://aclanthology.org/P16-1 doi:10.18653/v1/P16-1.
Xing, C., D. Wang, C. Liu, and Y. Lin. 2015. Normalized word embedding and orthogonal transform for bilingual word translation. Proceedings of NAACL-HLT Denver, USA (Association for Computational Linguistics), 1005–10.
Zeman, D., J. Hajič, M. Popel, M. Potthast, M. Straka, F. Ginter, … S. Petrov. 2018. Multilingual parsing from raw text to universal dependencies. In Proceedings of the CoNLL 2017 Shared Task Vancouver, Canada (Association for Computational Linguistics), 1–19 https://aclanthology.org/K17-3 doi:10.18653/v1/K17-3.
Zhang, M., Y. Liu, H. Luan, and M. Sun. 2017. Adversarial training for unsupervised bilingual lexicon induction. Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics Vancouver, Canada, Vol. 1 (Association for Computational Linguistics), 1959–1970 https://aclanthology.org/P17-1 doi:10.18653/v1/P17-1.
Zhang, Y., D. Gaddy, R. Barzilay, and T. Jaakkola. 2016. Ten pairs to tag – multilingual POS tagging via coarse mapping between embeddings. Proceedings of NAACL-HLT San Diego, USA (Association for Computational Linguistics), 1307–1317.
Zhang, Y., Y. Li, Y. Zhu, and X. Hu. 2020. Wasserstein GAN based on Autoencoder with back-translation for cross-lingual embedding mappings. Pattern Recognition Letters 129:311–16. doi:10.1016/j.patrec.2019.11.033.
