References
- Baum, M. A., & Zhukov, Y. M. (2019). Media ownership and news coverage of international conflict. Political Communication, 36(1), 36–63. http://sci-hub.tw/10.1080/10584609.2018.1483606
- Benoit, K., Watanabe, K., Wang, H., Nulty, P., Obeng, A., Müller, S., & Matsuo, A. (2018). quanteda: An R package for the quantitative analysis of textual data. Journal of Open Source Software, 3(30), 774. https://doi.org/org/0.21105/joss.00774
- Blei, D. M. (2012). Probabilistic topic models. Communications of the ACM, 55(4), 77–84. http://sci-hub.tw/10.1145/2133806.2133826
- Bojanowski, P., Grave, E., Joulin, A., & Mikolov, T. (2017). Enriching word vectors with subword information. Transactions of the Association for Computational Linguistics, 5(Dec), 135–146. http://sci-hub.tw/10.1162/tacl_a_00051
- Boumans, J. W., & Trilling, D. (2016). Taking stock of the toolkit: An overview of relevant automated content analysis approaches and techniques for digital journalism scholars. Digital Journalism, 4(1), 8–23. http://sci-hub.tw/10.1080/21670811.2015.1096598
- Chang, J., Gerrish, S., Wang, C., Boyd-Graber, J. L., & Blei, D. M. (2009). Reading tea leaves: How humans interpret topic models. Advances in Neural Information Processing Systems, Vancouver, Canada, 288–296. https://papers.nips.cc/paper/3700-reading-tea-leaves-how-humans-interpret-topic-models
- Chmielewski, M., & Kucker, S. C. (2019). An MTurk crisis? Shifts in data quality and the impact on study results. Social Psychological and Personality Science, 1948550619875149. http://sci-hub.tw/10.1177/1948550619875149
- Conneau, A., Lample, G., Ranzato, M., Denoyer, L., & Jégou, H. (2017). Word translation without parallel data. arXiv Preprint arXiv:1710.04087. https://arxiv.org/abs/1710.04087
- De Vries, E., Schoonvelde, M., & Schumacher, G. (2018). No longer lost in translation: Evidence that Google Translate works for comparative bag-of-words text applications. Political Analysis, 26(4), 417–430. http://sci-hub.tw/10.1017/pan.2018.26
- Devlin, J., Chang, M. W., Lee, K., & Toutanova, K. (2018). Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv Preprint arXiv:1810.04805. https://arxiv.org/abs/1810.04805
- Eshima, S., Imai, K., & Sasaki, T. (2020). Keyword assisted topic models. arXiv Preprint arXiv, 2004, 05964. https://arxiv.org/abs/2004.05964
- Firth, J. R. (1957). A synopsis of linguistic theory, 1930–1955. Studies in Linguistic Analysis. Longman.
- Fung, I. C.-H., Zeng, J., Chan, C.-H., Liang, H., Yin, J., Liu, Z., Tse, Z. T. H., & Fu, K.-W. (2018). Twitter and Middle East respiratory syndrome, South Korea, 2015: A multi-lingual study. Infection, Disease & Health, 23(1), 10–16. http://sci-hub.tw/10.1016/j.idh.2017.08.005
- Grimmer, J., & Stewart, B. M. (2013). Text as data: The promise and pitfalls of automatic content analysis methods for political texts. Political Analysis, 21(3), 267–297. http://sci-hub.tw/10.1093/pan/mps028
- Hatzivassiloglou, V., Klavans, J. L., & Eskin, E. (1999). Detecting text similarity over short passages: Exploring linguistic feature combinations via machine learning. In 1999 Joint SIGDAT conference on empirical methods in natural language processing and very large corpora, College Park, MD, USA. https://www.aclweb.org/anthology/W99-0625
- Jacobi, C., van Atteveldt, W., & Welbers, K. (2016). Quantitative analysis of large amounts of journalistic texts using topic modelling. Digital Journalism, 4(1), 89–106. http://sci-hub.tw/10.1080/21670811.2015.1093271
- Joulin, A., Bojanowski, P., Mikolov, T., Jégou, H., & Grave, E. (2018). Loss in translation: Learning bilingual word mapping with a retrieval criterion. arXiv Preprint arXiv:1804.07745. https://arxiv.org/abs/1804.07745
- Katki, H. A., Li, Y., Edelstein, D. W., & Castle, P. E. (2012). Estimating the agreement and diagnostic accuracy of two diagnostic tests when one test is conducted on only a subsample of specimens. Statistics in Medicine, 31(5), 436–448. http://sci-hub.tw/10.1002/sim.4422
- Koltsova, O., & Koltcov, S. (2013). Mapping the public agenda with topic modeling: The case of the Russian livejournal. Policy & Internet, 5(2), 207–227. http://sci-hub.tw/10.1002/1944-2866.POI331
- Lebret, R., Iovleff, S., Langrognet, F., Biernacki, C., Celeux, G., & Govaert, G. (2015). Rmixmod: The R package of the model-based unsupervised, supervised and semi-supervised classification mixmod library. https://hal.archives-ouvertes.fr/hal-00919486/document
- Lind, F., Eberl, J.-M., Heidenreich, T., & Boomgaarden, H. G. (2019b). When the journey is as important as the goal: A roadmap to multilingual dictionary construction. International Journal of Communication, 13(1), 21. https://ijoc.org/index.php/ijoc/article/view/10578
- Lind, F., Eisele, O., Heidenreich, T., Galyga, S., Eberl, J.-M., & Boomgaarden, H. G. (2019a). A bridge over the language gap—Employing topic modelling for text analyses across languages for country comparative research. Presented at the POLTEXT Conference, Tokyo.
- Livingstone, S. (2003). On the challenges of cross-national comparative media research. European Journal of Communication, 18(4), 477–500. http://sci-hub.tw/10.1177/0267323103184003
- Lucas, C., Nielsen, R. A., Roberts, M. E., Stewart, B. M., Storer, A., & Tingley, D. (2015). Computer-assisted text analysis for comparative politics. Political Analysis, 23(2), 254–277. http://sci-hub.tw/10.1093/pan/mpu019
- Maier, D., Waldherr, A., Miltner, P., Wiedemann, G., Niekler, A., Keinert, A., Pfetsch, B., Heyer, G., Reber, U., Häussler, T., Schmid-Petri, H., & Adam, S. (2018). Applying LDA topic modeling in communication research: Toward a valid and reliable methodology. Communication Methods and Measures, 12(2–3), 93–118. http://sci-hub.tw/10.1080/19312458.2018.1430754
- Mikolov, T., Le, Q. V., & Sutskever, I. (2013a). Exploiting similarities among languages for machine translation. ArXiv Preprint ArXiv:1309.4168. http://arxiv.org/abs/1309.4168
- Mikolov, T., Sutskever, I., Chen, K., Corrado, G. S., & Dean, J. (2013b). Distributed representations of words and phrases and their compositionality. Advances in Neural Information Processing Systems, 3111–3119. https://arxiv.org/abs/1310.4546
- Mimno, D., Wallach, H. M., Naradowsky, J., Smith, D. A., & McCallum, A. (2009). Polylingual topic models. In Proceedings of the 2009 Conference on empirical methods in natural language processing: (Vol. 2, pp. 880–889). Association for Computational Linguistics.
- Pires, T., Schlinger, E., & Garrette, D. (2019). How multilingual is Multilingual BERT? arXiv Preprint arXiv:1906.01502. https://arxiv.org/pdf/1906.01502.pdf
- Proksch, S.-O., Lowe, W., Wäckerle, J., & Soroka, S. (2019). Multilingual sentiment analysis: A new approach to measuring conflict in legislative speeches. Legislative Studies Quarterly, 44(1), 97–131. http://sci-hub.tw/10.1111/lsq.12218
- Pruss, D., Fujinuma, Y., Daughton, A. R., Paul, M. J., Arnot, B., Szafir, D. A., & Boyd-Graber, J. (2019). Zika discourse in the Americas: A multilingual topic analysis of Twitter. PloS One, 14(5), e0216922. https://doi.org/https://10.1371/journal.pone.0216922
- Reber, U. (2019). Overcoming language barriers: Assessing the potential of machine translation and topic modeling for the comparative analysis of multilingual text corpora. Communication Methods and Measures, 13(2), 102–125. http://sci-hub.tw/10.1080/19312458.2018.1555798
- Roberts, M. E., Stewart, B. M., & Tingley, D. (2014). stm: R package for structural topic models. Journal of Statistical Software, 10(2), 1–40. http://sci-hub.tw/10.18637/jss.v091.i02
- Rudkowsky, E., Haselmayer, M., Wastian, M., Jenny, M., Emrich, Š., & Sedlmair, M. (2018). More than bags of words: Sentiment analysis with word embeddings. Communication Methods and Measures, 12(2–3), 140–157. http://sci-hub.tw/10.1080/19312458.2018.1455817
- Sandve, G. K., Nekrutenko, A., Taylor, J., & Hovig, E. (2013). Ten simple rules for reproducible computational research. PLoS Computational Biology, 9(10), e1003285. http://sci-hub.tw/10.1371/journal.pcbi.1003285
- Shireman, E., Steinley, D., & Brusco, M. J. (2017). Examining the effect of initialization strategies on the performance of Gaussian mixture modeling. Behavior Research Methods, 49(1), 282–293. http://sci-hub.tw/10.3758/s13428-015-0697-6
- Snow, R., O’Connor, B., Jurafsky, D., & Ng, A. Y. (2008). Cheap and fast—But is it good?: Evaluating non-expert annotations for natural language tasks. In Proceedings of the conference on empirical methods in natural language processing (pp. 254–263). Association for Computational Linguistics.
- Van Atteveldt, W., Althaus, S., & Wessler, H. (2020). The trouble with sharing your privates: Pursuing ethical open science and collaborative research across national jurisdictions using sensitive data. Political Communication, 1–7. http://sci-hub.tw/10.1080/10584609.2020.1744780
- Van Atteveldt, W., Van, Strycharz, J., Trilling, D., & Welbers, K. (2019). Toward open computational communication science: A practical road map for reusable data and code. International Journal of Communication, 13(5), 20. https://ijoi.org/index.php/ijoc/article/view/10631
- Watanabe, K., & Zhou, Y. (2020). Theory-driven analysis of large corpora: Semisupervised topic classification of the UN speeches. Social Science Computer Review, 0894439320907027. http://sci-hub.tw/10.1177/0894439320907027
- Wittgenstein, L. (1953). Philosophical investigations. John Wiley & Sons.
- Xie, P., & Xing, E. P. (2013). Integrating document clustering and topic modeling. ArXiv Preprint ArXiv:1309.6874. https://arxiv.org/abs/1309.6874
- Yan, X., Guo, J., Lan, Y., & Cheng, X. (2013). A biterm topic model for short texts. In Proceedings of the 22nd international conference on World Wide Web, Rio de Janerio, Brazil (pp. 1445–1456). Association for Computing Machinery.
- Zhang, D., Mei, Q., & Zhai, C. (2010, July). Cross-lingual latent topic extraction. In Proceedings of the 48th annual meeting of the association for computational linguistics (pp. 1128–1137). Association for Computational Linguistics. http://sci-hub.tw/10.5555/1858681.1858796