Research Articles

Dictionary-based and machine learning classification approaches: a comparison for tonality and frame detection on Twitter data

Article: 2029217 | Received 05 Apr 2021, Accepted 11 Jan 2022, Published online: 01 Feb 2022

References

  • Albaugh, Q., J. Sevenans, S. Soroka, and P. J. Loewen. 2013. “The Automated Coding of Policy Agendas: A Dictionary-based Approach.” In Proceedings of the Sixth Annual Comparative Agendas Conference, 1–27.
  • Amsler, M. 2020. Using Lexical-Semantic Concepts for Fine-Grained Classification in the Embedding Space. Doctoral dissertation. University of Zurich, Faculty of Arts.
  • Amsler, M., B. Wüest, and G. Schneider. 2016. “Legitimacy of New Forms of Governance in Public Discourse - An Automated Media Content Analysis Approach Driven By Techniques of Computational Linguistics.” In Proceedings of the International Conference on the Advances in Computational Analysis of Political Text (PolText), 1–7.
  • Barberá, P., A. Boydstun, S. Linn, R. McMahon, and J. Nagler. 2021. “Automated Text Classification of News Articles: A Practical Guide.” Political Analysis 29 (1): 19–42. doi:10.1017/pan.2020.8
  • Benoit, K., K. Watanabe, H. Wang, P. Nulty, A. Obeng, S. Müller, and A. Matsuo. 2018. “quanteda: An R Package for the Quantitative Analysis of Textual Data.” Journal of Open Source Software 3 (30): 1–4. doi:10.21105/joss.00774
  • Blei, D. M., A. Y. Ng, and M. I. Jordan. 2003. “Latent Dirichlet Allocation.” Journal of Machine Learning Research 3: 993–1022.
  • Card, D., J. H. Gross, A. Boydstun, and N. A. Smith. 2016. “Analyzing Framing Through the Casts of Characters in the News.” In Proceedings of the Conference on Empirical Methods in Natural Language Processing, 1410–1420.
  • Cortes, C., L. D. Jackel, S. A. Solla, V. Vapnik, and J. S. Denker. 1994. “Learning Curves: Asymptotic Values and Rate of Convergence.” In Proceedings of the Sixth International Conference on Neural Information Processing Systems, 327–334.
  • Devlin, J., M. W. Chang, K. Lee, and K. Toutanova. 2018. “BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding.” arXiv preprint arXiv:1810.04805.
  • DiMaggio, P., M. Nag, and D. Blei. 2013. “Exploiting Affinities Between Topic Modeling and the Sociological Perspective on Culture: Application to Newspaper Coverage of US Government Arts Funding.” Poetics 41 (6): 570–606. doi:10.1016/j.poetic.2013.08.004
  • Ferrín, M. 2018. European Social Survey Round 10 Module Design Teams (QDT) Stage 2 Application. London: Centre for Comparative Social Surveys, City University London. http://www.europeansocialsurvey.org/docs/round10/questionnaire/ESS10_ferrin_proposal.pdf.
  • Fishman, R. M. 2016. “Rethinking Dimensions of Democracy for Empirical Analysis: Authenticity, Quality, Depth, and Consolidation.” Annual Review of Political Science 19: 289–309. doi:10.1146/annurev-polisci-042114-015910
  • García-Marín, J., and A. Calatrava. 2018. “The Use of Supervised Learning Algorithms in Political Communication and Media Studies: Locating Frames in the Press.” Comunicación y Sociedad 31 (3): 175–188.
  • Gilardi, F., T. Gessler, M. Kubli, and S. Müller. 2021. “Social Media and Political Agenda Setting.” Political Communication, 1–22. doi:10.1080/10584609.2021.1910390
  • Gilardi, F., C. R. Shipan, and B. Wüest. 2021. “Policy Diffusion: The Issue-Definition Stage.” American Journal of Political Science 65 (1): 21–35. doi:10.1111/ajps.12521
  • Gilardi, F., and B. Wüest. 2018. “Using Text-as-Data Methods in Comparative Policy Analysis.” In Handbook of Research Methods and Applications in Comparative Policy Analysis, edited by G. Peters and G. Fontaine, 203–217. Northampton: Edward Elgar Publishing.
  • Goffman, E. 1974. Frame Analysis: An Essay on the Organization of Experience. New York, NY: Harper & Row.
  • Grimmer, J., and B. Stewart. 2013. “Text as Data: The Promise and Pitfalls of Automatic Content Analysis Methods for Political Texts.” Political Analysis 21 (3): 267–297. doi:10.1093/pan/mps028
  • Haixiang, G., L. Yijing, J. Shang, G. Mingyun, H. Yuanyue, and G. Bing. 2017. “Learning from Class-Imbalanced Data: Review of Methods and Applications.” Expert Systems with Applications 73: 220–239. doi:10.1016/j.eswa.2016.12.035
  • Hamilton, W. L., K. Clark, J. Leskovec, and D. Jurafsky. 2016. “Inducing Domain Specific Sentiment Lexicons from Unlabeled Corpora.” In Proceedings of the Conference on Empirical Methods in Natural Language Processing, 595–605.
  • Hartmann, J., J. Huppertz, C. Schamp, and M. Heitmann. 2019. “Comparing Automated Text Classification Methods.” International Journal of Research in Marketing 36 (1): 20–38. doi:10.1016/j.ijresmar.2018.09.009
  • Hu, M., and B. Liu. 2004. “Mining and Summarizing Customer Reviews.” In Proceedings of the Tenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 168–177.
  • Iliev, R., M. Dehghani, and E. Sagi. 2015. “Automated Text Analysis in Psychology: Methods, Applications, and Future Developments.” Language and Cognition 7 (2): 265–290. doi:10.1017/langcog.2014.30
  • Japec, L., F. Kreuter, M. Berg, P. Biemer, P. Decker, C. Lampe, J. Lane, C. O’Neil, and A. Usher. 2015. “Big Data in Survey Research: AAPOR Task Force Report.” Public Opinion Quarterly 79 (4): 839–880. doi:10.1093/poq/nfv039
  • Joulin, A., E. Grave, P. Bojanowski, and T. Mikolov. 2017. “Bag of Tricks for Efficient Text Classification.” In Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics, 2, 427–431.
  • Kraft, P. W. 2018. “Measuring Morality in Political Attitude Expression.” Journal of Politics 80 (3): 1028–1033. doi:10.1086/696862
  • Kriesi, H., L. Morlino, P. Magalhães, S. Alonso, and M. Ferrín. 2013. European Social Survey Round 6 Module on Europeans’ Understandings and Evaluations of Democracy – Final Module in Template. London: Centre for Comparative Social Surveys, City University London. https://www.europeansocialsurvey.org/docs/round6/questionnaire/ESS6_final_understandings_and_evaluation_of_democracy_module_template.pdf.
  • Laver, M., K. Benoit, and J. Garry. 2003. “Extracting Policy Positions from Political Texts Using Words as Data.” American Political Science Review 97 (2): 311–331.
  • Loughran, T., and B. McDonald. 2011. “When Is a Liability Not a Liability? Textual Analysis, Dictionaries, and 10-Ks.” The Journal of Finance 66 (1): 35–65. doi:10.1111/j.1540-6261.2010.01625.x
  • Maerz, S. F. 2019. “Simulating Pluralism: The Language of Democracy in Hegemonic Authoritarianism.” Political Research Exchange 1 (1): 1–23. doi:10.1080/2474736X.2019.1605834
  • Mikolov, T., I. Sutskever, K. Chen, G. Corrado, and J. Dean. 2013. “Distributed Representations of Words and Phrases and Their Compositionality.” arXiv preprint arXiv:1310.4546.
  • Miller, G. A. 1995. “WordNet: A Lexical Database for English.” Communications of the ACM 38 (11): 39–41. doi:10.1145/219717.219748
  • Mohammad, S. M., and P. D. Turney. 2013. “NRC Emotion Lexicon.” National Research Council, Canada, 2. http://www.saifmohammad.com/WebDocs/NRCemotionlexicon.pdf.
  • Neuendorf, K. A. 2016. The Content Analysis Guidebook. Thousand Oaks: Sage.
  • Nielsen, F. A. 2011. “AFINN.” Informatics and Mathematical Modelling, Technical University of Denmark. https://bit.ly/2pZzWL4.
  • Pennebaker, J. W., R. L. Boyd, K. Jordan, and K. Blackburn. 2015. “The Development and Psychometric Properties of LIWC2015.” https://repositories.lib.utexas.edu/handle/2152/31333.
  • Riekert, M., M. Riekert, and A. Klein. 2021. “Simple Baseline Machine Learning Text Classifiers for Small Datasets.” SN Computer Science 2 (178): 1–16.
  • Roberts, M. E., B. M. Stewart, D. Tingley, C. Lucas, J. Leder-Luis, S. K. Gadarian, … D. G. Rand. 2014. “Structural Topic Models for Open-Ended Survey Responses.” American Journal of Political Science 58 (4): 1064–1082. doi:10.1111/ajps.12103
  • Rocchio, J. 1971. “Relevance Feedback in Information Retrieval.” In The SMART Retrieval System: Experiments in Automatic Document Processing, edited by S. F. Dierk, 313–323. Englewood Cliffs, NJ: Prentice-Hall Inc.
  • Rooduijn, M., and T. Pauwels. 2011. “Measuring Populism: Comparing Two Methods of Content Analysis.” West European Politics 34 (6): 1272–1283. doi:10.1080/01402382.2011.616665
  • Schober, M. F., J. Pasek, L. Guggenheim, C. Lampe, and F. G. Conrad. 2016. “Social Media Analyses for Social Measurement.” Public Opinion Quarterly 80 (1): 180–211. doi:10.1093/poq/nfv048
  • Schwartz, H. A., and L. H. Ungar. 2015. “Data-Driven Content Analysis of Social Media: A Systematic Overview of Automated Methods.” The ANNALS of the American Academy of Political and Social Science 659 (1): 78–94. doi:10.1177/0002716215569197
  • Shen, D., G. Wang, W. Wang, M. R. Min, Q. Su, Y. Zhang, C. Li, R. Henao, and L. Carin. 2018. “Baseline Needs More Love: On Simple Word-Embedding-Based Models and Associated Pooling Mechanisms.” In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics, 1, 440–450.
  • Stone, P. J., D. C. Dunphy, and M. S. Smith. 1966. The General Inquirer: A Computer Approach to Content Analysis. Cambridge: MIT Press.
  • Vicente, I. S., and X. Saralegi. 2016. “Polarity Lexicon Building: To What Extent Is the Manual Effort Worth?” In Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC’16), 938–942.
  • Wüest, B., M. Amsler, and G. Schneider. 2017. “SIFT–A Language Technology Toolkit to Assess the Print Media Coverage of New Forms of Governance.” Working paper series/NCCR-Democracy, (95).
  • Young, L., and S. Soroka. 2012. “Affective News: The Automated Coding of Sentiment in Political Texts.” Political Communication 29 (2): 205–231. doi:10.1080/10584609.2012.671234
  • Zhang, X., J. Zhao, and Y. LeCun. 2015. “Character-Level Convolutional Networks for Text Classification.” In Advances in Neural Information Processing Systems, 649–657.