References
- Arnold, T. (2017a). cleanNLP: A tidy data model for natural language processing [Computer software manual] (R package version 1.9.0). Retrieved from https://CRAN.R-project.org/package=cleanNLP
- Arnold, T. (2017b). kerasR: R interface to the keras deep learning library [Computer software manual] (R package version 0.6.1). Retrieved from https://CRAN.R-project.org/package=kerasR
- Arnold, T., & Tilton, L. (2016). coreNLP: Wrappers around Stanford CoreNLP tools [Computer software manual] (R package version 0.4-2). Retrieved from https://CRAN.R-project.org/package=coreNLP
- Aue, A., & Gamon, M. (2005). Customizing sentiment classifiers to new domains: A case study. In Proceedings of Recent Advances in Natural Language Processing (RANLP). Retrieved from http://research.microsoft.com/pubs/65430/new_domain_sentiment.pdf
- Bates, D., & Maechler, M. (2015). Matrix: Sparse and dense matrix classes and methods [Computer software manual] (R package version 1.2-3). Retrieved from https://CRAN.R-project.org/package=Matrix
- Benoit, K., & Matsuo, A. (2017). spacyr: R Wrapper to the spaCy NLP Library [Computer software manual] (R package version 0.9.0). Retrieved from https://CRAN.R-project.org/package=spacyr
- Benoit, K., Watanabe, K., Nulty, P., Obeng, A., Wang, H., Lauderdale, B., & Lowe, W. (2017). quanteda: Quantitative analysis of textual data [Computer software manual] (R package version 0.99). Retrieved from http://quanteda.io
- Blei, D. M., Ng, A. Y., & Jordan, M. I. (2003). Latent Dirichlet allocation. The Journal of Machine Learning Research, 3, 993–1022.
- Bouchet-Valat, M. (2014). SnowballC: Snowball Stemmers based on the C Libstemmer UTF-8 Library [Computer software manual] (R package version 0.5.1). Retrieved from https://CRAN.R-project.org/package=SnowballC
- Boumans, J. W., & Trilling, D. (2016). Taking stock of the toolkit: An overview of relevant automated content analysis approaches and techniques for digital journalism scholars. Digital Journalism, 4(1), 8–23. doi:10.1080/21670811.2015.1096598
- Crone, S. F., Lessmann, S., & Stahlbock, R. (2006). The impact of preprocessing on data mining: An evaluation of classifier sensitivity in direct marketing. European Journal of Operational Research, 173(3), 781–800. doi:10.1016/j.ejor.2005.07.023
- De Smedt, T., & Daelemans, W. (2012). “Vreselijk mooi!” (terribly beautiful): A subjectivity lexicon for Dutch adjectives. In Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC), Istanbul, May 2012, 3568–3572.
- Feinerer, I., & Hornik, K. (2017). tm: Text mining package [Computer software manual] (R package version 0.7-1). Retrieved from https://CRAN.R-project.org/package=tm
- Flesch, R. (1948). A new readability yardstick. Journal of Applied Psychology, 32(3), 221–233. doi:10.1037/h0057532
- Fox, J., & Leanage, A. (2016). R and the Journal of Statistical Software. Journal of Statistical Software, 73(2), 1–13.
- Gagolewski, M. (2017). R package stringi: Character string processing facilities [Computer software manual]. Retrieved from http://www.gagolewski.com/software/stringi/
- Gardner, M. J., Lutes, J., Lund, J., Hansen, J., Walker, D., Ringger, E., & Seppi, K. (2010). The topic browser: An interactive tool for browsing topic models. In NIPS Workshop on Challenges of Data Visualization. Retrieved from http://cseweb.ucsd.edu/~lvdmaaten/workshops/nips2010/papers/gardner.pdf
- Griffiths, T. L., & Steyvers, M. (2004). Finding scientific topics. Proceedings of the National Academy of Sciences, 101(Suppl. 1), 5228–5235. doi:10.1073/pnas.0307752101
- Grimmer, J., & Stewart, B. M. (2013). Text as data: The promise and pitfalls of automatic content analysis methods for political texts. Political Analysis, 21(3), 267–297. doi:10.1093/pan/mps028
- Grün, B., & Hornik, K. (2011). topicmodels: An R package for fitting topic models. Journal of Statistical Software, 40(13), 1–30. doi:10.18637/jss.v040.i13
- Günther, E., & Quandt, T. (2016). Word counts and topic models: Automated text analysis methods for digital journalism research. Digital Journalism, 4(1), 75–88. doi:10.1080/21670811.2015.1093270
- Jurka, T. P., Collingwood, L., Boydstun, A. E., Grossman, E., & Van Atteveldt, W. (2014). RTextTools: Automatic text classification via supervised learning [Computer software manual] (R package version 1.4.2). Retrieved from https://CRAN.R-project.org/package=RTextTools
- Lang, D. T., & the CRAN Team. (2017). XML: Tools for parsing and generating XML within R and S-Plus [Computer software manual]. Retrieved from https://CRAN.R-project.org/package=XML
- Leopold, E., & Kindermann, J. (2002). Text categorization with support vector machines. How to represent texts in input space? Machine Learning, 46(1), 423–444. doi:10.1023/a:1012491419635
- Manning, C. D., Raghavan, P., & Schütze, H. (2008). Introduction to information retrieval. Cambridge, UK: Cambridge University Press. doi:10.1017/cbo9780511809071
- Manning, C. D., Surdeanu, M., Bauer, J., Finkel, J., Bethard, S. J., & McClosky, D. (2014). The Stanford CoreNLP Natural Language Processing Toolkit. In Proceedings of 52nd Annual Meeting of the Association for Computational Linguistics: System Demonstrations (pp. 55–60). doi:10.3115/v1/p14-5010
- McCarthy, P. M., & Jarvis, S. (2010). MTLD, vocd-D, and HD-D: A validation study of sophisticated approaches to lexical diversity assessment. Behavior Research Methods, 42(2), 381–392. doi:10.3758/brm.42.2.381
- McLuhan, M. (1964). Understanding Media: The Extensions of Man. New York: Penguin Press.
- Meyer, D., Hornik, K., & Feinerer, I. (2008). Text mining infrastructure in R. Journal of Statistical Software, 25(5), 1–54. doi:10.18637/jss.v025.i05
- Michalke, M. (2017). koRpus: An R package for text analysis [Computer software manual] (Version 0.10-2). Retrieved from https://reaktanz.de/?c=hacking&s=koRpus
- Mikolov, T., Chen, K., Corrado, G., & Dean, J. (2013). Efficient estimation of word representations in vector space. In Proceedings of the International Conference on Learning Representations, Scottsdale, Arizona, May 2013. arXiv preprint arXiv:1301.3781.
- Mostafa, M. M. (2013). More than words: Social networks’ text mining for consumer brand sentiments. Expert Systems with Applications, 40(10), 4241–4251. doi:10.1016/j.eswa.2013.01.019
- Mullen, L. (2016a). textreuse: Detect text reuse and document similarity [Computer software manual] (R package version 0.1.4). Retrieved from https://CRAN.R-project.org/package=textreuse
- Mullen, L. (2016b). tokenizers: A consistent interface to tokenize natural language text [Computer software manual] (R package version 0.1.4). Retrieved from https://CRAN.R-project.org/package=tokenizers
- Ooms, J. (2014). The jsonlite package: A practical and consistent mapping between JSON data and R objects [Computer software manual]. Retrieved from https://arxiv.org/abs/1403.2805
- Ooms, J. (2017a). antiword: Extract text from Microsoft Word documents [Computer software manual]. Retrieved from https://CRAN.R-project.org/package=antiword
- Ooms, J. (2017b). pdftools: Text extraction, rendering and converting of PDF documents [Computer software manual]. Retrieved from https://CRAN.R-project.org/package=pdftools
- Porter, M. F. (2001). Snowball: A language for stemming algorithms. Retrieved from http://snowball.tartarus.org/texts/introduction.html
- Proksch, S.-O., & Slapin, J. B. (2009). How to avoid pitfalls in statistical analysis of political texts: The case of Germany. German Politics, 18(3), 323–344. doi:10.1080/09644000903055799
- Provost, F., & Fawcett, T. (2013). Data science and its Relationship to Big Data and Data-Driven Decision Making. Big Data, 1(1), 51–59. doi:10.1089/big.2013.1508
- R Core Team. (2017). R: A language and environment for statistical computing [Computer software manual]. Vienna, Austria. Retrieved from https://www.R-project.org/
- Roberts, M. E., Stewart, B. M., Tingley, D., Lucas, C., Leder-Luis, J., Gadarian, S. K., … Rand, D. G. (2014). Structural topic models for open-ended survey responses. American Journal of Political Science, 58(4), 1064–1082. doi:10.1111/ajps.12103
- rOpenSci Text Workshop. (2017). tif: Text interchange format [Computer software manual]. Retrieved from https://github.com/ropensci/tif
- Schuck, A. R., Xezonakis, G., Elenbaas, M., Banducci, S. A., & De Vreese, C. H. (2011). Party contestation and Europe on the news agenda: The 2009 European Parliamentary Elections. Electoral Studies, 30(1), 41–52. doi:10.1016/j.electstud.2010.09.021
- Schultz, F., Kleinnijenhuis, J., Oegema, D., Utz, S., & Van Atteveldt, W. (2012). Strategic framing in the BP crisis: A semantic network analysis of associative frames. Public Relations Review, 38(1), 97–107. doi:10.1016/j.pubrev.2011.08.003
- Selivanov, D. (2016). text2vec: Modern text mining framework for R [Computer software manual] (R package version 0.4.0). Retrieved from https://CRAN.R-project.org/package=text2vec
- Silge, J., & Robinson, D. (2016). tidytext: Text mining and analysis using tidy data principles in R. Journal of Open Source Software, 1, 3. doi:10.21105/joss.00037
- Slapin, J. B., & Proksch, S.-O. (2008). A scaling model for estimating time-series party positions from texts. American Journal of Political Science, 52(3), 705–722. doi:10.1111/j.1540-5907.2008.00338.x
- Taboada, M., Brooke, J., Tofiloski, M., Voll, K., & Stede, M. (2011). Lexicon-based methods for sentiment analysis. Computational Linguistics, 37(2), 267–307. doi:10.1162/coli_a_00049
- Tausczik, Y. R., & Pennebaker, J. W. (2010). The psychological meaning of words: LIWC and computerized text analysis methods. Journal of Language and Social Psychology, 29(1), 24–54. doi:10.1177/0261927X09351676
- TIOBE. (2017). The R programming language. Retrieved from https://www.tiobe.com/tiobe-index/r/
- Van Atteveldt, W. (2008). Semantic Network Analysis: Techniques for Extracting, Representing, and Querying Media Content (Dissertation). Charleston, SC: BookSurge.
- Van Atteveldt, W., Sheafer, T., Shenhav, S. R., & Fogel-Dror, Y. (2017). Clause analysis: Using syntactic information to automatically extract source, subject, and predicate from texts with an application to the 2008–2009 Gaza War. Political Analysis, 25(2), 207–222. doi:10.1017/pan.2016.12
- Vliegenthart, R., Boomgaarden, H. G., & Van Spanje, J. (2012). Anti-immigrant party support and media visibility: A cross-party, over-time perspective. Journal of Elections, Public Opinion & Parties, 22(3), 315–358. doi:10.1080/17457289.2012.693933
- Watanabe, K. (2017). The spread of the Kremlin’s narratives by a western news agency during the Ukraine crisis. The Journal of International Communication, 23(1), 138–158. doi:10.1080/13216597.2017.1287750
- Welbers, K., & Van Atteveldt, W. (2016). corpustools: Tools for managing, querying and analyzing tokenized text [Computer software manual] (R package version 0.201). Retrieved from http://github.com/kasperwelbers/corpustools
- Welbers, K., Van Atteveldt, W., Kleinnijenhuis, J., & Ruigrok, N. (2016). A Gatekeeper among Gatekeepers: News Agency Influence in Print and Online Newspapers in the Netherlands. Journalism Studies, 1–19 (online first). doi:10.1080/1461670x.2016.1190663
- Wickham, H. (2014). Tidy Data. Journal of Statistical Software, 59(10), 1–23. doi:10.18637/jss.v059.i10
- Wickham, H., & Bryan, J. (2017). readxl: Read Excel files [Computer software manual]. Retrieved from https://CRAN.R-project.org/package=readxl
- Wild, F. (2017). CRAN task view: Natural language processing. CRAN. Version: 2017-01-17. Retrieved from https://CRAN.R-project.org/view=NaturalLanguageProcessing
- Yang, Y., & Pedersen, J. O. (1997). A comparative study on feature selection in text categorization. In Proceedings of the Fourteenth International Conference on Machine Learning (ICML) (pp. 412–420), Nashville, TN, July 1997.