A computational social science perspective on qualitative data exploration: Using topic models for the descriptive analysis of social media data

Pages 54-86 | Received 03 Jul 2018, Accepted 05 May 2019, Published online: 08 Jun 2019
