When Scale Meets Depth: Integrating Natural Language Processing and Textual Analysis for Studying Digital Corpora

Pages 28-50 | Published online: 24 Mar 2016
 

Abstract

As computer-assisted research of voluminous datasets becomes more pervasive, so does the criticism of its epistemological, methodological, and ethical/normative inadequacies. This article proposes a hybrid approach that combines the scale of computational methods with the depth of qualitative analysis. It uses simple natural language processing algorithms to extract purposive samples from large textual corpora, which can then be analyzed using interpretive techniques. This approach helps research become more theoretically grounded and contextually sensitive—two major failings of typical “Big Data” studies. Simultaneously, it allows qualitative scholars to examine datasets that are otherwise too large to study manually and also bring more rigor to the process of sampling. The method is illustrated with two case studies, one looking at the inaugural addresses of U.S. presidents and the other investigating the news coverage of two shootings at an army camp in Texas.
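As a rough illustration of the sampling step described in the abstract, the following Python sketch filters a small in-memory corpus down to the documents that mention a term of interest, leaving a subset small enough to read closely. It is not the article's own code: the `purposive_sample` helper, the toy corpus, and the search term are hypothetical placeholders, and the standard-library `re` module stands in for whatever NLP tooling a given project uses.

```python
import re

def purposive_sample(corpus, pattern):
    """Return (doc_id, text) pairs whose text matches the pattern, ignoring case."""
    regex = re.compile(pattern, re.IGNORECASE)
    return [(doc_id, text) for doc_id, text in corpus.items() if regex.search(text)]

# Hypothetical toy corpus; a real study would load thousands of documents from disk.
corpus = {
    "doc1": "The nation must defend liberty at home and abroad.",
    "doc2": "Our economy has grown steadily over the past decade.",
    "doc3": "Freedom and liberty are the birthright of every citizen.",
}

# Keep only the documents that mention the whole word "liberty".
sample = purposive_sample(corpus, r"\bliberty\b")
for doc_id, text in sample:
    print(doc_id, text)
```

The documents returned this way would then be handed over to interpretive reading; the selection rule itself (here a single word pattern) is where the researcher's theoretical criteria enter.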

Notes

1 In recent years, panels and sessions focusing on Big Data scholarship and pedagogy have become commonplace at academic conferences in the disciplines of journalism and communication studies, including annual conventions of the International Communication Association (ICA) and the Association for Education in Journalism and Mass Communication (AEJMC). Several journals in these and other social scientific disciplines have published special issues on or related to Big Data research, such as The ANNALS of The American Academy of Political and Social Science (2015), Digital Journalism (2015), International Journal of Communication (2014), and Journal of Broadcasting and Electronic Media (2013).

2 Python is a general-purpose programming language whose code is short, simple, and highly readable. It can be downloaded from www.python.org for a number of operating systems. NLTK, the Natural Language Toolkit, is a Python library that provides ready-made algorithms for working with text. It can be downloaded from www.nltk.org. Bird, Klein & Loper’s (2009) book, Natural Language Processing with Python, is recommended for scholars interested in learning how to use Python with NLTK. It is available online at www.nltk.org/book.

3 “C:\Users\Desktop” is a Windows file path. For Mac users, the corresponding file path takes the form “/Users/<username>/Desktop”.

4 Scholars who want to learn more about concordance may look at Section 1.3 of the first chapter of Natural Language Processing with Python, available online (www.nltk.org/book/ch01.html). The section is called “Searching Text.”

5 Scholars who want to learn more about regular expressions may look at Section 3.4 of the third chapter of Natural Language Processing with Python, available online (http://www.nltk.org/book/ch03.html). The section is called “Regular Expressions for Detecting Word Patterns.”
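The two techniques mentioned in notes 4 and 5 can be combined in a few lines. The sketch below is a minimal keyword-in-context listing built on Python's standard `re` module; it is an illustrative stand-in, not NLTK's implementation (NLTK offers its own `Text.concordance` method, described in the book section cited in note 4). The `keyword_in_context` helper, the sample sentence, and the 20-character window are all assumptions made for the example.

```python
import re

def keyword_in_context(text, pattern, width=20):
    """Return (left, keyword, right) tuples for every match of `pattern`.

    `left` and `right` hold up to `width` characters of surrounding text.
    """
    rows = []
    for match in re.finditer(pattern, text, flags=re.IGNORECASE):
        left = text[max(match.start() - width, 0):match.start()]
        right = text[match.end():match.end() + width]
        rows.append((left, match.group(0), right))
    return rows

# Hypothetical sample text, invented for this example.
text = "A free press informs the public. Citizens value freedom, and they speak freely about it."

# The pattern \bfree\w* matches any word that begins with "free".
rows = keyword_in_context(text, r"\bfree\w*")
for left, keyword, right in rows:
    print(f"{left:>20} [{keyword}] {right}")
```

Reading the matches side by side with their immediate context is what makes a concordance useful for deciding which passages deserve closer interpretive attention.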
