Research Article

Building Better Machine Learning Models for Rhetorical Analyses: The Use of Rhetorical Feature Sets for Training Artificial Neural Network Models

Pages 63-78 | Published online: 23 May 2022
ABSTRACT

In this paper, we investigate two approaches to building artificial neural network models to compare their effectiveness for accurately classifying rhetorical structures across multiple (non-binary) classes in small textual datasets. We find that the most accurate type of model can be designed by using a custom rhetorical feature list coupled with general-language word vector representations, which outperforms models with more computing-intensive architectures.

Acknowledgment

The authors would like to thank the reviewers and the editor for valuable critique, guidance, and encouragement, the managing editor for the helpful and insightful edit of our manuscript, and the Center for Computationally Assisted Science and Technology (CCAST) at North Dakota State University for providing computing resources and support.

Disclosure statement

No potential conflict of interest was reported by the author(s).

Notes

1. Many resources exist for learning more about word embeddings. Latysheva (Citation2019) provides a brief introduction to the topic, while Karani (Citation2018) and Sarwan (Citation2017) provide more detailed ones.
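The core idea behind word embeddings can be illustrated in a few lines. The sketch below uses toy four-dimensional vectors with made-up values (real models such as GloVe or word2vec learn vectors of 100 or more dimensions from co-occurrence statistics); it shows only the basic property that semantically related words sit closer together in the vector space.

```python
import numpy as np

# Toy 4-dimensional word embeddings (illustrative values, not trained vectors).
embeddings = {
    "ethos":  np.array([0.9, 0.1, 0.3, 0.0]),
    "logos":  np.array([0.8, 0.2, 0.4, 0.1]),
    "banana": np.array([0.0, 0.9, 0.1, 0.8]),
}

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: values near 1.0 mean similar direction."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Related words should score higher than unrelated ones.
sim_related = cosine_similarity(embeddings["ethos"], embeddings["logos"])
sim_unrelated = cosine_similarity(embeddings["ethos"], embeddings["banana"])
print(sim_related > sim_unrelated)  # → True
```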

2. Rhetorical features can be understood as any language structure that exerts a rhetorical effect within a given rhetorical ecosystem. Broadly, such features can be determined through rhetorical analysis based on close reading of select texts, or quantitatively through tools like DocuScope’s generic dictionary of rhetorical features (Kaufer, Ishizaki, Butler, & Collins, Citation2004) or DICTION’s semantic feature/sub-feature sets (refer to Hart, Citation2001). Later in this article, we offer one approach to compiling a rhetorical feature list, using a set of specific steps.
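One simple quantitative way to operationalize a rhetorical feature list is to count occurrences of each feature phrase in a text. The sketch below assumes a hypothetical mini list of hedging markers; a real list (compiled through close reading, or drawn from DocuScope/DICTION-style categories) would be far larger and organized by rhetorical function.

```python
import re
from collections import Counter

# Hypothetical mini feature list for illustration only; a real rhetorical
# feature list would contain many more entries grouped by rhetorical effect.
hedging_features = ["might", "perhaps", "suggests", "appears to"]

def count_features(text, features):
    """Count case-insensitive occurrences of each feature phrase in the text."""
    text = text.lower()
    return Counter({f: len(re.findall(re.escape(f), text)) for f in features})

sample = "The data suggests warming might accelerate; perhaps models underestimate it."
counts = count_features(sample, hedging_features)
print(counts["might"], counts["perhaps"], counts["suggests"])  # → 1 1 1
```

Counts like these can then serve as input features alongside, or instead of, raw text for a classifier.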

3. In concept, if not necessarily in practice. It is best to think of ANNs as inspired by neuroscientific knowledge of the brain but not trying to perfectly mimic the functions of actual brains. Refer also to the discussion on backpropagation in Ananthaswamy (Citation2021).

4. Unsupervised ML algorithms, like topic models or one-shot learning algorithms, do not use pre-labeled datasets; this study focuses on supervised models.

5. A detailed description of each category is included in the dataset’s repository, specifically, under the coding manual (https://kilthub.cmu.edu/articles/dataset/E-thos_Project_Climate_Change/12964481/1?file=24696185).

6. For more detail on nuances such as the activation functions or output dimensions we used, our code can be accessed at https://osf.io/6sbcq/. Parts of the BERT code were built from https://www.kdnuggets.com/2020/02/intent-recognition-bert-keras-tensorflow.html; the non-BERT code benefited from the documentation at https://keras.io/examples/nlp/pretrained_word_embeddings/.

7. For two excellent illustrations of how BERT works, refer to Alammar (Citation2018) and Vig (Citation2019).

8. Our columnar distinctions between categories merely serve to make the list of seed words more human-readable; from the ANN’s perspective, they appear as one long list without the superimposed class distinctions. We used our training in rhetorical analysis to identify what we judged to be rhetorical signals for these classes of expertise appeal, but we cannot know whether the ANN algorithms used these features in the same ways, or which patterns emerging from them were most predictive of the class distinctions made by the ANNs.

9. We also used BERT in Set 3 with the custom feature list; predictably, those models performed poorly and occasionally could not differentiate between different classes at all.

10. Their feature selection differed significantly from ours, however, as they used n-grams rather than custom feature sets. Their data – technical manuals – targeted a different, more regulated and more structured type of TPC than our data, likely favoring a syntactically driven feature selection process over our semantically and rhetorically driven one. Running our model with an approach similar to theirs produced average results.

11. We concatenated our 100-dimensional GloVe vector representations with the syntactic features, effectively adding syntactic dimensions to the word vector representation.
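The concatenation described in this note can be sketched directly in NumPy. The values below are stand-ins (random numbers for the 100-dimensional GloVe vector, hypothetical flags for the syntactic features); the point is only the resulting shape of the combined representation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for a pretrained 100-dimensional GloVe word vector.
glove_vector = rng.random(100)

# Hypothetical syntactic feature vector (e.g., POS-tag indicator flags).
syntactic_features = np.array([1.0, 0.0, 0.0, 1.0])

# Concatenation appends the syntactic dimensions to the word vector,
# yielding one combined per-token representation for the model input.
combined = np.concatenate([glove_vector, syntactic_features])
print(combined.shape)  # → (104,)
```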

Additional information

Funding

The Center for Computationally Assisted Science and Technology (CCAST) resources at North Dakota State University were made possible in part by NSF MRI Award No. 2019077.

Notes on contributors

Zoltan P. Majdik

Zoltan P. Majdik is an associate professor in the Department of Communication at North Dakota State University in Fargo, ND.

James Wynn

James Wynn is an associate professor of English at Carnegie Mellon University in Pittsburgh, PA.
