ABSTRACT
Supervised machine learning (SML) provides us with tools to efficiently scrutinize large corpora of communication texts. Yet, setting up such a tool involves plenty of decisions, starting with the data needed for training, the selection of an algorithm, and the details of model training. We aim to establish a firm link between communication research tasks and the corresponding state of the art in natural language processing research by systematically comparing the performance of different automatic text analysis approaches. We do this for a challenging task – stance detection of opinions on policy measures to tackle the COVID-19 pandemic in Germany, voiced on Twitter. Our results add evidence that pre-trained language models such as BERT outperform feature-based and other neural network approaches. Yet, the gains one can achieve differ greatly depending on the specifics of pre-training (i.e., the use of different language models). Adding to the robustness of our conclusions, we run a generalizability check with a different use case in terms of language and topic. Additionally, we illustrate how the amount and quality of training data affect model performance, pointing to potential compensation effects. Based on our results, we derive practical recommendations for setting up such SML tools to study communication texts.
Disclosure statement
No potential conflict of interest was reported by the author(s).
Data availability statement
The data supporting this research are not publicly available due to ethical restrictions: Twitter's terms of use do not allow publicly sharing original Tweets, nor did Tweet authors explicitly agree to provide their Tweets for research and to share them publicly outside of Twitter. In the corresponding paper by Beck et al. (Citation2021), we only release the set of identifiers (Tweet IDs) for the texts used in this research project. Thereby, we adhere to the Twitter Developer Policy and give users full control of their privacy and data, as they can delete or privatize Tweets so that they cannot be collected. The code for the model setup in the paper at hand is publicly shared in the GitHub repository: https://github.com/UKPLab/cmm2022-stance-covid19
Notes
1 We use the terms to annotate and to code synonymously, since both refer to the procedure of assigning certain labels to text.
2 Other examples of features are metadata such as text length or features based on sentence structure.
3 This context window can, for example, capture the four or five words surrounding the one of interest. The size of this window is set when training the word embeddings.
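As a minimal sketch of the idea described in this note, the following hypothetical function (the function name, window size, and example sentence are our illustrative assumptions, not part of any specific embedding implementation) collects the words within a fixed-size window around a target word:

```python
def context_window(tokens, index, size=2):
    """Return the words within `size` positions to the left and right
    of the token at `index` (the word of interest)."""
    left = tokens[max(0, index - size):index]
    right = tokens[index + 1:index + 1 + size]
    return left + right

sentence = "the government announced new pandemic measures today".split()
# Context of "new" (index 3) with two words on each side:
print(context_window(sentence, 3))
# → ['government', 'announced', 'pandemic', 'measures']
```

In actual word-embedding training (e.g., word2vec), such window co-occurrences supply the word–context pairs the model learns from.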
5 To take both the left and the right context of a word into account, the model is trained from left to right and from right to left.
6 To overcome technical challenges associated with bidirectional training using the transformer architecture, Devlin et al. (Citation2019) instead propose to randomly mask a certain percentage of the input tokens and then predict those masked tokens.
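The masking step this note describes can be sketched as follows. This is an illustrative simplification under our own assumptions (function name, masking probability, and the `[MASK]` placeholder are chosen for illustration), not Devlin et al.'s actual implementation, which masks subword tokens and uses additional replacement strategies:

```python
import random

def mask_tokens(tokens, mask_prob=0.15, mask_token="[MASK]", seed=0):
    """Randomly replace a fraction of tokens with a mask symbol.
    Returns the masked sequence and a mapping from masked positions
    to the original tokens the model must learn to predict."""
    rng = random.Random(seed)
    masked, targets = [], {}
    for i, tok in enumerate(tokens):
        if rng.random() < mask_prob:
            masked.append(mask_token)
            targets[i] = tok  # position -> original token to predict
        else:
            masked.append(tok)
    return masked, targets

tokens = "we must tackle the pandemic together".split()
masked, targets = mask_tokens(tokens, mask_prob=0.5)
print(masked, targets)
```

Because the model sees the full (partially masked) sequence at once, predicting each masked token can draw on context from both sides, which is the point of this training objective.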
7 The Hugging Face library provides plenty of pre-trained language models for very different areas of application, https://huggingface.co/.
8 We provide the full list of filter terms in Appendix A.
Additional information
Funding
Notes on contributors
Christina Viehmann
Dr. Christina Viehmann ([email protected]) is a postdoctoral researcher at the Department of Communication at the University of Mainz, Germany.
Tilman Beck
Tilman Beck ([email protected]) is a doctoral candidate at the Ubiquitous Knowledge Processing (UKP) Lab as part of the Computer Science Department at the Technical University of Darmstadt.
Marcus Maurer
Prof. Dr. Marcus Maurer ([email protected]) is a full professor at the Department of Communication at the University of Mainz.
Oliver Quiring
Prof. Dr. Oliver Quiring ([email protected]) is a full professor at the Department of Communication at the University of Mainz.
Iryna Gurevych
Prof. Dr. Iryna Gurevych ([email protected]) is a full professor at the Ubiquitous Knowledge Processing (UKP) Lab as part of the Computer Science Department at the Technical University of Darmstadt.