Research Article

Ethical, political and epistemic implications of machine learning (mis)information classification: insights from an interdisciplinary collaboration between social and data scientists

Article: 2222514 | Received 06 Jun 2022, Accepted 05 Jun 2023, Published online: 07 Jul 2023

Figures & data

Figure 1. Steps in the construction of a ML misinformation detection model. (1) Problem definition: design of a strategy based on hypotheses, definitions and theories about how to identify misinformation. (2) Selection of the multimodal inputs and outputs to be included in the classification model. (3) Ground-truthing: the ground truth dataset is used to train the model. A subset of this dataset is reserved for model validation. (4) Model validation: the reserved subset of the ground truth dataset is used to test the model’s performance. Performance metrics accompany the publication of classification models. (5) Model deployment: model outputs inform online content moderation decisions such as banning, downranking or flagging. The dotted arrows represent feedback loops between steps.

Table 1. A summary of cautions and contingencies in the development of ML misinformation classification models and their implications.