Abstract
Since the end of the last century, the automatic processing of paralinguistics has been investigated widely and put into practice in many applications, on wearables, smartphones, and computers. In this contribution, we address ethical awareness for paralinguistic applications, by establishing taxonomies for data representations, system designs for and a typology of applications, and users/test sets and subject areas. These are related to an “ethical grid” consisting of the most relevant ethical cornerstones, based on principlism. The characteristics of and the interdependencies between these taxonomies are described and exemplified. This makes it possible to assess more or less critical “ethical constellations.” To the best of our knowledge, this is the first attempt of its kind.
Disclosure statement
No potential conflict of interest was reported by the author(s).
Notes
1 Throughout this article, boldface denotes all those terms that are displayed in and , and italics indicates other important terms.
2 Note that we are not concerned with implementing ethical theories in machines (Tolmeijer et al., 2021) but with systematically assessing ethical awareness in applications.
3 Our taxonomies are more concrete than those in Chancellor et al. (2019), which could just as well be called “aspects of”.
4 UAR stands for Unweighted Average Recall, i.e., the average of the values (true positives in percent) in the diagonal of a confusion matrix, for n = 2 or more classes.
5 Self-assessment is usually taken as being closer to the ground truth; yet, no type of assessment is a real ground truth, as all of them are filtered.
6 EU Proposal for a Regulation laying down harmonised rules on artificial intelligence (Artificial Intelligence Act); https://digital-strategy.ec.europa.eu/en/library/proposal-regulation-laying-down-harmonised-rules-artificial-intelligence-artificial-intelligence.
8 Different constellations of types of users and types of data recorded, with more generic ones being less prone to violations of privacy, are discussed in Batliner et al. (2020).
9 We use the term application in its broader meaning that is not limited to market-ready “apps” like a smartphone app but also includes use cases in which CP systems are employed.
11 Proposal for a Regulation laying down harmonised rules on artificial intelligence (Artificial Intelligence Act); https://digital-strategy.ec.europa.eu/en/library/proposal-regulation-laying-down-harmonised-rules-artificial-intelligence-artificial-intelligence.
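The definition of UAR in note 4 can be illustrated with a minimal sketch. It assumes a confusion matrix given as a list of rows, where each row holds the counts for one reference class and each column one predicted class; the function name and the two-class counts are hypothetical and chosen only for illustration.

```python
def unweighted_average_recall(confusion):
    """Return the mean of the per-class recalls.

    Assumption: confusion[i][j] is the number of items whose reference
    class is i and whose predicted class is j.
    """
    recalls = []
    for i, row in enumerate(confusion):
        total = sum(row)
        if total == 0:
            raise ValueError(f"class {i} has no reference items")
        # Recall of class i: diagonal entry divided by the row sum.
        recalls.append(row[i] / total)
    # Every class counts equally, regardless of how many items it has.
    return sum(recalls) / len(recalls)


# Hypothetical, imbalanced two-class example (100 vs. 10 items).
confusion = [
    [90, 10],  # reference class 0
    [5, 5],    # reference class 1
]
uar = unweighted_average_recall(confusion)
accuracy = (90 + 5) / 110
print(f"UAR = {uar:.3f}, accuracy = {accuracy:.3f}")
```

In this example, the per-class recalls are 0.9 and 0.5, so the UAR is 0.7, whereas the overall accuracy is higher because it is dominated by the larger class.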
Additional information
Funding
Notes on contributors
Anton Batliner
Anton Batliner received his doctoral degree in Phonetics in 1978 from LMU Munich. He is now with the Chair of Embedded Intelligence for Health Care and Wellbeing, University of Augsburg, Germany. His main research interests are all (cross-linguistic) aspects of prosody and (computational) paralinguistics.
Michael Neumann
Michael Neumann is a research scientist at Modality.AI. He received his doctoral degree in 2022 from the Institute for Natural Language Processing, University of Stuttgart, Germany. His expertise and main research interests center around computational paralinguistics and machine learning for speech and language processing.
Felix Burkhardt
Felix Burkhardt does teaching, consulting, research, and development with respect to emotional and speech-based human-machine interfaces. Originally an expert in Speech Synthesis at the Technical University of Berlin, he worked for 18 years with Deutsche Telekom. Since 2018, he has been the research director at audEERING GmbH.
Alice Baird
Alice Baird has interdisciplinary expertise in machine learning, computational paralinguistics, stress, and emotional well-being. She completed her PhD at the University of Augsburg’s Chair of Embedded Intelligence for Health Care and Wellbeing in 2021 and recently joined Hume AI as an AI research scientist.
Sarina Meyer
Sarina Meyer is currently pursuing a PhD degree at the Institute of Natural Language Processing at Stuttgart University, Germany. Her research interests focus on privacy and explainability in speech processing.
Ngoc Thang Vu
Ngoc Thang Vu is a Full Professor of Digital Phonetics and Endowed Professor of the Carl-Zeiss-Foundation at the Institute of Natural Language Processing, University of Stuttgart. He received his diploma and doctoral degree in computer science from KIT in Karlsruhe. His main research interests are speech processing, dialogue systems, and machine learning.
Björn W. Schuller
Björn W. Schuller received his diploma, doctoral degree, habilitation, and the title of Adjunct Teaching Professor from TUM in Munich/Germany. He is Full Professor of Artificial Intelligence at Imperial College London/UK, Full Professor and Chair of Embedded Intelligence for Health Care and Wellbeing at the University of Augsburg/Germany, and co-founding CEO/CSO of audEERING.