Abstract
This article describes a synthesizer of disordered voices and reports a test of the reliability of the Grade, Roughness, and Breathiness scores that eight expert listeners assigned to synthetic stimuli in two sessions. The speech stimuli [a], [i], [u], [ai], and [ia] were synthesized with three values of vocal frequency and with four levels each of vocal jitter and pulsatile additive noise. The agreement and correlation of scores assigned by the same rater in different sessions, or by different raters in the same session, are consistent with published data. Only a small part of the variance of the arithmetic differences between the scores assigned to the same stimulus is explained by the properties of the stimuli. The conclusion is that differences between scores assigned to the same stimulus are not attributable to biases of individual raters; such biases would shift all the scores assigned on a scale, and the shift would be interpretable in terms of the properties of the stimuli.
Declaration of interest: The authors report no conflicts of interest.