Predicting phonetic transcription agreement: Insights from research in infant vocalizations

Pages 793-831 | Received 20 Apr 2007, Accepted 11 Jun 2007, Published online: 09 Jul 2009
 

Abstract

The purpose of this study is to provide new perspectives on correlates of phonetic transcription agreement. Our research focuses on phonetic transcription and coding of infant vocalizations. The findings are presumed to be broadly applicable to other difficult cases of transcription, such as found in severe disorders of speech, which similarly result in low reliability for a variety of reasons. We evaluated the predictiveness of two factors not previously documented in the literature as influencing transcription agreement: canonicity and coder confidence. Transcribers coded samples of infant vocalizations, judging both canonicity and confidence. Correlation results showed that canonicity and confidence were strongly related to agreement levels, and regression results showed that canonicity and confidence both contributed significantly to explanation of variance. Specifically, the results suggest that canonicity plays a major role in transcription agreement when utterances involve supraglottal articulation, with coder confidence offering additional power in predicting transcription agreement.

Notes

1. Throughout this paper the terms “transcription reliability” and “transcription agreement” are used interchangeably, although “agreement” is, in principle, a type of “reliability”. Cucchiarini (1996) clarifies the distinction.

2. The criterion for judgement of canonicity in infant vocalizations has always been (and to the present remains) primarily auditory. The methods section provides a description of the auditory judgement procedure that has been used in the second author's laboratories for years. A primary reason that we continue to focus on auditory rather than instrumental acoustic judgements in this research is that the definition of canonical syllables in acoustic terms is not yet fully established. We here offer a brief summary of the acoustic status of the definition.

 A criterion duration for formant transitions has been nominally established based upon acoustic examination of relatively measurable formant transitions in auditorily judged canonical and non‐canonical syllables from infants, as summarized in Oller (2000a). Measurement of formant transitions in infant vocalizations can be extremely difficult, especially because high pitch of many infant syllables produces harmonics that are very widely spread. The nominal criterion based on relatively measurable transitions is 120 ms (usually focusing on F2) as a maximum for canonical syllables. This value is primarily based on infant syllables where both F1 and F2 have been reliably visible in spectrographic displays with at least 600 Hz analysis bandwidth, and where F1 and F2 vary from a consonantal locus to a nuclear (vowel) locus and then reverse slope. The end of the formant transitions can thus be referenced to a steady state or a reversal of slope.

 However, beyond the nominal durational criterion, it is clear that to attain a generally applicable acoustic definition of canonical syllables, additional specification is needed to account for differing types of syllables and differing utterance‐level patterns. For example, syllables with nasal or aspirated consonants often show extremely short formant transitions in acoustic displays, and we are investigating the utility of amplitude rise time as a possible substitute for or supplement to formant transition duration as a criterion for canonicity in such cases. Also, at slow speaking rates the maximum transition duration may need to be higher than at rapid rates.

 A transition slope criterion is also obviously required (because if slope is too low, change in formant frequency would be heard as no change). The slope data on dysarthric patients from Kent et al. (1989) focused on circumstances where F2 appeared to be a useful focus for determining a criterion for intelligible syllables: If average F2 transition slopes were lower than 2.5 Hz ms−1, speakers proved to be highly unintelligible. However, there are of course canonical syllable types where F2 slope is inherently low, i.e. if F2 locus for a consonant is near the F2 target for its adjacent vowel. So slopes of other formants (F1 and/or F3) may need to be referenced to determine canonicity in such cases. Further, the slope criteria for canonical transitions suggested by the adult F2 data would presumably need to be normalized for infant formant values which are known to vary widely from those of adults. As research proceeds towards the development of a more elaborate and finely tuned acoustic definition of the notion canonical syllable, it will have to make reference to a wide variety of acoustic facts, but the success of the approach will always need to be referenced to auditory judgements of real listeners about well‐formedness. In the meantime, auditory judgements remain at centre stage in the judgement of canonicity.
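
 As an illustration only, the two nominal figures mentioned in this note (a 120 ms maximum F2 transition duration and a 2.5 Hz ms−1 slope floor) can be combined into a simple screening check. The sketch below is not a procedure used in the study: the `Transition` container and the function name are hypothetical, the slope figure comes from adult dysarthria data and is not normalized for infant formants, and treating either value as a hard threshold is an assumption. Auditory judgement remains the actual criterion.

```python
from dataclasses import dataclass

# Nominal values quoted in the note; using them as hard cut-offs is an assumption.
MAX_TRANSITION_MS = 120.0    # nominal maximum F2 transition duration
MIN_SLOPE_HZ_PER_MS = 2.5    # adult intelligibility figure, not infant-normalized


@dataclass
class Transition:
    """Hypothetical container for one measured F2 transition."""
    onset_hz: float
    offset_hz: float
    duration_ms: float

    def slope_hz_per_ms(self) -> float:
        # Absolute change in F2 per millisecond of transition.
        return abs(self.offset_hz - self.onset_hz) / self.duration_ms


def meets_nominal_criteria(t: Transition) -> bool:
    """Return True if the transition is short enough and steep enough."""
    if t.duration_ms <= 0:
        raise ValueError("duration_ms must be positive")
    short_enough = t.duration_ms <= MAX_TRANSITION_MS
    steep_enough = t.slope_hz_per_ms() >= MIN_SLOPE_HZ_PER_MS
    return short_enough and steep_enough


# Invented measurements, for illustration only.
fast = Transition(onset_hz=1200.0, offset_hz=1800.0, duration_ms=80.0)
slow = Transition(onset_hz=1200.0, offset_hz=1800.0, duration_ms=300.0)
print(meets_nominal_criteria(fast), meets_nominal_criteria(slow))
```

 A check of this kind would misclassify the cases the note singles out, such as syllables with nasal or aspirated consonants and syllables whose F2 locus lies near the vowel target, which is why further acoustic specification is needed.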

3. There was, in fact, wide variation in transcriber criteria for the assessment of canonicity. The mean utterance canonicity value for the 8 transcribers ranged from .90 to .25. This range could presumably have been limited by training to specific criteria of canonicity judgement based on work with many exemplars of infant utterances. However, it was our goal to assess intuitive responses both in terms of canonicity and confidence judgements. Hence, coders made their own decisions about how best to interpret the canonicity definition after it was provided along with a few example utterances during the training period. In the future we hope to conduct research to compare reliability for canonicity judgements in three circumstances: (a) as in the present work, with minimal criterion setting through training, (b) with much more rigorous training to limit the variation in criteria among coders, and (c) with purely instrumental acoustic canonicity judgements. Approach (c) will only be possible to implement after further specification of acoustic criteria for canonicity (see note 2).

4. The utterance canonicity judgements for this t‐test analysis were not identical to the ones utilized in the correlational analyses. For the t‐test analysis we sought to indicate lack of canonicity in terms of articulatory transitions that auditorily seemed particularly disruptive to the rhythmic structure of whole utterances. In contrast the judgements for the correlational analyses were made at the level of the segment with no particular attention to the utterance as a whole.

5. In a separate descriptive analysis utilizing the segment‐by‐segment judgements from the correlational analyses, we split the distribution by utterances with canonicity values above the mean and those with values below the mean. Results were similar to those reported in the main text for utterances categorized as canonical or non‐canonical. In the split plot analysis, utterances with canonicity values above the mean had a similar and slightly larger advantage in transcription agreement over utterances with canonicity values below the mean (.65 vs .45, respectively).
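
The mean split described in this note can be sketched as follows. This is a minimal illustration, not the analysis code used in the study: the function name is hypothetical, the input pairs are invented, and the handling of utterances whose canonicity equals the mean (they are dropped here) is an assumption, since the note does not say how such ties were treated.

```python
from statistics import mean


def mean_split_agreement(utterances):
    """Split (canonicity, agreement) pairs at the mean canonicity value.

    Returns the mean agreement for utterances above the canonicity mean
    and for those below it. Utterances exactly at the mean are dropped
    (an assumption). Raises statistics.StatisticsError if either group
    is empty.
    """
    cutoff = mean(c for c, _ in utterances)
    above = [a for c, a in utterances if c > cutoff]
    below = [a for c, a in utterances if c < cutoff]
    return mean(above), mean(below)


# Invented values, for illustration only; not data from the study.
sample = [(1.0, 0.75), (0.75, 0.5), (0.25, 0.5), (0.0, 0.25)]
high, low = mean_split_agreement(sample)
print(high, low)
```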

6. The 7 comparator coders utilized a variety of phonetic symbols corresponding to sounds not occurring in American English in addition to those listed for the standard coder. Among the standard coder's non‐native symbols were many that occurred multiple times across the comparator coders. The additional non‐native symbols not utilized by the standard coder were used very infrequently across the other coders, the great bulk of them exactly once. To simplify the LIPP analysis program (which would have had to be elaborated greatly to incorporate all the infrequently occurring symbols as non‐natives), we resolved to key the analysis on the standard coder's utilization of non‐native symbols.
