ORIGINAL ARTICLE

Does the acoustic waveform mirror the voice?

Pages 100-107 | Published online: 11 Jul 2009
 

Abstract

Over recent decades, much effort has been invested in the search for acoustic correlates of vocal function and dysfunction. The convenience of non-invasive voice measurements has been a major incentive for this effort. The acoustic signal is a rich but also very diverse source of information. Computer literacy and technical curiosity in the voice care and voice performance communities are now higher than ever, and tools for voice analysis are proliferating. In such a crowded field, it may be useful to review some basic principles of what we can and cannot hope to determine from non-invasive acoustic analysis. One way of doing this is to consider communication by voice as though it were engineered, with layered protocols. This results in a scheme for systematizing the many sources of variation present in the acoustic signal, one that can complement other strategies for extracting information.

Notes

1. Voice timbre usually has an expressive rather than a semantic function. A few languages employ timbral attributes such as press or breathiness to convey meaning. This complication can be negotiated by declaring that, to the extent that timbre affects the semantics, it is an attribute of phonemes.

2. At this level, we do not separate the phonemes according to their phonetic subclasses or their acoustic features; they are all viewed as symbols in the script of the semantic stream.

3. A single value, if we are dealing with monophonic signals. Spatial sound information would require more than one acoustic channel, but this seems to be of little relevance, so long as a voice can be approximated by a point source.

4. The word ‘level’ is uncomfortably overloaded, in that it has precise but different meanings in acoustics, in factor experiments, in physiology, and so on. In scientific papers on voice, several of these meanings are known to have occurred in the same paragraph; the reader is herewith cautioned. In acoustics and in telecommunications, a level is the logarithm of the ratio of an observed power to a reference power. The level is expressed in decibels; see any textbook for details. The word ‘intensity’, too, has a technical meaning in acoustics (power per unit area: watts per square meter), but unfortunately this meaning is rarely upheld in the voice and speech literature. The word ‘intensity’ has variously been used as a synonym for sound pressure level (SPL), loudness, or vocal effort, which are all different things.

5. I write the short-term average here, because while most methods do not localize the precise moments of glottal excitation, the sum of period times obtained over a running time window is usually quite accurate.
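
The definitions in Notes 4 and 5 can be illustrated with a minimal Python sketch. It is not taken from the article. The function names, the 20 µPa reference pressure (the conventional reference for sound pressure level in air), and the example excitation times are all assumptions made for illustration. The first two functions compute a level as ten times the base-10 logarithm of a power ratio. The third shows why a short-term average period can be accurate even when individual excitation instants are poorly localized: the sum of the periods in a window equals the last instant minus the first, so localization errors at the interior instants cancel.

```python
import math


def level_db(power, reference_power):
    """Level in decibels: 10 * log10 of observed power over reference power."""
    return 10.0 * math.log10(power / reference_power)


def spl_db(rms_pressure_pa, reference_pa=20e-6):
    """Sound pressure level in dB.

    Power is proportional to pressure squared, hence the factor 20.
    The 20 micropascal default is an assumption (the usual reference in air).
    """
    return 20.0 * math.log10(rms_pressure_pa / reference_pa)


def short_term_mean_period(excitation_times, window_start, window_end):
    """Mean period of the excitation instants that fall inside the window.

    The sum of the individual periods telescopes to (last - first), so
    only the errors at the two end instants affect the average.
    Returns None when fewer than two instants fall inside the window.
    """
    inside = sorted(t for t in excitation_times
                    if window_start <= t <= window_end)
    if len(inside) < 2:
        return None
    return (inside[-1] - inside[0]) / (len(inside) - 1)


# Hypothetical excitation instants in seconds: the true period is 10 ms,
# but the interior instants are mislocated by a few tenths of a millisecond.
instants = [0.0000, 0.0103, 0.0198, 0.0301, 0.0400]
periods = [b - a for a, b in zip(instants, instants[1:])]
mean_period = short_term_mean_period(instants, 0.0, 0.05)
```

Here the individual entries of `periods` deviate from 10 ms by up to about 0.5 ms, while `mean_period` comes out at 10 ms because the end instants happen to be exact in this made-up data. In real recordings the end instants carry errors too, but their effect is divided by the number of periods in the window.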
