ORIGINAL ARTICLE

Does the acoustic waveform mirror the voice?

Pages 100-107 | Published online: 11 Jul 2009
 

Abstract

Over recent decades, much effort has been invested in the search for acoustic correlates of vocal function and dysfunction. The convenience of non-invasive voice measurements has been a major incentive for this effort. The acoustic signal is a rich but also very diversified source of information. Computer literacy and technical curiosity in the voice care and voice performance communities are now higher than ever, and tools for voice analysis are proliferating. On such a busy scene, it may be useful to review some basic principles governing what we can and cannot hope to determine from non-invasive acoustic analysis. One way of doing this is to consider communication by voice as though it were engineered, with layered protocols. This results in a scheme for systematizing the many sources of variation present in the acoustic signal, a scheme that can complement other strategies for extracting information.

Notes

1. Voice timbre usually has an expressive rather than a semantic function. A few languages employ timbral attributes such as press or breathiness to convey meaning. This complication can be negotiated by declaring that, to the extent that timbre affects the semantics, it is an attribute of phonemes.

2. At this level, we do not separate the phonemes according to their phonetic subclasses or their acoustic features; they are all viewed as symbols in the script of the semantic stream.

3. A single value, if we are dealing with monophonic signals. Spatial sound information would require more than one acoustic channel, but this seems to be of little relevance, so long as a voice can be approximated by a point source.

4. The word ‘level’ is uncomfortably overloaded, in that it has precise but different meanings in acoustics, in factor experiments, in physiology, and so on. In scientific papers on voice, several of these meanings are known to have occurred in the same paragraph; the reader is herewith cautioned. In acoustics and in telecommunications, a level is the logarithm of a ratio of an observed power to a reference power. The level is expressed in decibels; see any textbook for details. The word ‘intensity’, too, has a technical meaning in acoustics (power per unit area: watts per square meter), but unfortunately this meaning is rarely upheld in the voice and speech literature. The word intensity has variously been used as synonymous with SPL, or loudness, or vocal effort, which are all different things.

5. I write the short-term average here, because while most methods do not localize the precise moments of glottal excitation, the sum of period times obtained over a running time window is usually quite accurate.
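
Notes 4 and 5 both describe small computations, which the following minimal Python sketch tries to make concrete. It is an illustration under stated assumptions, not the author's method. The function names are hypothetical. The 20 micropascal reference pressure is the usual convention for sound in air and is not stated in the notes. The sketch also assumes that the quantity averaged in Note 5 is the glottal period, and the 200 Hz example with its timing errors is invented illustration data, not a measurement.

```python
import math

# Conventional reference sound pressure in air (assumption: not given in the notes).
P_REF_PA = 20e-6


def level_db(power, ref_power):
    """Level in decibels: ten times the base-10 log of a power ratio (Note 4)."""
    return 10.0 * math.log10(power / ref_power)


def spl_db(rms_pressure_pa, ref_pressure_pa=P_REF_PA):
    """Sound pressure level in dB.

    Power is proportional to pressure squared, so the factor becomes 20.
    """
    return 20.0 * math.log10(rms_pressure_pa / ref_pressure_pa)


def periods_from_instants(instants):
    """Period times as differences between successive excitation instants."""
    return [later - earlier for earlier, later in zip(instants, instants[1:])]


def mean_period(periods):
    """Short-term average period: sum of period times over their count (Note 5)."""
    return sum(periods) / len(periods)


# Hypothetical example: a steady 200 Hz voice (5 ms period), 10 periods.
TRUE_PERIOD_S = 0.005
true_instants = [k * TRUE_PERIOD_S for k in range(11)]

# Invented localization errors (seconds) for each detected excitation instant.
timing_errors = [0.0003, -0.0002, 0.0004, -0.0003, 0.0001, -0.0004,
                 0.0002, 0.0003, -0.0001, -0.0002, 0.0001]
detected_instants = [t + e for t, e in zip(true_instants, timing_errors)]

periods = periods_from_instants(detected_instants)

# Each single period inherits the errors of both of its boundaries...
worst_single_error = max(abs(p - TRUE_PERIOD_S) for p in periods)
# ...but in the sum the interior errors cancel, leaving only the two end points.
mean_error = abs(mean_period(periods) - TRUE_PERIOD_S)

print(f"worst single-period error: {worst_single_error * 1e3:.3f} ms")
print(f"error of the window mean:  {mean_error * 1e3:.3f} ms")
```

The second half shows why a running-window average can be trusted even when individual excitation instants are poorly localized: the sum of the period times depends only on the first and last instants in the window, so the error of the mean shrinks as the window grows. In the first half, `level_db` applies to power-like quantities such as intensity, while `spl_db` takes a pressure; keeping them as separate functions mirrors the distinction between level, intensity and SPL that Note 4 draws.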
