Abstract
This paper gives an overview of our speech understanding system (SUS), describes its major components, and explains the underlying ideas.
Our speech understanding system is based on the observation that some phones can be located very accurately in the incoming speech, so an interval bounded by such robust phones should be taken as the unit of matching. Because such intervals have little correspondence to word boundaries, this eliminates the problem of ambiguous word boundaries when searching for the most likely sentence. To realize this idea in our hierarchical SUS, we introduced a new processing level called the partial lattice hypothesis level. Typical word pronunciations and their variants are precompiled in lattice form, while modifications at word boundaries are handled by applying phonological rules as needed.
Another feature of our SUS is an efficient word predictor based on bottom-up parsing.
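The idea of using intervals bounded by robust phones as the unit of matching can be illustrated with a minimal sketch. It is not the system's actual implementation: the `(label, confidence)` representation, the function name `split_at_robust_phones`, and the threshold value are all assumptions made for illustration only.

```python
from typing import List, Tuple

# Hypothetical phone hypothesis: (label, detection confidence in [0, 1]).
Phone = Tuple[str, float]


def split_at_robust_phones(
    phones: List[Phone], threshold: float = 0.9
) -> List[List[Phone]]:
    """Cut a phone sequence into intervals bounded by robust phones.

    A phone counts as "robust" here if its confidence reaches the
    (assumed) threshold. Each interval runs from one robust phone to
    the next, including both, so neighbouring intervals share a
    boundary phone. Phones before the first or after the last robust
    phone belong to no interval in this simplified sketch.
    """
    anchors = [i for i, (_, conf) in enumerate(phones) if conf >= threshold]
    return [phones[a:b + 1] for a, b in zip(anchors, anchors[1:])]


if __name__ == "__main__":
    # Made-up hypotheses; three of the six phones are treated as robust.
    phones = [("s", 0.95), ("a", 0.5), ("k", 0.4),
              ("a", 0.92), ("n", 0.6), ("a", 0.97)]
    for interval in split_at_robust_phones(phones):
        print([label for label, _ in interval])
```

Note that the intervals are defined purely by where robust phones occur, so they need not line up with word boundaries; matching against them sidesteps the question of where one word ends and the next begins.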
Indexing terms: