Lexical Access for Speech Understanding using Minimum Message Length Encoding
This work addresses speech understanding for applications that require accurate word recognition from phoneme input. It builds on existing information-theoretic methods, so its contribution appears incremental.
The paper tackles the Lexical Access Problem, that is, determining word sequences from phoneme input, using a Minimum Message Length encoding approach. It reports results on multiple-speaker, continuous, read speech and describes a heuristic that speeds up recognition without significant loss of accuracy.
The Lexical Access Problem consists of determining the intended sequence of words corresponding to an input sequence of phonemes (basic speech sounds) produced by a low-level phoneme recognizer. In this paper we present an information-theoretic approach to the Lexical Access Problem based on the Minimum Message Length criterion. We model sentences using phoneme realizations seen in training, together with word and part-of-speech information obtained from text corpora. We show results on multiple-speaker, continuous, read speech and discuss a heuristic that uses equivalence classes of similar-sounding words to speed up recognition without significant deterioration in recognition accuracy.
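As a rough illustration of the idea, and not the encoding actually used in the paper, the sketch below treats lexical access as a search for the word sequence with the shortest two-part message: the bits needed to state each word plus the bits needed to state the observed phoneme realization given that word. The toy lexicon, its probabilities, and the names `LEXICON`, `bits`, and `decode` are all hypothetical. Part-of-speech information and the equivalence-class heuristic are omitted for brevity, and exact matching against realizations seen in training is a simplifying assumption.

```python
import math

# Hypothetical toy lexicon: word -> (word probability, {phoneme realization: probability}).
# All values are illustrative assumptions, not taken from the paper.
LEXICON = {
    "the":  (0.4, {("dh", "ah"): 0.7, ("dh", "iy"): 0.3}),
    "cat":  (0.2, {("k", "ae", "t"): 1.0}),
    "cats": (0.1, {("k", "ae", "t", "s"): 1.0}),
    "sat":  (0.2, {("s", "ae", "t"): 1.0}),
    "at":   (0.1, {("ae", "t"): 1.0}),
}


def bits(p):
    """Length in bits of an optimal code word for an event of probability p."""
    return -math.log2(p)


def decode(phonemes):
    """Return (message length in bits, word list) for the shortest encoding.

    Dynamic programming over phoneme positions: best[i] holds the cheapest
    way found so far to explain the first i phonemes as a sequence of words.
    """
    n = len(phonemes)
    best = [(math.inf, [])] * (n + 1)
    best[0] = (0.0, [])
    for end in range(1, n + 1):
        for word, (p_word, realizations) in LEXICON.items():
            for realization, p_real in realizations.items():
                start = end - len(realization)
                if start < 0 or tuple(phonemes[start:end]) != realization:
                    continue
                prev_cost, prev_words = best[start]
                if prev_cost == math.inf:
                    continue  # no valid parse reaches this start position
                # Two-part message: state the word, then its realization.
                cost = prev_cost + bits(p_word) + bits(p_real)
                if cost < best[end][0]:
                    best[end] = (cost, prev_words + [word])
    return best[n]


cost, words = decode(["dh", "ah", "k", "ae", "t", "s", "ae", "t"])
print(words, round(cost, 2))
```

With this toy lexicon the input has two parses, "the cat sat" and "the cats at". The first is selected because its total message is shorter, which is the sense in which a minimum-length criterion resolves segmentation ambiguity. An input with no parse returns an infinite cost and an empty word list.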