Title: "Misrecognitions, Corrections and Prosody in Spoken Dialogue Systems"


Diane Litman
AT&T Labs-Research


Abstract:

Understanding how people speak when they interact with spoken dialogue systems is critical to improving the performance of those systems. In particular, speakers' prosodic behavior provides useful indicators of a) whether a speaker turn will be recognized correctly by an automatic speech recognition system; b) whether a speaker is reacting to a system error; and c) whether a speaker is correcting such an error. This talk presents results of analyses of human interactions with the TOOT spoken dialogue system, an experimental system for accessing train schedules by phone. Our analytic results show significant prosodic and lexical differences between misrecognized and correctly recognized speech, and between correction and non-correction utterances. Our machine learning results show that prosodic and other differences can in fact be used to automatically predict both misrecognitions and their corrections. We suggest how such results may be used to improve the behavior of dialogue systems.