Seminar at Speech, Music and Hearing:
Error handling in spoken dialogue systems – Managing uncertainty, grounding and miscommunication
Opponent: Alexander Rudnicky, Carnegie Mellon University, USA
Abstract
Due to the large variability in the speech signal, the speech recognition
process constitutes the major source of errors in most spoken dialogue
systems. A spoken dialogue system can never know for certain what the user
is saying; it can only make hypotheses. As a result of this uncertainty, two
types of errors can be made: over-generation of hypotheses, which leads to
misunderstanding, and under-generation, which leads to non-understanding. In
human-human dialogue, speakers try to minimise such miscommunication by
constantly sending and picking up signals about their understanding, a
process commonly referred to as grounding.
The topic of this thesis is how to deal with this uncertainty in spoken
dialogue systems: how to detect errors in speech recognition results, how to
recover from non-understanding, how to choose when to engage in grounding,
how to model the grounding process, how to realise grounding utterances and
how to detect and repair misunderstandings. The approach in this thesis is
to explore and draw lessons from human error handling, and to study how
error handling may be performed in different parts of a complete spoken
dialogue system. These studies are divided into three parts.
In the first part, an experimental setup is presented in which a speech
recogniser is used to induce errors in human-human dialogues. The results
show that, unlike the behaviour of most dialogue systems, humans tend to
employ strategies other than encouraging the interlocutor to repeat when
faced with non-understandings. The collected data is also used in a
follow-up experiment to explore which factors humans may benefit from when
detecting errors in speech recognition results. Two machine learning
algorithms are also used for the task.
In the second part, the spoken dialogue system Higgins is presented,
including the robust semantic interpreter Pickering and the error-aware
discourse modeller Galatea. It is shown how grounding is modelled and error
handling is performed on the concept level. The system may choose to display
its understanding of individual concepts, pose fragmentary clarification
requests, or risk a misunderstanding and possibly detect and repair it at a
later stage. An evaluation of the system with naive users indicates that the
system performs well under error conditions.
In the third part, models for choosing when to engage in grounding and how
to realise grounding utterances are presented. A decision-theoretic,
data-driven model for making grounding decisions is applied to the data from
the evaluation of the Higgins system. Finally, two experiments are
presented, which explore how the intonation of synthesised fragmentary
grounding utterances affects their pragmatic meaning.
The primary target of this thesis is the management of uncertainty,
grounding and miscommunication in conversational dialogue systems, which to
a larger extent build upon the principles of human conversation. However,
many of the methods, models and results presented should also be applicable
to dialogue systems in general.
10:00 - 13:00
Friday November 23, 2007
The seminar is held in Sal F3, Lindstedtsvägen 26.