Speech technology in interactive dialogue systems

Main responsibility for the research area:
Prof Rolf Carlson

CTT's concentrated effort on speech-based dialogue systems has a wider, more specific aim than simply testing new models of speech synthesis and speech recognition. The aim is to develop speech technology as part of a complete interactive human-machine system that also includes dialogue management. Among other things, this integration means that knowledge about language and dialogue models can enhance the quality of speech synthesis and facilitate speech recognition by focusing it on predictable interpretations.
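As a rough illustration of how dialogue knowledge might focus recognition on predictable interpretations, the sketch below re-ranks a recognizer's N-best list using words that the current dialogue state makes likely. It is a minimal sketch under stated assumptions, not a description of any CTT system: the state names, the word sets, the scores, and the `rescore` function are all hypothetical.

```python
# Hypothetical dialogue states and the words each state makes more likely.
# These names and word lists are illustrative assumptions only.
EXPECTED_WORDS = {
    "ask_destination": {"to", "stockholm", "uppsala", "gothenburg"},
    "ask_time": {"at", "morning", "evening", "o'clock"},
}


def rescore(nbest, dialogue_state, weight=1.0):
    """Re-rank (text, acoustic_log_score) pairs using dialogue expectations.

    Each word that the dialogue state predicts adds `weight` to the score,
    so hypotheses that fit the dialogue context move up the list.
    """
    expected = EXPECTED_WORDS.get(dialogue_state, set())
    rescored = []
    for text, acoustic_log_score in nbest:
        matches = sum(1 for word in text.lower().split() if word in expected)
        rescored.append((text, acoustic_log_score + weight * matches))
    # Highest combined score first.
    return sorted(rescored, key=lambda pair: pair[1], reverse=True)


# Made-up recognizer output: the acoustically best hypothesis is implausible.
nbest = [("two upsell a", -4.0), ("to uppsala", -4.5)]
best_text, best_score = rescore(nbest, "ask_destination")[0]
```

When the system has just asked for a destination, the second hypothesis overtakes the first; with an unknown dialogue state the acoustic ranking is left unchanged.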

Portability is an important aim in all CTT areas. Dialogue system tools are now modularized in an (almost) platform-independent manner, which lets us integrate components flexibly, even those not developed at CTT. We intend to realize the technology through demonstrators for different areas of application, e.g., multimodal dialogue systems and wireless, mobile systems. In the latter case, the terminal is often small and is often used in settings where keyboards and/or visual displays are impractical. Speech technology, however, can handle the input and output functions elegantly.

We see a demand for speech technology in the form of a pocket-size "personal assistant", no larger than the smallest mobile phones on the market but with an increased range of functions; examples might be information extraction, access control, or use as a generalized remote control.

Important Research Topics for Stage 3

Some of the topics listed below will be further developed in order to demonstrate and evaluate research at CTT.
The choice of additional domains will be made in consultation with the CTT partners.

  • multimodal conversational dialogue systems
  • telephone-based dialogue systems for extracting information and performing speaker verification
  • methods for second language acquisition utilizing speech technology

