Language models for spoken language, including dialogue models

Main responsibility for the research area:
Associate Prof David House

Language models for spoken language play an important role in the field of speech technology. For instance, high-quality linguistic analysis significantly improves synthetic speech. Likewise, prosodic models are used to signal phrasing and focus in a particular text. Similarly, a grammatical model filters undesirable word combinations out of the hypotheses of a speech recognition system, thus functioning as a linguistic filter.
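The idea of a grammatical model acting as a linguistic filter can be illustrated with a minimal sketch. It assumes a very simple model (the set of word pairs observed in a small in-domain text sample) and a recognizer that delivers a list of candidate word strings. The function names, the toy sentences, and the choice of a bigram model are all illustrative assumptions, not a description of the models used at CTT.

```python
# Minimal sketch of a "linguistic filter" over recognition hypotheses.
# Assumption: the grammatical model is just the set of word pairs (bigrams)
# seen in a small in-domain sample; real systems use far richer models.

def to_bigrams(sentence):
    """Return the list of adjacent word pairs, with sentence boundary markers."""
    words = ["<s>"] + sentence.lower().split() + ["</s>"]
    return list(zip(words, words[1:]))

def build_allowed_bigrams(sentences):
    """Collect every bigram observed in the (hypothetical) domain sample."""
    allowed = set()
    for sentence in sentences:
        allowed.update(to_bigrams(sentence))
    return allowed

def filter_hypotheses(hypotheses, allowed):
    """Keep only hypotheses in which every word pair is licensed by the model."""
    return [h for h in hypotheses
            if all(pair in allowed for pair in to_bigrams(h))]

# Toy domain sample and toy recognizer output (both invented for illustration).
domain_sample = [
    "show me trains to stockholm",
    "show me flights to stockholm",
    "book a flight",
]
hypotheses = [
    "show me trains to stockholm",
    "show trains me to stockholm",   # contains unseen word pairs
    "show me flights to stockholm",
]

allowed = build_allowed_bigrams(domain_sample)
print(filter_hypotheses(hypotheses, allowed))
```

A hard accept/reject filter of this kind is the simplest possible variant; a statistical model would instead assign each hypothesis a score and rerank the candidates.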

General language models are often too unrestricted to serve as an effective linguistic filter. This is particularly the case in existing experimental dialogue systems, whose applications are often restricted to a specific domain. Portability between domains is therefore an overall ambition of the work at CTT. Within this subarea, the aim of CTT is to create generic models that can be adapted to special applications virtually automatically; this is a challenge of great industrial relevance.

In current international research, syntactic analysis is supplemented with semantic and pragmatic analysis. The more specialized the application or system task, the more strongly these models can influence and support the analysis.

Since language models cannot be based solely on theoretical descriptions, models described with explicit rules are, internationally, often fully or partially replaced by models based on statistical analysis of large speech and language corpora. Fundamental to the research at CTT, then, is an understanding of how to combine statistical models, knowledge of language, and semantic and pragmatic relations, as well as an understanding of how to map this knowledge onto an acoustic description.

Since many speech technology applications will involve some kind of man-machine dialogue, it is important to study dialogue models and how they are integrated with, and control, the speech technology processing. Depending on the application, one of two basic approaches can be used: "key-word spotting" or complete linguistic analysis of utterances. Both approaches will be explored at CTT. Dialogue aspects such as degree of user initiative, confirmation, turn taking, and repair of misunderstandings will also be studied within this subarea. The focus of CTT's activities within this area will be the improvement of rule-based and statistical textual analysis as well as the development of speech-based dialogue models.
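The "key-word spotting" approach mentioned above can be sketched in a few lines. The sketch assumes a hypothetical travel-booking domain with two slots, and it shows how spotted key words can drive a simple choice between asking for missing information and confirming what was understood. The slot names, vocabulary, and move labels are invented for illustration; a complete linguistic analysis would replace `spot_keywords` with a parser.

```python
import re

# Hypothetical domain vocabulary: each slot maps surface words to a value.
KEYWORDS = {
    "intent": {"book": "book", "reserve": "book", "cancel": "cancel"},
    "destination": {"stockholm": "Stockholm", "gothenburg": "Gothenburg"},
}

def spot_keywords(utterance):
    """Fill slots from the first matching key word; ignore everything else."""
    tokens = re.findall(r"[a-zåäö]+", utterance.lower())
    frame = {}
    for slot, vocabulary in KEYWORDS.items():
        for token in tokens:
            if token in vocabulary:
                frame[slot] = vocabulary[token]
                break
    return frame

def next_system_move(frame):
    """Ask for the first missing slot, otherwise confirm the filled frame."""
    missing = [slot for slot in KEYWORDS if slot not in frame]
    if missing:
        return "ask:" + missing[0]
    return "confirm:{}:{}".format(frame["intent"], frame["destination"])

# Hesitations and filler words are simply skipped by the spotter.
frame = spot_keywords("Um, I would like to, eh, book a ticket to Stockholm please")
print(frame)
print(next_system_move(frame))

# An incomplete utterance leads to a follow-up question instead.
print(next_system_move(spot_keywords("to Gothenburg")))
```

The appeal of this approach is its robustness to disfluent speech; the cost is that word order and negation are ignored, which is where a complete linguistic analysis would be expected to do better.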

Important Research Topics for Stage 4

  • development and testing of data-driven solutions suitable for speech technology applications
  • improvement of a robust parser for spoken language, optimized for dialogue systems
  • further study and development of dialogue models as part of the integration work and in cooperation with other research centres.

Published by: TMH, Speech, Music and Hearing

Last updated: 2006-12-05