The speech group has a special interest in building computational models of human conversational behaviour that are evaluated in spoken dialogue systems. A central research area is intuitive spoken interaction with artefacts such as robots (see IURO or SAVIR) and embodied conversational agents (see August, Adapt and MonAMI). Face-to-face interaction implies audiovisual communication, in which visual signals from face and body movements are of great importance. The research also includes the development and testing of data-driven solutions suitable for speech technology applications, together with studies of robust linguistic analysis for spoken language, optimized for dialogue systems.
Currently we focus on the following research issues in the area of multimodal spoken dialogue:
taking all available modalities into account (e.g. Edlund & Beskow, 2009; Skantze & Gustafson, 2009; Al Moubayed, Beskow & Granström, 2010)
modelling the conversational behaviour of all interlocutors and relationships formed between them (e.g. Edlund, Heldner & Hirschberg, 2009; Neiberg & Gustafson, 2010)
the special requirements on incremental speech technology in online conversational settings (e.g. Skantze & Schlangen, 2009; Skantze & Hjalmarsson, 2010)
prosody in conversation, including turn taking cues (e.g. SamPros; SamRytm)
generation of human-like conversational behaviour, including synthesis of conversational grunts (e.g. SamSynt; GenDial)
Published by: TMH, Speech, Music and Hearing
Last updated: Friday, 28-Oct-2011 15:30:17 MEST