Timing of intonation and gestures in spoken communication
The goal of the project is to understand the timing relationships between intonation and gesture in spontaneous speech. This will be investigated through semi-automatic extraction of co-speech gestures from a large and varied dataset (audio, video, motion capture), and through analysis of the function and synchronization of speech and gestures.
The melody of speech, or intonation, plays a crucial role in spoken interaction. By altering the speech melody, speakers can highlight important words and phrases, making them prominent and more meaningful. Speakers also use changing melodies and rhythms to signal when it is time for other speakers to talk (turn-taking) and to give others feedback (such as mm or uh-huh). Speakers control the timing of these melodic movements with considerable precision: the movements occur at particular places in relation to syllables. Body and facial gestures regularly accompany the speech melody and often have the same function as intonation, but until now we have not been able to measure the timing of these gestures with the same precision as that of intonation.

The aim of this research project is to measure with precision the timing relationship between speech melodies and gestures, using a large database of recorded conversations in Swedish. The participants have been recorded with high-quality audio, video, and motion-capture equipment in a specially designed studio. The results will have implications for our understanding of how speech and gestures are planned and coordinated in the brain, and will also enable better modeling of speech and gestures in speech applications such as robots and avatars.
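To illustrate the kind of timing measurement described above, the sketch below (a minimal illustration, not the project's actual pipeline; all function names and the toy data are invented for this example) locates a gesture apex as the peak of hand-marker speed in a motion-capture trajectory, locates the F0 (pitch) peak in an intonation contour, and reports the lag between them:

```python
# Illustrative sketch only: measuring the lag between a gesture apex
# (peak hand speed in motion-capture data) and an intonation (F0) peak.
# The data below are synthetic; a real analysis would use tracked
# marker trajectories and an F0 contour extracted from the audio.
import numpy as np

def gesture_apex_time(positions, fps):
    """Time (s) of peak movement speed in a 3D marker trajectory."""
    velocity = np.diff(positions, axis=0) * fps   # frame-to-frame velocity (m/s)
    speed = np.linalg.norm(velocity, axis=1)      # scalar speed per frame
    return np.argmax(speed) / fps

def pitch_peak_time(f0, times):
    """Time (s) of the F0 maximum, ignoring unvoiced (NaN) frames."""
    f0 = np.where(np.isnan(f0), 0.0, f0)
    return times[np.argmax(f0)]

# Toy data: a hand marker whose speed peaks near 0.5 s (100 fps capture),
# and an F0 contour (10 ms frames) peaking near 0.6 s.
fps = 100
t = np.arange(0, 1, 1 / fps)
x = -np.cos(np.pi * t)                            # fastest motion at t = 0.5 s
positions = np.stack([x, np.zeros_like(t), np.zeros_like(t)], axis=1)

f0_times = np.arange(0, 1, 0.01)
f0 = 120 + 30 * np.exp(-((f0_times - 0.6) ** 2) / 0.005)   # Hz, peak at 0.6 s

apex = gesture_apex_time(positions, fps)
peak = pitch_peak_time(f0, f0_times)
lag = peak - apex   # positive: the pitch peak follows the gesture apex
print(f"gesture apex {apex:.2f} s, F0 peak {peak:.2f} s, lag {lag:.2f} s")
```

Across many such syllable–gesture pairs, the distribution of lags is what reveals how tightly the two channels are synchronized.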
David House (Project leader)
Funding: RJ (Bank of Sweden Tercentenary Foundation)
Duration: 2012-08 - 2017-01
Publications:
(2016). Automatic Annotation of Gestural Units in Spontaneous Face-to-Face Interaction. In Proceedings of the Workshop on Multimodal Analyses enabling Artificial Agents in Human-Machine Interaction (pp. 15-19). Tokyo, Japan. [pdf]
(2016). Temporal domains of co-speech gestures and speech prosody. In Seventh Conference of the International Society for Gesture Studies (pp. 365). Paris. [pdf]
(2016). Prosody and hand gesture at turn boundaries in Swedish. In Proceedings of Speech Prosody 2016 (pp. 832-835). Boston, USA. [abstract] [pdf]
(2015). On the temporal domain of co-speech gestures: syllable, phrase or talk spurt? In Lundmark Svensson, M., Ambrazaitis, G., & van de Weijer, J. (Eds.), Proceedings of Fonetik 2015 (pp. 63-68). Lund University, Sweden. [abstract] [pdf]
(2015). Parallels between hand gestures and acoustic prosodic features in turn-taking. In 14th International Pragmatics Conference (pp. 454-455). Antwerp, Belgium. [pdf]
(2014). Automatic speech/non-speech classification using gestures in dialogue. In The Fifth Swedish Language Technology Conference. Uppsala, Sweden. [pdf]
(2013). Aspects of co-occurring syllables and head nods in spontaneous dialogue. In Proc. of 12th International Conference on Auditory-Visual Speech Processing (AVSP2013). Annecy, France. [pdf]
(2013). Extracting and analysing co-speech head gestures from motion-capture data. In Eklund, R. (Ed.), Proc. of Fonetik 2013 (pp. 1-4). Linköping University, Sweden. [pdf]
(2013). Extracting and analyzing head movements accompanying spontaneous dialogue. In Proc. Tilburg Gesture Research Meeting. Tilburg University, The Netherlands. [pdf]