SAMPROS

Prosody in conversation

The project investigates how people talking to each other jointly decide who should speak when, and the role of prosody in making these joint decisions.

While prosody is by no means the only information relevant to this joint interaction control, we believe that it plays a central role (see e.g. Edlund & Heldner, 2005, and references therein). A detailed model of the prosody involved in interaction control is crucial both for producing appropriate conversational behaviour and for understanding human conversational behaviour. Both capabilities are required to reach our visionary goal: they correspond to the roles of speaker and listener, respectively, that an artificial conversational partner takes in a conversation.

Group: Speech Communication and Technology

Staff:
Mattias Heldner (Project leader)
Joakim Gustafson
Gabriel Skantze
Jens Edlund
Anna Hjalmarsson
David House

Funding: RJ (Bank of Sweden Tercentenary Foundation) (P09-0064:1-E)

Duration: 2010 - 2012

Website: http://www.speech.kth.se/sampros

Related publications:

2013

Al Moubayed, S., Edlund, J., & Gustafson, J. (2013). Analysis of gaze and speech patterns in three-party quiz game interaction. In Interspeech 2013. Lyon, France. [abstract] [pdf]

Heldner, M., Hjalmarsson, A., & Edlund, J. (2013). Backchannel relevance spaces. In Asu, E. L., & Lippus, P. (Eds.), Nordic Prosody: Proceedings of the XIth Conference (pp. 137-146). Frankfurt am Main, Germany: Peter Lang. [pdf]

Skantze, G., Oertel, C., & Hjalmarsson, A. (2013). User feedback in human-robot interaction: Prosody, gaze and timing. In Proceedings of Interspeech. [abstract] [pdf]

2012

Edlund, J., Heldner, M., & Gustafson, J. (2012). On the effect of the acoustic environment on the accuracy of perception of speaker orientation from auditory cues alone. In Proc. of Interspeech 2012. Portland, Oregon, US. [abstract] [pdf]

Edlund, J., Heldner, M., & Gustafson, J. (2012). Who am I speaking at? - perceiving the head orientation of speakers from acoustic cues alone. In Proc. of LREC Workshop on Multimodal Corpora 2012. Istanbul, Turkey. [abstract] [pdf]

Laskowski, K. (2012). Exploiting Loudness Dynamics in Stochastic Models of Turn-Taking. In Proceedings of the 4th IEEE Workshop on Spoken Language Technology (SLT2012). Miami FL, USA. [abstract] [pdf]

Laskowski, K., Heldner, M., & Edlund, J. (2012). On the dynamics of overlap in multi-party conversation. In Proc. of Interspeech 2012. Portland, Oregon, US. [abstract] [pdf]

Oertel, C., Wlodarczak, M., Tarasov, A., Campbell, N., & Wagner, P. (2012). Context cues for classification of competitive and collaborative overlaps. In Speech Prosody 2012 (pp. 721-724). Shanghai, China. [abstract] [pdf]

2011

Edlund, J. (2011). In search of the conversational homunculus - serving to understand spoken human face-to-face interaction. Doctoral dissertation, KTH. [abstract] [pdf]

Heldner, M., Edlund, J., Hjalmarsson, A., & Laskowski, K. (2011). Very short utterances and timing in turn-taking. In Proceedings of Interspeech 2011 (pp. 2837-2840). Florence, Italy. [abstract] [pdf]

Hjalmarsson, A., & Laskowski, K. (2011). Measuring final lengthening for speaker-change prediction. In Proceedings of Interspeech 2011 (pp. 2069-2072). Florence, Italy. [abstract] [pdf]

Laskowski, K., Edlund, J., & Heldner, M. (2011). A single-port non-parametric model of turn-taking in multi-party conversation. In Proc. of ICASSP 2011 (pp. 5600-5603). Prague, Czech Republic. [abstract] [pdf]

Laskowski, K., Edlund, J., & Heldner, M. (2011). Incremental Learning and Forgetting in Stochastic Turn-Taking Models. In Proc. of Interspeech 2011 (pp. 2069-2072). Florence, Italy. [abstract] [pdf]

Laskowski, K., & Jin, Q. (2011). Harmonic structure transform for speaker recognition. In Proc. of Interspeech 2011 (pp. 365-368). Florence, Italy. [abstract] [pdf]

2010

Beskow, J., Edlund, J., Gustafson, J., Heldner, M., Hjalmarsson, A., & House, D. (2010). Modelling humanlike conversational behaviour. In Proceedings of SLTC 2010. Linköping, Sweden. [pdf]

Beskow, J., Edlund, J., Gustafson, J., Heldner, M., Hjalmarsson, A., & House, D. (2010). Research focus: Interactional aspects of spoken face-to-face communication. In Proc. of Fonetik 2010 (pp. 7-10). Lund, Sweden. [abstract] [pdf]

Heldner, M., Edlund, J., & Hirschberg, J. (2010). Pitch similarity in the vicinity of backchannels. In Proc. of Interspeech 2010 (pp. 3054-3057). Makuhari, Japan. [pdf]

Hjalmarsson, A. (2010). The vocal intensity of turn-initial cue phrases and filled pauses in dialogue. In Proceedings of SIGdial (pp. 225-228). Tokyo, Japan. [abstract] [pdf]

Laskowski, K. (2010). Modeling norms of turn-taking in multi-party conversation. In Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics (ACL2010) (pp. 999-1008). Uppsala, Sweden. [pdf]

Laskowski, K., Heldner, M., & Edlund, J. (2010). Preliminaries to an account of multi-party conversational turn-taking as an antiferromagnetic spin glass. In Proc. of NIPS Workshop on Modeling Human Communication Dynamics (pp. 46-49). Vancouver, B.C., Canada. [abstract] [pdf]

Published by: TMH, Speech, Music and Hearing

Last updated: 2012-11-09