Gabriel Skantze
Associate Professor in Speech Technology
Department of Speech, Music and Hearing
School of Computer Science and Communication
KTH Royal Institute of Technology

Young Academy of Sweden

Publications

My Google Scholar page

Journal articles

Skantze, G. (2016). Real-time Coordination in Human-robot Interaction using Face and Voice. AI Magazine, 37(4), 19-31. [abstract] [pdf]

Cuayáhuitl, H., Komatani, K., & Skantze, G. (2015). Introduction for Speech and language for interactive robots. Computer Speech and Language, 34(1), 83-86.

Meena, R., Skantze, G., & Gustafson, J. (2014). Data-driven Models for timing feedback responses in a Map Task dialogue system. Computer Speech and Language, 28(4), 903-922. [abstract] [pdf]

Skantze, G., Hjalmarsson, A., & Oertel, C. (2014). Turn-taking, Feedback and Joint Attention in Situated Human-Robot Interaction. Speech Communication, 65, 50-66. [abstract] [pdf]

Mirnig, N., Weiss, A., Skantze, G., Al Moubayed, S., Gustafson, J., Beskow, J., Granström, B., & Tscheligi, M. (2013). Face-to-Face with a Robot: What do we actually talk about? International Journal of Humanoid Robotics, 10(1). [abstract] [link]

Al Moubayed, S., Skantze, G., & Beskow, J. (2013). The Furhat Back-Projected Humanoid Head - Lip reading, Gaze and Multiparty Interaction. International Journal of Humanoid Robotics, 10(1). [abstract] [pdf]

Skantze, G., & Hjalmarsson, A. (2013). Towards Incremental Speech Generation in Conversational Systems. Computer Speech & Language, 27(1), 243-262. [abstract] [pdf]

Schlangen, D., & Skantze, G. (2011). A General, Abstract Model of Incremental Dialogue Processing. Dialogue & Discourse, 2(1), 83-111. [pdf]

Skantze, G. (2005). Exploring human error recovery strategies: implications for spoken dialogue systems. Speech Communication, 45(3), 325-341. [abstract] [pdf]

Peer-reviewed conference papers

Skantze, G. (2017). Predicting and Regulating Participation Equality in Human-robot Conversations: Effects of Age and Gender. In Conference on Human-Robot Interaction (HRI2017). Vienna, Austria. (*) [pdf]

(*) Nominated for Best Paper Award

Skantze, G. (2017). Towards a General, Continuous Model of Turn-taking in Spoken Dialogue using LSTM Recurrent Neural Networks. In Proceedings of SigDial. Saarbrücken, Germany. [pdf]

Johansson, M., Hori, T., Skantze, G., Höthker, A., & Gustafson, J. (2016). Making Turn-taking Decisions for an Active Listening Robot for Memory Training. In Proceedings of International Conference on Social Robotics (ICSR). Kansas City, MO. (*) [pdf]

(*) Best Paper Award at ICSR 2016

Georgiladakis, S., Athanasopoulou, G., Meena, R., Lopes, J., Chorianopoulou, A., Palogiannidi, E., Iosif, E., Skantze, G., & Potamianos, A. (2016). Root Cause Analysis of Miscommunication Hotspots in Spoken Dialogue Systems. In Interspeech 2016 (pp. 1156-1160). San Francisco, CA, USA. [pdf]

Skantze, G., Johansson, M., & Beskow, J. (2015). A Collaborative Human-robot Game as a Test-bed for Modelling Multi-party, Situated Interaction. In Proceedings of IVA. Delft, Netherlands. [abstract] [pdf]

Meena, R., Lopes, J., Skantze, G., & Gustafson, J. (2015). Automatic Detection of Miscommunication in Spoken Dialogue Systems. In Proceedings of 16th Annual Meeting of the Special Interest Group on Discourse and Dialogue (SIGDIAL) (pp. 354-363). Prague, Czech Republic. [abstract] [pdf]

Lopes, J., Salvi, G., Skantze, G., Abad, A., Gustafson, J., Batista, F., Meena, R., & Trancoso, I. (2015). Detecting Repetitions in Spoken Dialogue Systems Using Phonetic Distances. In Proceedings of Interspeech 2015 (pp. 1805-1809). Dresden, Germany. [abstract] [pdf]

Skantze, G., Johansson, M., & Beskow, J. (2015). Exploring Turn-taking Cues in Multi-party Human-robot Discussions about Objects. In Proceedings of ICMI. Seattle, Washington, USA. (*) [pdf]

(*) Nominated for Outstanding Paper Award

Skantze, G., & Johansson, M. (2015). Modelling situated human-robot interaction using IrisTK. In Proceedings of SIGDIAL. Prague, Czech Republic. [abstract]

Johansson, M., & Skantze, G. (2015). Opportunities and Obligations to Take Turns in Collaborative Multi-Party Human-Robot Interaction. In Proceedings of SIGDIAL. Prague, Czech Republic. [abstract] [pdf]

Johansson, M., Skantze, G., & Gustafson, J. (2014). Comparison of human-human and human-robot Turn-taking Behaviour in multi-party Situated interaction. In Proceedings of the International Workshop on Understanding and Modeling Multiparty, Multimodal Interactions, at ICMI 2014. Istanbul, Turkey. [pdf]

Meena, R., Boye, J., Skantze, G., & Gustafson, J. (2014). Crowdsourcing Street-level Geographic Information Using a Spoken Dialogue System. In Proceedings of the 15th Annual Meeting of the Special Interest Group on Discourse and Dialogue (SIGDIAL) (pp. 2-11). Philadelphia, PA, USA. [abstract] [pdf]

Al Moubayed, S., Beskow, J., Bollepalli, B., Gustafson, J., Hussen-Abdelaziz, A., Johansson, M., Koutsombogera, M., Lopes, J., Novikova, J., Oertel, C., Skantze, G., Stefanov, K., & Varol, G. (2014). Human-robot collaborative tutoring using multiparty multimodal spoken dialogue. In Proc. of HRI'14. Bielefeld, Germany. [abstract] [pdf]

Al Moubayed, S., Beskow, J., & Skantze, G. (2014). Spontaneous spoken dialogues with the Furhat human-like robot head. In HRI'14. Bielefeld, Germany. [abstract] [pdf]

Koutsombogera, M., Al Moubayed, S., Beskow, J., Bollepalli, B., Gustafson, J., Hussen-Abdelaziz, A., Johansson, M., Lopes, J., Novikova, J., Oertel, C., Skantze, G., Stefanov, K., & Varol, G. (2014). The Tutorbot Corpus – A Corpus for Studying Tutoring Behaviour in Multiparty Face-to-Face Spoken Dialogue. In Proc. of LREC'14. Reykjavik, Iceland. [abstract]

Skantze, G., Hjalmarsson, A., & Oertel, C. (2014). User Feedback in Human-Robot Dialogue: Task Progression and Uncertainty. In Proceedings of the HRI Workshop on Timing in Human-Robot Interaction. Bielefeld, Germany.

Meena, R., Boye, J., Skantze, G., & Gustafson, J. (2014). Using a Spoken Dialogue System for Crowdsourcing Street-level Geographic Information. In the 2nd Workshop on Action, Perception and Language, SLTC 2014. Uppsala, Sweden. [abstract] [pdf]

Meena, R., Skantze, G., & Gustafson, J. (2013). A Data-driven Model for Timing Feedback in a Map Task Dialogue System. In 14th Annual Meeting of the Special Interest Group on Discourse and Dialogue (SIGDIAL) (pp. 375-383). Metz, France. (*) [abstract] [pdf]

(*) Nominated for Best Paper Award at SIGdial 2013

Skantze, G., Hjalmarsson, A., & Oertel, C. (2013). Exploring the effects of gaze and pauses in situated human-robot interaction. In 14th Annual Meeting of the Special Interest Group on Discourse and Dialogue - SIGDial. Metz, France. (*) [abstract] [pdf]

(*) Nominated for Best Paper Award at SIGdial 2013

Johansson, M., Skantze, G., & Gustafson, J. (2013). Head Pose Patterns in Multiparty Human-Robot Team-Building Interactions. In International Conference on Social Robotics - ICSR 2013. Bristol, UK. [abstract] [pdf]

Meena, R., Skantze, G., & Gustafson, J. (2013). Human Evaluation of Conceptual Route Graphs for Interpreting Spoken Route Descriptions. In Proceedings of IWCS 2013 Workshop on Computational Models of Spatial Language Interpretation and Generation (CoSLI-3) (pp. 13-18). Potsdam, Germany: Association for Computational Linguistics. [abstract] [pdf]

Meena, R., Skantze, G., & Gustafson, J. (2013). The Map Task Dialogue System: A Test-bed for Modelling Human-Like Dialogue. In 14th Annual Meeting of the Special Interest Group on Discourse and Dialogue (SIGDIAL) (pp. 366-368). Metz, France. [abstract] [pdf]

Skantze, G., Oertel, C., & Hjalmarsson, A. (2013). User feedback in human-robot interaction: Prosody, gaze and timing. In Proceedings of Interspeech. [abstract] [pdf]

Meena, R., Skantze, G., & Gustafson, J. (2012). A Chunking Parser for Semantic Interpretation of Spoken Route Directions in Human-Robot Dialogue. In Proceedings of the 4th Swedish Language Technology Conference (SLTC 2012) (pp. 55-56). Lund, Sweden. [abstract] [pdf]

Meena, R., Skantze, G., & Gustafson, J. (2012). A Data-driven Approach to Understanding Spoken Route Directions in Human-Robot Dialogue. In INTERSPEECH-2012 (pp. 226-229). Portland, OR, USA. [abstract] [pdf]

Skantze, G. (2012). A Testbed for Examining the Timing of Feedback using a Map Task. In Proceedings of the Interdisciplinary Workshop on Feedback Behaviors in Dialog. Portland, OR. (*) [abstract] [pdf]

(*) Selected for keynote presentation

Blomberg, M., Skantze, G., Al Moubayed, S., Gustafson, J., Beskow, J., & Granström, B. (2012). Children and adults in dialogue with the robot head Furhat - corpus collection and initial analysis. In Proceedings of WOCCI. Portland, OR. [pdf]

Skantze, G., Al Moubayed, S., Gustafson, J., Beskow, J., & Granström, B. (2012). Furhat at Robotville: A Robot Head Harvesting the Thoughts of the Public through Multi-party Dialogue. In Proceedings of IVA-RCVA. Santa Cruz, CA. [pdf]

Al Moubayed, S., Beskow, J., Granström, B., Gustafson, J., Mirnig, N., Skantze, G., & Tscheligi, M. (2012). Furhat goes to Robotville: a large-scale multiparty human-robot interaction data collection in a public space. In Proc of LREC Workshop on Multimodal Corpora. Istanbul, Turkey. [pdf]

Skantze, G., & Al Moubayed, S. (2012). IrisTK: a statechart-based toolkit for multi-party face-to-face interaction. In Proceedings of ICMI. Santa Monica, CA. [pdf]

Al Moubayed, S., Skantze, G., & Beskow, J. (2012). Lip-reading Furhat: Audio Visual Intelligibility of a Back Projected Animated Face. In Proc. of the Intelligent Virtual Agents 10th International Conference (IVA 2012). Santa Cruz, CA, USA: Springer. [abstract] [pdf]

Al Moubayed, S., Skantze, G., Beskow, J., Stefanov, K., & Gustafson, J. (2012). Multimodal Multiparty Social Interaction with the Furhat Head. In Proc. of the 14th ACM International Conference on Multimodal Interaction ICMI. Santa Monica, CA, USA. (*) [abstract] [pdf]

(*) Outstanding Demo Award at ICMI 2012

Al Moubayed, S., & Skantze, G. (2012). Perception of Gaze Direction for Situated Interaction. In Proc. of the 4th Workshop on Eye Gaze in Intelligent Human Machine Interaction. The 14th ACM International Conference on Multimodal Interaction ICMI. Santa Monica, CA, USA. [abstract] [pdf]

Al Moubayed, S., & Skantze, G. (2011). Effects of 2D and 3D Displays on Turn-taking Behavior in Multiparty Human-Computer Dialog. In Proceedings of SemDial (pp. 192-193). Los Angeles, CA. [pdf]

Johnson-Roberson, M., Bohg, J., Skantze, G., Gustafson, J., Carlson, R., Rasolzadeh, B., & Kragic, D. (2011). Enhanced Visual Scene Understanding through Human-Robot Dialog. In IEEE/RSJ International Conference on Intelligent Robots and Systems. [pdf]

Al Moubayed, S., & Skantze, G. (2011). Turn-taking Control Using Gaze in Multiparty Human-Computer Dialogue: Effects of 2D and 3D Displays. In Proceedings of AVSP. Florence, Italy. [pdf]

Johansson, M., Skantze, G., & Gustafson, J. (2011). Understanding route directions in human-robot dialogue. In Proceedings of SemDial (pp. 19-27). Los Angeles, CA. [pdf]

Johnson-Roberson, M., Bohg, J., Kragic, D., Skantze, G., Gustafson, J., & Carlson, R. (2010). Enhanced Visual Scene Understanding through Human-Robot Dialog. In Proceedings of AAAI 2010 Fall Symposium: Dialog with Robots. Arlington, VA. [pdf]

Schlangen, D., Baumann, T., Buschmeier, H., Buss, O., Kopp, S., Skantze, G., & Yaghoubzadeh, R. (2010). Middleware for Incremental Processing in Conversational Agents. In Proceedings of SigDial. Tokyo, Japan. [pdf]

Skantze, G., & Hjalmarsson, A. (2010). Towards Incremental Speech Generation in Dialogue Systems. In Proceedings of SIGdial (pp. 1-8). Tokyo, Japan. (*) [abstract] [pdf]

(*) Best Paper Award at SIGdial 2010

Schlangen, D., & Skantze, G. (2009). A general, abstract model of incremental dialogue processing. In Proceedings of the 12th Conference of the European Chapter of the Association for Computational Linguistics (EACL-09). Athens, Greece. [abstract] [pdf]

Skantze, G., & Gustafson, J. (2009). Attention and interaction control in a human-human-computer dialogue setting. In Proceedings of SigDial 2009. London, UK. [abstract] [pdf]

Skantze, G., & Schlangen, D. (2009). Incremental dialogue processing in a micro-domain. In Proceedings of the 12th Conference of the European Chapter of the Association for Computational Linguistics (EACL-09). Athens, Greece. [abstract] [pdf]

Skantze, G., & Gustafson, J. (2009). Multimodal interaction control in the MonAMI Reminder. In Proceedings of DiaHolmia (pp. 127-128). Stockholm, Sweden. [pdf]

Beskow, J., Edlund, J., Granström, B., Gustafson, J., Skantze, G., & Tobiasson, H. (2009). The MonAMI Reminder: a spoken dialogue system for face-to-face interaction. In Interspeech 2009. Brighton, UK. [abstract] [pdf]

Beskow, J., Edlund, J., Granström, B., Gustafson, J., & Skantze, G. (2008). Innovative interfaces in MonAMI: the KTH Reminder. In Perception in Multimodal Dialogue Systems - Proceedings of the 4th IEEE Tutorial and Research Workshop on Perception and Interactive Technologies for Speech-Based Systems, PIT 2008, Kloster Irsee, Germany, June 16-18, 2008. (pp. 272-275). Berlin/Heidelberg: Springer. [abstract] [pdf]

Beskow, J., Edlund, J., Gjermani, T., Granström, B., Gustafson, J., Jonsson, O., Skantze, G., & Tobiasson, H. (2008). Innovative interfaces in MonAMI: the reminder. In Proceedings of the 10th international conference on Multimodal interfaces, Chania, Crete, Greece (pp. 199-200). New York, NY, USA: ACM. [abstract] [pdf]

Beskow, J., Edlund, J., Granström, B., Gustafson, J., Jonsson, O., Skantze, G., & Tobiasson, H. (2008). The MonAMI Reminder system. In Proc. of SLTC 2008 (pp. 13-14). Stockholm. [abstract] [pdf]

Skantze, G. (2007). Making grounding decisions: Data-driven estimation of dialogue costs and confidence thresholds. In Proceedings of SigDial (pp. 206-210). Antwerp, Belgium. [abstract] [pdf]

Skantze, G., Edlund, J., & Carlson, R. (2006). Talking with Higgins: Research challenges in a spoken dialogue system. In André, E., Dybkjaer, L., Minker, W., Neumann, H., & Weber, M. (Eds.), Perception and Interactive Technologies (pp. 193-196). Berlin/Heidelberg: Springer. [abstract] [pdf]

Wallers, Å., Edlund, J., & Skantze, G. (2006). The effects of prosodic features on the interpretation of synthesised backchannels. In André, E., Dybkjaer, L., Minker, W., Neumann, H., & Weber, M. (Eds.), Proceedings of Perception and Interactive Technologies (pp. 183-187). Springer. [abstract] [pdf]

Skantze, G., House, D., & Edlund, J. (2006). User responses to prosodic variation in fragmentary grounding utterances in dialogue. In Proceedings of Interspeech 2006 - ICSLP (pp. 2002-2005). Pittsburgh PA, USA. [abstract] [pdf]

Skantze, G. (2005). Galatea: a discourse modeller supporting concept-level error handling in spoken dialogue systems. In Proceedings of SigDial (pp. 178-189). Lisbon, Portugal. [abstract] [pdf]

Edlund, J., House, D., & Skantze, G. (2005). The effects of prosodic features on the interpretation of clarification ellipses. In Proceedings of Interspeech 2005 (pp. 2389-2392). Lisbon, Portugal. [abstract] [pdf]

Skantze, G., & Edlund, J. (2004). Early error detection on word level. In Proceedings of ISCA Tutorial and Research Workshop (ITRW) on Robustness Issues in Conversational Interaction. Norwich, UK. [abstract] [pdf]

Edlund, J., Skantze, G., & Carlson, R. (2004). Higgins - a spoken dialogue system for investigating error handling techniques. In Proceedings of the International Conference on Spoken Language Processing, ICSLP 04 (pp. 229-231). Jeju, Korea. [abstract] [pdf]

Skantze, G., & Edlund, J. (2004). Robust interpretation in the Higgins spoken dialogue system. In Proceedings of ISCA Tutorial and Research Workshop (ITRW) on Robustness Issues in Conversational Interaction. Norwich, UK. [abstract] [pdf]

Skantze, G. (2003). Exploring human error handling strategies: implications for spoken dialogue systems. In Proceedings of ISCA Tutorial and Research Workshop on Error Handling in Spoken Dialogue Systems (pp. 71-76). Chateau-d'Oex-Vaud, Switzerland. [pdf]

Skantze, G. (2002). Coordination of referring expressions in multimodal human-computer dialogue. In Proceedings of ICSLP 2002 (pp. 553-556). Denver, Colorado, USA. [abstract] [pdf]

Book chapters

Al Moubayed, S., Beskow, J., Bollepalli, B., Hussen-Abdelaziz, A., Johansson, M., Koutsombogera, M., Lopes, J., Novikova, J., Oertel, C., Skantze, G., Stefanov, K., & Varol, G. (2014). Tutoring Robots: Multiparty multimodal social dialogue with an embodied tutor. In Proceedings of eNTERFACE2013. Springer. [abstract] [pdf]

Al Moubayed, S., Beskow, J., Skantze, G., & Granström, B. (2012). Furhat: A Back-projected Human-like Robot Head for Multiparty Human-Machine Interaction. In Esposito, A., Esposito, A., Vinciarelli, A., Hoffmann, R., & Müller, V. C. (Eds.), Cognitive Behavioural Systems. Lecture Notes in Computer Science (pp. 114-130). Springer.

Beskow, J., Carlson, R., Edlund, J., Granström, B., Heldner, M., Hjalmarsson, A., & Skantze, G. (2009). Multimodal Interaction Control. In Waibel, A., & Stiefelhagen, R. (Eds.), Computers in the Human Interaction Loop (pp. 143-158). Berlin/Heidelberg: Springer. [pdf]

Skantze, G. (2008). Galatea: A discourse modeller supporting concept-level error handling in spoken dialogue systems. In Dybkjær, L., & Minker, W. (Eds.), Recent Trends in Discourse and Dialogue. Springer. [pdf]

Non-reviewed

Al Moubayed, S., Beskow, J., & Skantze, G. (2013). The Furhat Social Companion Talking Head. In Interspeech 2013 - Show and Tell. Lyon, France. [abstract] [pdf]

Al Moubayed, S., Beskow, J., Blomberg, M., Granström, B., Gustafson, J., Mirnig, N., & Skantze, G. (2012). Talking with Furhat - multi-party interaction with a back-projected robot head. In Proceedings of Fonetik'12. Gothenburg, Sweden. [abstract] [pdf]

Skantze, G. (2010). Jindigo: a Java-based Framework for Incremental Dialogue Systems. Technical Report, KTH, Stockholm, Sweden. [pdf]

Beskow, J., Edlund, J., Granström, B., Gustafson, J., Jonsson, O., & Skantze, G. (2008). Speech technology in the European project MonAMI. In Proceedings of FONETIK 2008 (pp. 33-36). Gothenburg, Sweden. [abstract] [pdf]

Skantze, G., House, D., & Edlund, J. (2006). Grounding and prosody in dialog. In Working Papers 52: Proceedings of Fonetik 2006 (pp. 117-120). Lund, Sweden: Lund University, Centre for Languages & Literature, Dept. of Linguistics & Phonetics. [abstract] [pdf]

Carlson, R., Edlund, J., Heldner, M., Hjalmarsson, A., House, D., & Skantze, G. (2006). Towards human-like behaviour in spoken dialog systems. In Proceedings of Swedish Language Technology Conference (SLTC 2006). Gothenburg, Sweden. [pdf]

Edlund, J., House, D., & Skantze, G. (2005). Prosodic Features in the Perception of Clarification Ellipses. In Proceedings of Fonetik 2005. Gothenburg, Sweden. [abstract] [pdf]