Joakim Gustafson

I am a professor and head of the Division of Speech, Music and Hearing at KTH. I started working with spoken dialogue systems in 1992 as part of the Waxholm project. In 1998 I led the development of the August project, which had the aim of building a synthetic August Strindberg that members of the public could chat with. Between 1998 and 2002 I worked on the AdApt system, a multimodal spoken dialogue system for browsing apartments for sale in Stockholm. The topic of my PhD thesis was iterative development of multimodal dialogue systems.

Between 2000 and 2007 I was a senior researcher at Telia Research. My main focus was on research projects, but I also acted as expert support in commercial launches of speech applications. In 2001 I worked in the TänkOm project, where visitors at the telecom museum could enter an apartment of the future and interact with the animated agent Pixie. Between 2002 and 2005 I led the development of Telia's part of the EU-funded project NICE, whose goal was to build a speech-enabled computer game in which children could interact with animated 3D game characters.

In my current position I am doing long-term research on the design and development of multimodal systems and social robots, as well as interactional analysis of spontaneous spoken dialogue phenomena. I am currently heading the VR-funded project CONNECTED (context-aware speech synthesis for conversational AI), which develops a spontaneous speech synthesizer that uses breath and disfluencies as implicit control of the manner of speaking. We also want our voices to be used in Augmentative and Alternative Communication. I am also co-PI of the VR-funded project STANCE (Perception of speaker stance - using spontaneous speech synthesis to explore the contribution of prosody, context and speaker) and the RJ-funded project CAPTivating (Comparative Analysis of Public speaking with Text-to-speech). For examples of our spontaneous speech synthesis, see TTS examples.

Another interesting project is the PTS-funded project WikiSpeech2, which will make open-source speech synthesis available through Wikimedia's server architecture and has received worldwide attention. In 2010 I started working on human-robot interaction. In the SRA ICT-TNG-funded project SAVIR we collaborated with RPL to develop a robot that talked with humans in order to interpret visual scenes, see video. In the EU project IURO, we developed a mobile robot that could ask pedestrians in Munich for directions, see video. In this project the social robot FurHat was developed. I was recently the technical coordinator and work package leader in the EU project BabyRobot, where we developed social robot applications for children. Our goal was to build a robot that is able to achieve joint attention with the user, see video. I currently collaborate with RPL in two HRI-related projects funded by SSF: FACT, where we will develop collaborative robots for manufacturing, and EACare, where we will build a robot companion that supports elderly people's mental wellbeing and develop methods for detecting early signs of dementia. Finally, I have a KTH-techlab-funded collaborative project, AAIS (Advanced Adaptive Intelligent Systems), where the aim is to develop social robots that can assist the elderly in everyday tasks, e.g. giving situation-dependent cooking directions.

I am examiner and lecturer in the speech technology course. I am a member of the Editorial Boards of the journal Speech Communication and the International Journal of Human-Computer Studies, and Review Editor on the Editorial Board of Human-Robot Interaction, a specialty section of Frontiers in Robotics and AI. I was Technical Program co-Chair of Interspeech 2017 and spoken dialogue Area Chair of Interspeech 2015, 2016, 2018, and 2019. As of September 2021 I am a member of the Board of ISCA (International Speech Communication Association).

List of Publications

Department of Speech, Music and Hearing, KTH

SE-100 44 Stockholm


Visiting address:
Lindstedtsvägen 24
Tel +46 8 790 8965   Fax -7854
Mob +46 70 206 08 17