Professor & Head of Department
I am a professor and head of the Department of Speech, Music and Hearing at KTH Royal Institute of Technology, working on conversational AI, speech synthesis, social robotics, and multimodal human-computer interaction. I have built spoken dialogue systems since 1992 — from Waxholm, to August (a synthetic August Strindberg chatting with the public at Stockholm Cultural Centre for six months in 1998), to AdApt, a multimodal apartment-browsing system. My PhD thesis focused on iterative development of multimodal dialogue systems. From 2000 to 2007 I was a senior researcher at Telia Research, including leading KTH’s work on the EU project NICE — a speech-enabled computer game where children interacted with animated 3D characters.
Back at KTH from 2007, my research expanded into human-robot interaction. In SAVIR I collaborated with RPL to build a robot that interpreted visual scenes through dialogue. In the EU project IURO we built a mobile robot that asked pedestrians for directions in Munich — work that led to the social robot Furhat. I was also technical coordinator on BabyRobot (EU), developing social robot applications for children. A parallel research strand is conversational spontaneous speech synthesis. I headed the VR project CONNECTED, which produced a TTS system controlled implicitly through breath and disfluencies, and was co-PI on STANCE and CAPTivating, which used that TTS to study how prosody, fillers, and voice quality shape perceived speaker stance.
My current theme is AI for good. EmpowerME used LLMs to help people with cognitive disabilities handle official correspondence; FoodTalk explored voice-based cooking assistants. Two 2026 projects — SAInt and Kitchen Companion — continue that work, targeting in-home elderly assistance at the KTH Interaction and Robotics Lab. On the AAC front, RAPPORT and Personalized Voices form a joint programme on giving people back their voice.
Spontaneous speech synthesis using breath, disfluencies, and prosody as implicit controls of speaking style. Context-aware TTS for dialogue systems. Turn-taking cues and voice quality in conversational AI.
Designing robots that engage in natural, multimodal conversation and function as social companions — from joint-attention robots for children to humanoid robots supporting domestic activities for the elderly.
Using LLMs, AI voices and humanoid robots to empower people with cognitive disabilities, improve AAC devices, help elderly users maintain independence, and support social connection through intelligent voice assistants.
Perceiving and Communicating Correct-by-design Socially Acceptable Autonomous Systems. Integrates perception, planning, and communication to develop robots that are both provably safe and perceived as trustworthy in shared human environments.
Promobilia — 2026–2031
Situated Agentic Intelligence (SAInt). Five humanoid robots (e.g. Unitree G1, Rainbow RB-Y1) will be equipped with multimodal conversational AI to assist with domestic tasks and serve as social companions for the elderly.
Kamprad Foundation — 2026–2029
A proactive, voice-based AI assistant that helps older adults living alone cook nutritious meals and stay socially connected. Builds on findings from the FoodTalk project.
WASP — 2024–2028
Real-time context-aware speech prosthesis for conversational interaction. Combines spontaneous TTS and large language models to create more natural, expressive AAC devices for people who cannot speak.
VR — 2026–2030
Adaptable speech synthesis for speech-impaired users. The goal is to develop personalised, adaptable voice models that reflect a user’s own natural voice identity.
Department of Speech, Music and Hearing
KTH Royal Institute of Technology
Lindstedtsvägen 24
SE-100 44 Stockholm, Sweden