Previous projects

Adapt - A multi-modal dialogue system for finding apartments in Stockholm

The goal of the AdApt project was to serve as a foundation for the development and evaluation of advanced multimodal spoken dialogue systems. [more]

AGNULA - A GNU/Linux Audio distribution
Funding: EU, KTH


ASPI - Audiovisual to Articulatory Speech Inversion
Funding: EU


August -

The August system was a conversational spoken dialogue system featuring an animated agent called August, whose persona was inspired by August Strindberg, the famous Swedish 19th century author. [more]

BabyRobot - Child-Robot Communication and Collaboration
Funding: EU

The main goal of the project is to create robots that analyze and track human behavior over time in the context of their surroundings using audio-visual monitoring in order to establish common ground and intention-reading capabilities. [more]

BioASU - Biologically inspired statistical methods for flexible automatic speech understanding
Funding: VR

The project will develop machine learning methods for speech understanding that more closely resemble the biological approach to learning. [more]

BLARK/SNK - An Infrastructure for Swedish language technology
Funding: VR

Research and development on language technology needs an infrastructure of publicly available and standardized basic resources. [more]

Bostran - The bowed string - bow motion analysis and influence of bow properties
Funding: VR

The basic physical process for the sound generation in the bowed instruments is a stick-slip motion triggered by the wave motion on the string. The project comprises three sub-studies related to the modelling of the bow-string interaction and its control mechanisms:

  1. A validation of existing bowed-string models using a PC-controlled bowing... [more]

BrainTuning - Tuning the brain for music
Funding: EU

BrainTuning is about music in the brain. The project makes use of the most recent neuroscience techniques which will provide new ways to investigate how music and music-induced emotions are processed in the brain. [more]

BUG - Children's voice development and gender differences
Funding: VR

The project investigates some developmental aspects of children's voices. [more]

CALATEA - Computer-Animated LAnguage TEAchers
Funding: VR

The key contribution of CALATEA is to use computer-animated virtual teachers and speech technology in combination to improve self-studies in second language learning. [more]

CALST - Computer-Assisted Listening and Speaking Tutor
Funding: NTNU + Norgesuniversitetet

The project aims at developing a computer program that will be used to train Norwegian as a second language. [more]

CAVE - Caller Verification
Funding: EU

The CAVE project focused on the development and testing of speaker verification systems for use in telephone applications such as calling-card or financial services. [more]

Centlex - Centlex

CentLex is the central lexicon resource produced and maintained by the Centre for Speech Technology. [more]

CHIL - Computers in the Human Interaction Loop
Funding: EU

The goal of the CHIL project is to make everybody’s daily life easier. [more]

CLARIN - Common Language Resources and Technology Infrastructure
Funding: EU

The CLARIN project is a large-scale pan-European collaborative effort to create, coordinate and make language resources and technology available and readily useable. [more]

ConGAS - Gesture controlled audio systems (COST 287)
Funding: EU

The Cost287-ConGAS Action intends to contribute to the advancement and development of musical gesture data analysis, and to capture aspects connected to the control of digital sound and music processing. Cost287-ConGAS is a COST-TIST action. [more]

CORDIAL - Coordination of Attention and Turn-taking in Situated Interaction
Funding: VR


COST 278 - Spoken language interaction in telecommunication
Funding: EU


COST2103 - Advanced Voice Function Assessment
Funding: EU, KTH

The main objective of the Action is to combine previously unexploited techniques with new theoretical developments to improve the assessment of voice for as many European languages as possible, while in parallel acquiring data with a view to developing better voice production models.

Progress in the clinical assessment and enhancement of... [more]

DEAL - Role-playing and Dialogue System for Second Language Learners

DEAL is a multidisciplinary research platform where challenges and potential benefits of combining elements from computer games, dialogue systems and language learning can be explored. [more]

DUMAS - Dynamic Universal Mobility for Adaptive Speech Interfaces
Funding: EU

The DUMAS (Dynamic Universal Mobility for Adaptive Speech Interfaces) project develops multi-lingual speech-based applications, focusing on intelligent and ambient interaction management with a special emphasis on naturalness that takes into account the user's personalised needs. Future communication with mobile electronic systems requires dynamic and adaptive capabilities: systems that can learn... [more]

ENABL - ENAbler for computer-Based vocational tasks with Language and speech
Funding: EU

One main objective of the ENABL project was to provide access by voice, via speech recognition, to an engineering design system, ICAD. Since persons with manual disabilities often also have a vocal disability, part of the ENABL project was dedicated to evaluating the performance of a speech recognition engine on dysarthric speech.... [more]

Error - Error and Miscommunication in Human-Computer Dialogue Systems
Funding: VR

The aims of the current project are to analyze miscommunication and to investigate means of improving error handling in spoken and multimodal dialogue systems. Generally, the goal is to attain a greater understanding of the mechanisms and structures underlying problematic human-computer interaction. By studying users’ linguistic behavior as they interact with... [more]

EUNISON - Extensive UNIfied-domain SimulatiON of the human voice
Funding: EU, KTH

This project seeks to build physics-based simulations of the human voice that are more detailed and more complete than before, using numerical models that have been validated against mechanical replicas. The simulations will be accessible for control in the mechanical, neuromotor and phonetic domains. The project engages seven research groups... [more]

FonaDyn - Phonatory Dynamics and States
Funding: VR

The voice has several non-linear and context-dependent mechanisms that can give rise to distinct phonatory states. We submit that much of the observed variability in objective voice metrics results from the influence of such states, and will attempt to account for some of them, using a state-based analysis paradigm. [more]

GenDial - Modelling utterance generation in conversational dialogue systems
Funding: VR

The GenDial project spans research in several disciplines including computer science, human-human and human-machine spoken interaction and language technology. Our long term research goal is to develop a new class of conversational spoken dialog systems which to a large extent follows the principles of human-human interaction. A natural part of human conversation... [more]

GetHomeSafe - Extended Multimodal Search and Communication Systems for Safe In-Car Application
Funding: EU

The aim of the proposed project is to develop a system for safe information access and communication while driving. [more]

Ghost in the Cave -

An expressive game with two teams navigating in a 3D shared virtual environment using either voice or body movements as input (part of the MEGA project) [more]

GROG - Breaks and grouping - structuring of speech in different communicative environments
Funding: VR

The project is a cooperation between Umeå University and KTH, headed by Eva Strangert at Umeå University. The purpose of this project is to model the structuring of speech in terms of prosodic boundaries and groupings in various communicative situations. The modeling aims at a structured and optimized description of... [more]

GULAN - A System for Teaching Spoken Dialogue Systems Technology

The aim of this work has been to put a fully functioning spoken dialogue system into the hands of the students as an instructional aid. [more]

HaH - Hearing at home
Funding: EU

The Hearing at Home (HaH) project researches and develops next generation assistive devices that allow the growing number of hearing impaired – which predominantly includes the elderly – equal participation in communication. The project aims to empower the hearing disabled to play their own role in the information society. [more]

Hi-Fi Voice - Hi-Fi Voice: the fine structure of the voice signal at high frequencies
Funding: VR

The human voice has weak but perceptually important frequency components at frequencies above 4-5 kHz. This frequency range is studied from production and perception perspectives. [more]

Higgins - Higgins

Higgins is a framework project developing modules and techniques for flexible, robust and human-like dialogue system design. [more]

HUMAINE - Human-machine interaction network on emotion
Funding: EU

HUMAINE aims to lay the foundations for European development of systems that can register, model and/or influence human emotional and emotion-related states and processes - 'emotion-oriented systems'. [more]

IGLU - Interactive Grounded Language Understanding

Language is an ability that develops in young children through joint interaction with their caretakers and their physical environment. At this level, human language understanding can be described as interpreting and expressing semantic concepts (e.g. objects, actions and relations) through what can be perceived (or inferred) from the current context in the environment. Previous work... [more]

InkSynt - Incremental Text-To-Speech Conversion
Funding: VR

We will develop an incremental text-to-speech converter (TTS), which can be used in dynamically changing situations. In the project we will collect speech databases of how people read incrementally displayed text aloud, which will serve as the basis for the development of methods for incremental TTS with the correct prosody. We will... [more]

ISHT - Interior sound design of high-speed trains
Funding: KK-stiftelsen (The Knowledge Foundation)

The main research question in this project is: "How can we develop design methods and acoustic artefacts in order to improve the sound environment of the high-speed trains of the future?" [more]

IURO - Interactive Urban Robot
Funding: EU

The goal of IURO project is to develop a robot that can engage in information-gathering face-to-face interactions in multi-user settings. [more]

Jindigo - Java-based Incremental Dialog Framework
Funding: CSC, Ung fakultet

Jindigo is a framework for developing and experimenting with incremental spoken dialog systems, developed at the Department of Speech Music and Hearing, KTH. [more]

Kobra - Knowledge-rich speaker adaptation for speech recognition
Funding: VR


Linguistic Environment - Exchange students' linguistic environment at Scandinavian universities
Funding: Nordic Council of Ministers

Exchange students who come to Scandinavia are often motivated by an intention to improve their proficiency in English rather than the local language. They take classes in English and may find themselves living in a lingua-franca English bubble, acculturated to an international-student subculture. A few do break out of the... [more]

Lipread - Lipreading to support Foreign and Second Language Learning by Hearing Impaired and Normally Hearing Persons
Funding: EU

The objective of the project is to develop a syllabus and an e-learning environment to teach lipreading of foreign and second languages. Our target groups: deaf and hard of hearing adults, their therapists and teachers, and normally hearing language learners. [more]

Ljudparken/The Soundpark - Using modern smartphones to create interactive listening experiences for hearing impaired
Funding: PTS - Post och Telestyrelsen

The aim of the project is to create interactive listening experiences for persons with hearing impairments through smartphone applications. Interactive listening applications will provide better conditions for partaking in commonly available audio-based entertainment, but also offer new possibilities for active hearing training. [more]

Ljudskrapan -
Funding: SPM

The project aims at investigating the methods for sound stimulation that are being used to help hearing impaired and deaf persons at an early stage of intellectual development. Further, new methods for allowing these individuals to explore the sound modality will be developed and tested. We will have a close... [more]

Ljudskrapan - The Soundscraper
Funding: Promobilia Foundation

Ljudskrapan/The Soundscraper was initiated to help deaf children with additional sensory and cognitive function disorders to improve their possibilities for communication in order to facilitate emotional and cognitive development. The idea is to allow playful interaction with sound even with limited physical movements, and let sound provide an immediate sensory... [more]

MASSIVE - Large-scale massively multimodal modelling of non-verbal behaviour in spontaneous dialogue
Funding: VR

The aim is to provide a large-scale kinematic database based on motion capture of human conversational behaviour, as well as to build statistical models of multimodal non-verbal behaviour in dialogue. [more]

MonAMI - Mainstreaming on ambient intelligence
Funding: EU

The overall objective of MonAMI is to mainstream accessibility in consumer goods and services, including public services, through applied research and development, using advanced technologies to ensure equal access, independent living and participation for all in the Information Society. [more]

MULTISENSE - Multimodal and multisensory interfaces for interaction with muscolo-skeletal models
Funding: EU

The project concerned the visualization of, and interaction with, data related to musculo-skeletal structures by means of multi-modal and multi-sensorial interfaces. [more]

MUMI - Multi-agent Multimodal Interaction: Spatial attention in haptic and auditory simulations
Funding: VR

Little is known about the role that haptic and auditory feedback can have in collaborative situations and in mediated communication between people. Multi-agent multimodal interaction systems that provide advanced interaction possibilities for manufacturing, for medical applications or for teaching will fundamentally alter the way people work together in the future.... [more]

MUSCLE - Multimedia Understanding through Semantics, Computation and Learning
Funding: EU

MUSCLE is an EC-sponsored Network of Excellence that aims at establishing and fostering closer collaboration between research groups in multimedia datamining and machine learning. The Network integrates the expertise of over forty research groups working on image and video processing, speech and text analysis, statistics and machine learning. The goal... [more]


Olga -

The ultimate aim of the Olga project is to build an easy-to-use dialogue interface to a database for the general public. The user interacts with Olga, a talking animated agent visible on a display, using text, speech or a pointing device. The demonstrator application is a database... [more]

OLP - Ortho-Logo-Paedia
Funding: EU

The OLP (Ortho-Logo-Paedia) project aims at improving the quality of life of persons with articulatory impairments by applying a novel technological aid to speech therapy, by integrating this training with speech recognition technology and by making these facilities available over the Internet. [more]

PAMIR - Computational Modelling of Perceptual Music Features
Funding: VR

Today, computers and the Internet are commonly used for all aspects of music culture, from production to listening. When a typical online music database contains 5 million songs, new computer tools are needed for characterizing and indexing music audio. This is the focus of the new research field Music... [more]

ParSys - Data-driven Parametric Speech Synthesis

The ParSyn (Parametric Synthesis) project is focused on two challenges: 1) speech synthesis based on both data-driven and rule-based parametric modeling; 2) speech synthesis of spontaneous speech, including disfluencies. [more]

PF-Star - Preparing future multisensorial interaction research
Funding: EU, KTH

PF-STAR has contributed to establishing future activities in the field of Multisensorial and Multilingual communication (Interface Technologies) on a firmer basis by providing technological baselines, comparative evaluations, and assessments of the prospects of core technologies on which future research and development efforts can build. The project has addressed three crucial areas: technologies for... [more]

PICASSO - Pioneering Caller Authentication for Secure Service Operation
Funding: EU

The PICASSO project developed and tested secure telematics transaction services using caller authentication by voice. Speech recognition and speaker verification/identification technologies were integrated to provide interfaces that are at once intuitive, easy to use and secure against intruders. PICASSO built on the results of the CAVE project. New telephone services... [more]

RealsimPLE - Combining physical reality with simulations in pedagogical laboratory experiments
Funding: WGLN (Wallenberg Global Learning Network)


RoboLearn - Online learning of turn-taking behaviour in spoken human-robot interaction
Funding: VR

In this project, we will investigate how a robot’s turn-taking behaviour can be learned from experience by interacting with people. [more]

S2S^2 - Sound to Sense, Sense to Sound
Funding: EU

Sound to Sense, Sense to Sound (S2S^2) is a FET-Open coordination action. The main aim of this action is the identification and writing of the research roadmap for the sound and music computing community in Europe. [more]

SAME - Sound And Music for Everyone, Everyday, Everywhere Every Way
Funding: EU

The SAME project aims at creating a new end-to-end networked platform for active, experience-centric, and context-aware music listening. The project will answer questions like "In 5 years, what will correspond to the current iPod?"; "What kind of markets would such new devices open up?" [more]

SAMINK - Incremental processing in multimodal conversational systems
Funding: VR

The aim of this project is to understand how conversational systems can engage in spoken face-to-face interaction in real-time. [more]

SAMPAUS - Classifying and deploying pauses for flow control in conversational systems
Funding: VR

The project investigates how dialogue systems can employ pauses and fillers to help users better understand the structure of system utterances. [more]

SAMPROS - Prosody in conversation
Funding: RJ (Bank of Sweden Tercentenary Foundation)

The project investigates how people talking to each other jointly decide who should speak when, and the role of prosody in making these joint decisions. [more]

SAMRYTM - The rhythm of conversation
Funding: VR

The project Rhythm of conversation investigates how a set of rhythmic prosodic features contributes to the joint interaction control in conversations. [more]

SAMSYNT - Introducing interactional phenomena in speech synthesis
Funding: VR

The project will develop and verify ways of including interactional phenomena in speech synthesis, resulting in well-described and tested methods for synthesizing these phenomena in such a way that they can be employed to recreate human interactional behaviour. [more]

SAVIR - Situated Audio Visual Interaction with Robots
Funding: SRA/KTH

The project investigates how a robot can improve its visual scene understanding by engaging in spoken dialogue with a human. [more]

Secrets of the violin -

Experimental models and electro-dynamical measurements on real violins. [more]

SEMIR - Bridging the semantic gap in Music Information Retrieval: Modelling perceptual-based features in music audio
Funding: VR

This project aims at developing new computer tools for characterizing and indexing music audio. [more]

SID - Sonic Interaction Design
Funding: ESF (European Science Foundation)

Sonic Interaction Design is the exploitation of sound as one of the principal channels conveying information, meaning, and aesthetic/emotional qualities in interactive contexts. [more]

SIFT - Separating Intonation from Tone
Funding: Bank of Sweden Tercentenary Foundation


SIMULEKT - Simulating Intonational Varieties of Swedish
Funding: VR

This project is a collaboration between Lund University (Gösta Bruce and Susanne Schötz) and KTH (Björn Granström, Jonas Beskow, Joakim Gustafson and Laura Enflo). The primary goal of the research project is to produce more precise and thorough knowledge about phrase and utterance prosody in some major regional varieties of Swedish.... [more]

SingHab - Advanced objective measures of voice in professional female singers
Funding: Fondation Alma et Baxter Ricard, Canada

The long-term goal of this research project is to improve the evaluation of the singing voice in clinical settings. A tailored assessment of the singing voice is crucial to the effectiveness of rehabilitation work. The result of this research will be to provide useful and clinically accessible evaluation possibilities for... [more]

SkAT-VG - Sketching Audio Technologies using Vocalizations and Gestures
Funding: EU

The SkAT-VG project will develop a system that enables sound designers to sketch sounds using their own voice and gestures as input. The project partners are IUAV in Venice (coordinator Davide Rocchesso), IRCAM in Paris, KTH in Stockholm, and the company GENESIS S.A. in Aix-en-Provence. [more]

SOb - The Sounding Objects
Funding: EU

The SOb project aims at developing sound models that are responsive to physical interactions and are easily matched to physical objects. [more]

SOM - The sound of motion: Providing sound feedback to human movements
Funding: VR

The main aim of this project is the development of theories, models and tools for representing human movements by means of sound. This work is part of growing research fields known as data sonification, embodied music cognition and mediation technology. [more]

SONEA - Sonification of elite athletes
Funding: Olympic Performance Centre

The main idea in the SONEA project is that of applying knowledge in the perception of everyday sounds, in sound synthesis techniques, and in interactive sonification, to the auditory representation of the movements of top-level athletes. [more]

Song - Sundberg's Voice Science
Funding: KTH CSC - TMH Sundberg

Kulning - Hard rock - Twang - Belting - Chest/Falsetto - Whisper - High pitch singing - Text intelligibility - MRI [more]

Sound Clothes - Expressive sound-based monitoring of body motion through everyday clothes.
Funding: KTH CSC Small Visionary Projects

We want to use elements and affordances in everyday clothes as control interfaces for eyes-free interaction with music and sound. We do this by developing wearable controls that allow for scalable mapping of expressive body gesture in everyday situations, which can be translated into sounds (and sound parameters). Mapping must... [more]

SPEDIAL - Spoken Dialogue Analytics
Funding: EU


SpeechDat - Databases for the Creation of Voice Driven Teleservices
Funding: EU

A Swedish telephone speech database containing 5000 speakers recorded over the landline telephone network and 1000 speakers recorded over mobile telephone networks. [more]

SPEECON - Speech Driven Interfaces for Consumer Applications
Funding: EU

A total of 550 adult speakers and 50 child speakers were recorded with 4 different microphones in various environments, such as office, public place, home, outdoors and car. [more]

Spontal - Multimodal database of spontaneous speech in dialog
Funding: VR, KFI

Massively multimodal (HD video, hifi sound, and motion capture) database of spontaneous speech in dialog. [more]

SVP Voice - Detailed multiphysics simulation of human voice production with neural control - a feasibility study
Funding: KTH CSC

This project is a feasibility study in which we examine whether it is possible to make a unified-domain numerical simulation of human voice production that covers the mechanical, fluid and acoustic phenomena involved; and also attempts to control the simulation using representations of (simulated) muscle activation. [more]

SYMIR - Musical understanding for music information retrieval
Funding: VR


SynFace -
Funding: EU

SynFace is a technology that allows an animated talking face to be controlled by a speech signal in real time. [more]

The science and technology of music -
Funding: STINT (Swedish Foundation for International Cooperation in Research and Higher Education)


TIG - Timing of intonation and gestures in spoken communication
Funding: RJ (Bank of Sweden Tercentenary Foundation)

The goal of the project is to understand timing relationships between intonation and gesture in spontaneous speech. This will be investigated through semi-automatic extraction of co-speech gestures from a large and varied dataset (audio, video, motion-capture), and analysis of function and synchronization of speech and gestures. [more]

TIVOLI - Sign learning via game-based interaction
Funding: PTS - Post och Telestyrelsen

TIVOLI aims to create a learning application for sign language signs, in the form of a computer game featuring sign recognition via webcam and a signing avatar. The target group is children with communication disorders. [more]

VariQ - Intonational variation in questions in Swedish
Funding: VR

This project investigates questions in dialogue. What is a question, and what makes it into one? [more]

VATS - What turns speech into conversation?
Funding: VR

The project What turns speech into conversation? (Vad gör tal till samtal?) investigates features that are specific to conversations among humans - the very features that turn speech into conversation - such as how the speakers know when to speak and when not to. [more]

VEMUS - Virtual European music school
Funding: EU

VEMUS (Virtual European Music School) aims at developing an open, highly interactive, and networked multilingual music tuition framework for popular instruments such as the trumpet, the flute, and the clarinet. [more]

Ville - The Virtual Language Teacher
Funding: GSLT Graduate School of Language Technology

The goal of this project is to develop and test a new type of application for computer assisted language learning (CALL) and computer assisted pronunciation training (CAPT). [more]

VirtualRobot - Exploring situated interaction with social robots using augmented reality
Funding: SRA/KTH

In this project, we aim to explore the use of Augmented Reality (AR) to investigate the impact of multimodal behaviour (speech, facial expression, full-body motions, conversational formations) and embodiment on turn-taking and joint attention in human-robot interaction. [more]

VoiceMech - Numerical modelling of the non-vibratory biomechanics of human voice production
Funding: VR

In the VoiceMech project, a 3D biomechanical model of the non-acoustic parts of the human voice is in development, mainly to serve as a controlling interface for the aerodynamic/acoustic models that are under construction in the Eunison project. VoiceMech is conducted in collaboration with several others, notably the ArtiSynth project... [more]

VoxLog - VoxLog: portable voice analyzer

A new wearable voice+noise dosimeter has been developed. The project has assessed the validity, usability and commercial potential of the new device. [more]

Waxholm - The Waxholm dialog project

Waxholm was a spoken dialogue system for retrieving information about the ferryboat services in the Stockholm archipelago. [more]

WHEELIE - Motivational training of motor skills using the wheelchair as an expressive musical instrument
Funding: Promobilia Foundation


WikiSpeech -
Funding: PTS - Post och Telestyrelsen

An open source project that will draw on crowdsourced contributions to make Wikipedia more accessible by adding text-to-speech synthesis that will enable users of the online encyclopedia to have portions of the text read out to them. [more]

Workplace noise - The effects of workplace noise on speaking voice
Funding: FAS (Swedish Council for Working Life and Social Research)


WWAAC - World Wide Augmentative and Alternative Communication

The project will make several Internet services accessible for persons with language and/or cognitive impairments, including symbol users and part of the elderly population. A universal coding system will be developed for message encoding, which can be used for eMail, eChat or enhancing web-page information. This will be promoted... [more]

Published by: TMH, Speech, Music and Hearing

Last updated: Monday, 14-Oct-2019 11:12:30 MEST