Annual Report 1999

Anders Askenfelt


Communication and interaction between humans via speech and music is the unifying theme of research and teaching within the Department of Speech, Music and Hearing. The department is engaged in a diverse set of multi-disciplinary activities, commonly classified into speech communication, speech technology, speech coding, music acoustics, and auditory perception, to mention the largest areas.

In speech research, projects range from the development of theoretical models of speech production through phonetic analyses to practical applications of speech technology. Multimodal dialogue systems, aiming at fast and efficient communication between a computer and a human via speech and vision, are rapidly gaining a central role in speech technology. In these systems, many of the classical topics in speech research such as synthesis, continuous speech recognition, and speaker characterisation are brought together and tested in practice. A successful implementation has been the animated talking agent August, from which considerable experience of man-machine dialogue has been gained.

Speech signal processing presents a palette of projects of vital importance to modern consumer products like mobile telephony, home audio and Internet services. Coding and compression of speech and audio are the central topics. Coding strategies based on models of our hearing (auditory modelling) have potential both for low bit-rate coding and for coding algorithms with improved sound quality. Auditory modelling approaches may also work efficiently for coding voice and audio for transmission over the Internet.

In music acoustics, the physics of the traditional musical instruments, such as strings and winds, is a classical topic which is still far from completely explored. The voice holds a unique position both as a musical instrument in singing and as the acoustical carrier in speech. A current topic of common interest to speech and voice research is articulatory modelling, aiming at a more natural synthesis. In this approach the positions and movements of the articulatory organs are modelled, rather than parameters in the acoustic domain such as spectral characteristics. In studies of music performance, emotions are attracting more attention. Emotional colouring can raise the "artistic level" of a computer performance by enhancing the mood of the piece.

Research and development within hearing technology is centred around improved electronic aids for hearing-impaired persons. A multimodal sensory aid in the form of a talking, animated agent, which provides lip-reading support over a telephone connection as a complement to voice information, is an example of a current project shared by the hearing and speech technology groups. Digital hearing aids, which adapt automatically to varying listening conditions, can improve listening comfort in everyday life, for instance by suppressing background noise. Parallel achievements in several basic research areas are necessary for the development of advanced hearing aids, including theoretical models of auditory perception and of the perception of sounds, in particular speech, by the hearing impaired.

As before, we are heavily dependent on external funding. The research at the department continues to attract considerable interest from the Swedish telecom industry. The major partners are Telia and Ericsson, each of which supports an academic chair. The Centre of Excellence in Speech Technology (CTT), which is associated with the department, is an important organisational structure in our research contacts with industry. The centre, which includes 14 partners from the Swedish telecom and IT industry, operates with support from Nutek (Swedish National Board for Industrial and Technical Development) and KTH.

Other sources of research funding include the major Swedish research councils, the international telecom industry, the European Commission, and private foundations, in particular the Bank of Sweden Tercentenary Foundation.

The lively activity at the department is reflected in a high output of reports. During the past year, 12 articles were published or accepted in peer-reviewed international scientific journals. In addition, our quarterly research report (TMH-QPSR), distributed world-wide in an edition of 925 copies, provides the latest news concerning research at the department. More than 50 presentations were given at international conferences.

The 7th European Summer School on Language and Speech Communication (MiLaSS) was organised by the department during two weeks in July. The event attracted doctoral students from countries all over the world.

The size of the department seems now to have reached a more stable state. At the end of 1999, a total of 66 researchers and research students and eight administrators were employed at the department. The turnover during 1999 was 36 MSEK including CTT.

Over the years since the founding of the department, research funding has been a continuous concern that engages professors and project leaders heavily. It is true that the broad range of topics covered at the department requires support from a diverse set of research foundations, industry partners and other sources. However, more stable, long-term funding for some central research areas has been, and still is, a highly desired goal. In the near future, the Swedish research foundations will be re-organised. The future conditions for support to speech-related research will be of vital importance for the department.

In October the founder of the department, Professor Gunnar Fant, celebrated his 80th birthday. A full delegation from the department invaded his home and presented him with a brand new portrait and copies of films showing similar ceremonies on his 50th and 60th birthdays. Inspection of the new portrait and a 30-year-old one revealed only minor differences. A further analysis based on comparisons between highlights from the films and his present-day appearance gave a similar result. A few days later Gunnar was back at work at the department as usual.

At a large department, colleagues naturally leave every year and new ones arrive. Sometimes, however, the fragile conditions for health and human life are brutally exposed, and the farewell is forever.

In March 1999, our colleague and dear friend Lennart Nord passed away at the age of 52. His premature departure from this life touched us deeply. Lennart Nord had been with the department since 1970. His research interests were in acoustic phonetics, in particular vowel reduction, and in pathological speech. His thorough insight into the production of speech made him an appreciated teacher in phonetics. For many years he was involved in the education of speech and language pathologists at the Karolinska Institute, giving the students a firm basis in the acoustic secrets of the speech sounds. We remember him as a warm and kind person with deep interests in the humanities and remarkable musical skills on a diverse set of instruments. We miss him a lot.

