Principles of speaker characterisation
Main responsibility for the research area:
TeknD Mats Blomberg
Speaker characterization is a domain of knowledge of critical importance to a number of speech technology applications.
Human listeners are very skilled at recognizing speakers, and they adapt rapidly to the
individual or dialectal pronunciation of different speakers. As yet, no corresponding technical methodology
exists for characterizing speakers from brief utterances. Instead, a statistical description
that models a whole group of speakers is often used. Such a model averages out individual differences
between speakers, which decreases the accuracy of decisions.
The amount of speech data required for speaker adaptation needs to be reduced. For this purpose, techniques for
fast adaptation based on speaker-characteristic features will be investigated. These adaptation techniques can also
be used to adjust synthetic speech to individual natural voices. Another way of using speaker characterization
in speech recognition is to exploit speaker coherence, that is, the fact that the speaker remains the same throughout
the utterance. Standard HMM systems do not exploit this.
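
The idea of speaker coherence can be illustrated with a minimal sketch. It is not the method of this
research programme; it only assumes that each frame of an utterance has been scored against a few
hypothetical speaker-group models (the names speaker_group_a and speaker_group_b and all log-likelihood
values below are invented for illustration). A per-frame decision may switch model from frame to frame,
whereas a coherent decision selects one model for the whole utterance.

```python
from typing import Dict, List


def per_frame_choice(scores: Dict[str, List[float]]) -> List[str]:
    """Pick the best-scoring model independently for every frame."""
    n_frames = len(next(iter(scores.values())))
    return [max(scores, key=lambda name: scores[name][t]) for t in range(n_frames)]


def coherent_choice(scores: Dict[str, List[float]]) -> str:
    """Pick the single model with the highest total log-likelihood
    over the utterance, i.e. assume the speaker does not change."""
    return max(scores, key=lambda name: sum(scores[name]))


# Hypothetical per-frame log-likelihoods (illustrative values only).
scores = {
    "speaker_group_a": [-4.0, -6.5, -4.2, -4.1],
    "speaker_group_b": [-5.0, -5.0, -5.1, -5.0],
}

frame_labels = per_frame_choice(scores)    # may change from frame to frame
utterance_label = coherent_choice(scores)  # one label for the whole utterance
print(frame_labels, utterance_label)
```

In this toy example the second frame, taken alone, favours the other model, while the decision over the
whole utterance does not. This is the kind of inconsistency that a coherence constraint removes.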
During the past few years, speaker verification has attracted substantially more attention.
One reason is the predicted increase in speech recognition-based personal services over the telephone, where speaker
verification can become an important component in security. The reading aloud of, e.g., electronic mail and bank
account information over the telephone also requires security, to prevent the information from being accessed by or
delivered to an intruder. In such cases, the speaker's voice is already available and can also be used for identity
verification.
Important Research Topics for Stage 3
- improve methods for speaker verification over the telephone
- develop methods for fast speaker adaptation and utilisation of speaker coherence