Course in Speech and Speaker Recognition

Spring Semester 2007

Purpose

The purpose of this 5 p doctoral course is to give students with basic knowledge of speech technology a deeper understanding of techniques for speech and speaker recognition.

The course consists of lectures, practical assignments, exercises and the writing of a term paper on an individually selected topic.

The following topics are treated in the course:

Probability Theory
Pattern Recognition Techniques
Signal Processing and Feature Extraction
Hidden Markov Models: training and decoding (understanding the equations rather than deriving them)
Acoustic Modeling
Adaptation to Speaker and Environment
Language Models
Basic and Large-Vocabulary Search Strategies
Speaker Recognition

Prerequisites

The course is aimed at students with a basic knowledge of speech technology (the equivalent of a GSLT level 1 course in speech technology). Basic programming skills are useful, as is knowledge of basic statistics and probability theory.

Schedule

The lecture dates are preliminary and may be changed slightly according to the participants' preferences. Fridays are tentatively selected. Please inform Mats if you think other lecture days would fit better. A third lecture day can be added to allow a more detailed presentation of the topics; for example, the second occasion might be extended to two days. This decision will be taken in consultation with the students during the first lecture day.

Lectures

Date	Time	Room	Contents
Thursday, March 29	10-12, 13-15, 15-18 (program)	Fantum, TMH; office rooms	Introduction; Probability, Statistics and Information Theory; Pattern Recognition; Speech Signal Representations; Hidden Markov Models; HTK tutorial, HTK lab
Friday, March 30	9-12, 13-15, 15-16 (program)	Fantum, TMH; seminar room/office rooms	HMM Training and Adaptation; Acoustic Modeling; Environmental Robustness; Computational exercises
Friday, May 11	10-12, 13-15, 15-17 (preliminary program)	Fantum, TMH	Speaker Recognition; Language Modeling; Basic and Large Vocabulary Search; Finite State Transducers; Short Demonstration of Dictation
Friday, June 8	10-11, 11-12, 13-15, 15-16 (program)	Fantum, TMH	Presentation of solutions to exercises; Presentations of term papers; Discussion

The final lecture notes will be posted after each lecture.

Deadlines

Date	Content
April 18	Select topic for term paper
May 2	Select two papers to review
May 7	Mail exercise solutions to teacher
May 20	Mail draft paper to reviewers
May 25	Reviewers return comments to author
June 1	Mail final paper to teacher and the reviewers

Reading material

The main course book is Huang, Acero and Hon (2001): Spoken Language Processing (Prentice Hall, ISBN 0-13-022616-5).
The course will mainly cover chapters 3, 4, 5 (partly), 6, 8, 9, 10, 11, 12, and 13 (if time permits). Since the book does not cover speaker recognition, the literature for that topic will consist of selected papers.
A selection of papers will be used as additional reading material for topics not covered in the book.

Requirements

In order to pass the course the students must:

Complete the practical assignment and the exercises
Prepare and present a term paper
Participate actively in the discussions in the closing seminar

Practical assignments and exercises

The practical assignment uses the recognition software package HTK and consists of setting up a simple recognition task and performing training and evaluation.
Exercises on speech recognition problems will be presented during the first lecture. They will also be available for download under Exercises. Solutions will be presented during the closing seminar.

Term paper

During the course, each student shall prepare a term paper and present it at the closing seminar. The paper shall be reviewed by two fellow students. Choose a topic after discussion with the teachers. The topic can be an idea of your own, related to your own work, or selected from the list below.

Topic suggestions:
Perform recognition experiments with HTK and report results
Limitations in standard HMM and alternative approaches
Pronunciation variation and its importance for speech recognition
Language models for speech recognition
Search methods
Techniques for robust recognition of speech
Confidence measures in speech recognition
The role of prosody for speech recognition
Speaker recognition
More topics can be added during the course

Chosen term paper topics and assigned reviewers

Author	Preliminary Title	Reviewer	Reviewer
Ansis Berzins
Maria Eskevich	Pronunciation variation and its importance for speech recognition	Lisa	Harald
Vera Evdokimova	Automatic recognition of emotions and physical state of the speaker	Maria	Harald
Lisa Gustavsson	Creating an automatic model of speech imitation	Daniil	Vera
Harald Hammarström	Machine Learning Experiments on Speech-to-Phoneme Classification using Cepstrum Coefficients	Jonas	Andrejs
Daniil Kocharov	The use of articulatory features for speech recognition	Lisa	Anton
Jonas Lindh	Automatic Aligning of Swedish in Praat using HTK HVite Function	Valentin	Vera
Anton Ragni	Subword Language Modelling Using Morphological Units Induced from Lexicon Automata (Oral presentation on May 11)	Daniil	Valentin
Valentin Smirnov	Phonetic Modelling in ASR (Russian speech) - the impact on performance	Anton	Jonas
Andrejs Vasiljevs	First experiments on Latvian ASR with HTK toolkit	Maria	?

Closing seminar

The closing seminar includes:

Presentation of solutions to the exercises
Presentation and discussion of term papers

Teachers

Mats Blomberg matsb@speech.kth.se http://www.speech.kth.se/~matsb
Kjell Elenius kjell@speech.kth.se http://www.speech.kth.se/~kjell
Dept. of Speech, Music and Hearing, School of Computer Science and Communication, KTH (Royal Institute of Technology)
Lindstedtsvägen 24
SE-100 44 Stockholm, Sweden

How to get to TMH and some travel information
http://www.speech.kth.se/info/location.html

Accommodation
Hotel Arcadia is close to KTH, a 10-minute walk from TMH, and offers a lower price for KTH guests. The current single room price is 773 SEK. Address: Körsbärsvägen 1, http://www.elite.se/eng/hotell/stockholm/arcadia/
A few other hotels in the vicinity of KTH are:
Hotel Oden, Karlbergsv. 24, www.hoteloden.se
Hotel Brunnen, Surbrunnsg. 38, www.hotelbrunnen.se

Some low-cost hotel and hostel alternatives are:
Hostel Bed and Breakfast, Rehnsg. 21, www.hostelbedandbreakfast.com
Hostel Fridhemsplan Vandrarhem STF, S:t Eriksg. 20, www.fridhemsplan.se
Vanadis Hotell o. Bad, www.vanadishotel.com