PhD course in Speech Synthesis
Autumn 2008
The course evaluation for 2008 is now available (in Swedish).
Introduction
Speech synthesis is a PhD course at the Department of Speech, Music and Hearing (TMH), CSC, worth 7.5 ECTS credits.
The course is intended for students with a basic knowledge of speech technology (the equivalent of a Graduate School of Language Technology level 1 course in Speech technology; see Prerequisites below) who want to gain an applied understanding of different techniques for speech synthesis.
The focus is on signal generation.
Letter-to-phoneme conversion, parsing and lexical prosody will only be covered superficially.
1. Course Goals
After the course, the students should be able to
* critically assess the possibilities and limitations of different synthesis techniques
* calculate and evaluate concatenation costs in unit selection
* apply knowledge of articulatory modelling to speech synthesis
* plan and carry out small-scale evaluations of speech synthesis
* present speech synthesis issues in writing and orally at an adequate scientific level
* judge the value of scientific publications on speech synthesis
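As an illustration of the second goal, the following is a minimal sketch of how a concatenation (join) cost might be calculated and used in a unit selection search. It is not taken from the course material: the feature names (`end_spec`, `start_spec`, `end_f0`, `start_f0`), the weights `w_spec` and `w_f0`, and the dictionary layout of a unit are all hypothetical choices made for this example. Real systems typically use richer features and tuned weights.

```python
import math

def join_cost(left, right, w_spec=1.0, w_f0=0.01):
    """Cost of concatenating `left` with `right` (weights are assumptions)."""
    # Spectral mismatch: Euclidean distance between boundary feature vectors.
    spec = math.dist(left["end_spec"], right["start_spec"])
    # Pitch mismatch: absolute F0 difference (Hz) at the join.
    f0 = abs(left["end_f0"] - right["start_f0"])
    return w_spec * spec + w_f0 * f0

def select_units(candidates, target_costs):
    """Viterbi search for the candidate sequence with the lowest total cost.

    candidates[i] is the list of database units available for position i;
    target_costs[i][j] is the target cost of candidates[i][j].
    Returns (list of chosen candidate indices, total cost).
    """
    # best[i][j] = (cheapest cost of a path ending in candidate j, back-pointer)
    best = [[(tc, None) for tc in target_costs[0]]]
    for i in range(1, len(candidates)):
        row = []
        for j, unit in enumerate(candidates[i]):
            cost, prev = min(
                (best[i - 1][k][0] + join_cost(p, unit), k)
                for k, p in enumerate(candidates[i - 1])
            )
            row.append((cost + target_costs[i][j], prev))
        best.append(row)
    # Backtrack from the cheapest final candidate.
    j = min(range(len(best[-1])), key=lambda k: best[-1][k][0])
    total = best[-1][j][0]
    path = []
    for i in range(len(candidates) - 1, -1, -1):
        path.append(j)
        j = best[i][j][1]
    return path[::-1], total
```

The search keeps, for every candidate at each position, only the cheapest path that ends in it, so the total work grows with the number of candidate pairs between neighbouring positions rather than with the number of complete sequences.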
2. Structure and contents
The core of this course is individual work consisting of reading the course material, completing practical assignments and writing and presenting a term paper on an individually selected theme in speech synthesis.
The scheduled meetings consist of introductory seminars (2 days) and a closing seminar (1 day).
The timetable is given on the Schedule page.
The following topics are treated in the course:
* Introduction
* Phonetic synthesis by rule
* Generating and improving rules for targets, transitions, coarticulation, speaking style.
* Concatenative synthesis
* Unit size: diphone, triphone, unit selection synthesis
* Concatenation: costs, handling discontinuities, signal processing
* HMM-based synthesis
* Articulatory modelling for synthesis
* Speaker characteristics
* Expressive speech synthesis
* Letter-to-sound conversion, prosody generation
* Evaluation techniques
The tentative content of each seminar day is given on the Lectures page.
3. Requirements
In order to pass the course, students must fulfil the above course goals by:
* Completing the practical exercises
* Preparing and presenting a term paper, and reviewing two others
* Demonstrating mastery of the subject through active participation in the closing seminar discussions
Detailed requirements for each part of the course are given in the instructions.
4. Prerequisites
The course is aimed at students with a basic knowledge of speech technology (the equivalent of a GSLT level 1 course in Speech technology).
This means having a general overview of speech technology and some of its underlying theories and models, such as acoustic phonetics, text-to-speech synthesis and dialog systems.
Concepts and theories summarized in the GSLT Speech technology course should be familiar.
Please contact Olov Engwall (responsible for the course) if you are unsure whether your level of knowledge meets the requirements.
Course responsible: Olov Engwall, engwall@kth.se, 790 75 65