



PhD course in Speech Synthesis

Autumn 2008


The course evaluation for 2008 is now available (in Swedish).

Introduction

Speech synthesis is a PhD course at the Department of Speech, Music and Hearing (TMH), CSC, worth 7.5 ECTS credits.

The course is intended for students with a basic knowledge of speech technology (the equivalent of a Graduate School of Language Technology (GSLT) level 1 course in Speech technology; see Prerequisites below) who want to gain an applied understanding of different speech synthesis techniques. The focus is on signal generation; letter-to-phoneme conversion, parsing and lexical prosody are covered only superficially.

1. Course Goals

After the course, students should be able to:
    * critically assess the possibilities and limitations of different synthesis techniques
    * calculate and evaluate concatenation costs in unit selection
    * apply knowledge of articulatory modelling to speech synthesis
    * plan and carry out small-scale evaluations of speech synthesis
    * present speech synthesis issues in writing and orally at an adequate scientific level
    * judge the value of scientific publications on speech synthesis

2. Structure and contents

The core of this course is individual work: reading the course material, completing practical assignments, and writing and presenting a term paper on an individually selected theme in speech synthesis.

The scheduled meetings consist of introductory seminars (2 days) and a closing seminar (1 day). The timetable is given on the Schedule page.

The following topics are covered in the course:

    * Introduction
    * Phonetic synthesis by rule
      * Generating and improving rules for targets, transitions, coarticulation and speaking style
    * Concatenative synthesis
      * Unit size: diphone, triphone, unit selection synthesis
      * Concatenation: costs, handling discontinuities, signal processing
    * HMM-based synthesis
    * Articulatory modelling for synthesis
    * Speaker characteristics
    * Expressive speech synthesis
    * Letter-to-sound conversion, prosody generation
    * Evaluation techniques

The tentative content of each seminar day is given on the Lectures page.

3. Requirements

To pass the course, students must fulfil the course goals above by:

    * Completing the practical exercises
    * Preparing and presenting a term paper, and reviewing two other term papers
    * Demonstrating a mastery of the subject by active participation in the closing seminar discussions

Detailed requirements for each part of the course are given in the instructions.

4. Prerequisites

The course is aimed at students with a basic knowledge of speech technology (the equivalent of a GSLT level 1 course in Speech technology).

This means having a general overview of speech technology and some of its underlying theories and models, such as acoustic phonetics, text-to-speech synthesis and dialog systems. The concepts and theories summarized in the GSLT Speech technology course should be familiar.

Please contact Olov Engwall (course responsible) if you are unsure whether your level of knowledge meets the requirement.


Course responsible: Olov Engwall, engwall@kth.se, 790 75 65







Published by: TMH, Speech, Music and Hearing
Webmaster, webmaster@speech.kth.se

Last updated: Tuesday, 24-Jun-2008 14:28:55 MEST