GSLT Speech Synthesis

Lectures & Hands-on exercises

The objective of the lectures and tutorials is to give an overview of the history and the state of the art of speech synthesis.

This is a preliminary overview of the lectures and seminars in the course.

Literature:
TTS = "Text-to-speech synthesis" by Taylor, P. (2007)
Klatt = "Review of text-to-speech conversion for English" by Klatt, D. (1987)

Preliminary lecture notes for Seminar days 1-2

Seminar day 1: 8/9

Lecture 1: Introduction and history
Content: Presentation and overview of the course. A short history of the development of early speech synthesizers and different synthesis techniques.
Literature: TTS 1 + Klatt

OVE Demo instructions
Content: Source and filter manipulation.

Lecture 2: Analysis and manipulation of speech
Content: Source and filters, linear prediction analysis, cepstra, spectral envelope
Literature: TTS 12

Lecture 3: Concatenative speech synthesis
Content: Concatenation units, creating a unit database, the PSOLA and MBROLA algorithms
Literature: TTS 14

Hands-on 1: Diphone "synthesis"
Content: Computer exercise in creating and manipulating diphones for a simple sentence

Lecture 4: Unit selection synthesis
Content: Multiple entry databases, fixed or variable unit lengths, the Hunt and Black algorithm, target cost, join cost, concatenation cost, choice of features, search and pruning.
Literature: TTS 16, 17.1

Lecture 5: HMM synthesis
Content: Observation probabilities, delta coefficients, states and transitions, training, speaker adaptation, Swedish HMM synthesis
Literature: TTS 15

Lecture 6: From text
Content: Segmenting and decoding a text. Generating phonemes and prosody from text. Markup of texts for speech synthesis
Literature: TTS (2), 3, 4, 5, 6, 8, 9

Seminar day 2: 21/10

Lecture 7: Vocal tract models
Content: Creating a vocal tract model, corpus-based articulatory synthesis, tubes, area functions, equivalent circuit theory
Literature: TTS 7, 10, 11, 13.4

Hands-on 2: Vocal tract models
Content: Computer exercises using simple vocal tract models to synthesize vowels

Lecture 8: Formant synthesis
Content: Digital resonators, formants and bandwidths, synthesis by rule, data-driven formant synthesis
Literature: TTS 10, 13

Hands-on 3: Speaker characteristics in formant synthesis
Content: Computer exercises with the RULSYS TTS plug-in to Wavesurfer.

Lecture 9: Evaluation
Content: Different types of evaluation, intelligibility, naturalness, listening tests, mean opinion score, preference score
Literature: TTS 17.2

Lecture 10: Speech synthesis of the future
Content: Speaker adaptation, emotions, concept-to-speech

Seminar day 3: Closing Seminar: 15/12?

Content: Term paper presentations, home-assignment correction, course evaluation.


Course responsible: Olov Engwall, engwall@kth.se, 790 75 65







Published by: TMH, Speech, Music and Hearing
Webmaster, webmaster@speech.kth.se

Last updated: Tuesday, 24-Jun-2008 14:05:19 MEST