GSLT Speech Synthesis
Lectures & Hands-on exercises
The objective of the lectures and tutorials is to give an overview of the history of speech synthesis and the current state of the art.
This is a preliminary overview of the lectures and seminars in the course.
Literature:
TTS = "Text-to-speech synthesis" by Taylor, P. (2007)
Klatt = "Review of text-to-speech conversion for English" by Klatt, D. (1987)
Preliminary lecture notes for Seminar days 1-2
Seminar day 1: 8/9
Lecture 1: Introduction and history
Content: Presentation and overview of the course. A short history of the development of early speech synthesizers and different synthesis techniques.
Literature: TTS 1 + Klatt
OVE Demo instructions
Content: Source and filter manipulation.
Lecture 2: Analysis and manipulation of speech
Content: Source and filters, linear prediction analysis, cepstra, spectral envelope
Literature: TTS 12
Lecture 3: Concatenative speech synthesis
Content: Concatenation units, creating a unit database, the PSOLA and MBROLA algorithms
Literature: TTS 14
Hands-on 1: Diphone "synthesis"
Content: Computer exercise in creating and manipulating diphones for a simple sentence
Lecture 4: Unit selection synthesis
Content: Multiple entry databases, fixed or variable unit lengths, the Hunt and Black algorithm, target cost, join cost, concatenation cost, choice of features, search and pruning.
Literature: TTS 16, 17.1
Lecture 5: HMM synthesis
Content: Observation probabilities, delta coefficients, states and transitions, training, speaker adaptation, Swedish HMM synthesis
Literature: TTS 15
Lecture 6: From text
Content: Segmenting and decoding a text. Generating phonemes and prosody from text. Markup of texts for speech synthesis
Literature: TTS (2), 3, 4, 5, 6, 8, 9
Seminar day 2: 21/10
Lecture 7: Vocal tract models
Content: Creating a vocal tract model, corpus-based articulatory synthesis, tubes, area functions, equivalent circuit theory
Literature: TTS 7, 10, 11, 13.4
Hands-on 2: Vocal tract models
Content: Computer exercises using simple vocal tract models to synthesize vowels
Lecture 8: Formant synthesis
Content: Digital resonators, formants and bandwidths, synthesis by rule, data-driven formant synthesis
Literature: TTS 10, 13
Hands-on 3: Speaker characteristics in formant synthesis
Content: Computer exercises with the RULSYS TTS plug-in to Wavesurfer.
Lecture 9: Evaluation
Content: Different types of evaluation, intelligibility, naturalness, listening tests, mean opinion score, preference score.
Literature: TTS 17.2
Lecture 10: Speech synthesis of the future
Content: Speaker adaptation, emotions, concept-to-speech.
Seminar day 3: Closing Seminar: 15/12?
Content: Term paper presentations, home-assignment correction, course evaluation.
Course responsible: Olov Engwall, engwall@kth.se, 790 75 65