Data-driven Parametric Speech Synthesis

The ParSyn (Parametric Synthesis) project is focused on two challenges: 1) speech synthesis based on both data-driven and rule-based parametric modeling; 2) synthesis of spontaneous speech, including disfluencies.

In the project we aim to combine corpus-based methods for speech synthesis with knowledge-based models and to explore the strengths of the two approaches. The project includes research on knowledge-based acoustic modeling of speech; development of new methods for speech analysis; creation of a database for Swedish speech synthesis; integration of a complete text-to-speech system based on the approach; and evaluation of this system. The project also includes the development of models for the synthesis of breaks and hesitations.

Group: Speech Communication and Technology

Rolf Carlson
Kjell Gustafson

Duration: 1997 - 2007

Related publications:


Carlson, R., & Granström, B. (2007). Rule-based Speech Synthesis. In Benesty, J., Sondhi, M. M., & Huang, Y. (Eds.), Springer Handbook of Speech Processing (pp. 429-436). Springer Berlin Heidelberg.


Carlson, R., Gustafson, K., & Strangert, E. (2006). Cues for Hesitation in Speech Synthesis. In Proceedings of Interspeech 06. Pittsburgh, USA. [pdf]

Carlson, R., Gustafson, K., & Strangert, E. (2006). Modelling hesitation for synthesis of spontaneous speech. In Proceedings of Speech Prosody 2006. Dresden. [pdf]


Carlson, R., & Granström, B. (2005). Data-driven multimodal synthesis. Speech Communication, 47(1-2), 182-193.


Strangert, E., & Carlson, R. (2004). On the modelling and synthesis of conversational speech. In Bruce, G., & Horne, M. (Eds.), Nordic Prosody. Proceedings of the IXth Conference (pp. 255-264). Lund: Peter Lang: Frankfurt am Main.

Vinet, R. (2004). Enhancing rule-based synthesizer using concatenative synthesis. Master's thesis, KTH, TMH, CTT. [pdf]

Öhlin, D. (2004). Formantextraktion för datadriven formantsyntes. Master's thesis, KTH, TMH, CTT. [pdf]

Öhlin, D., & Carlson, R. (2004). Data-driven formant synthesis. In Proc of the XVIIth Swedish Phonetics Conference, Fonetik 2004 (pp. 160-163). Stockholm University. [pdf]


Carlson, R., Sigvardson, T., & Sjölander, A. (2002). Data-driven formant synthesis. Proceedings of Fonetik, TMH-QPSR, 44(1), 121-124. [pdf]

Sigvardson, T. (2002). Datadrivna metoder för parametersyntes - beskrivning av ett system och experiment med CART-analys. Master's thesis, KTH, TMH, CTT. [pdf]


Sjölander, A. (2001). Datadriven formantsyntes. Master's thesis, KTH, TMH.


Högberg, J. (1997). Data driven formant synthesis. In Kokkinakis, G., Fakotakis, N., & Dermatas, E. (Eds.), Proc of Eurospeech '97, 5th European Conference on Speech Communication and Technology (pp. 565-568). Rhodes, Greece.

Högberg, J. (1997). Prediction of formant frequencies from linear combinations of filterbank and cepstral coefficients. TMH-QPSR, 38(4), 041-049. [pdf]

Published by: TMH, Speech, Music and Hearing

Last updated: 2012-11-09