On the Use of Self-Supervised Speech Representations in Spontaneous Speech Synthesis

Speech Synthesis Workshop 2023 (SSW23)

Siyang Wang, Gustav Eje Henter, Joakim Gustafson, Éva Székely

Audio samples from listening tests are presented below.


--------------------------------------------------------------------------

Spontaneous corpus 1: TSGD


data2vec-base

data2vec-base-asr

wav2vec2.0-base

wav2vec2.0-base-asr

whisper-small

wavlm-base-plus

layer 6

layer 9

layer 12

--------------------------------------------------------------------------

Spontaneous corpus 2: TCC


data2vec-base

data2vec-base-asr

wav2vec2.0-base

wav2vec2.0-base-asr

whisper-small

wavlm-base-plus

layer 6

layer 9

layer 12

--------------------------------------------------------------------------

All rights reserved by the authors of the paper "On the Use of Self-Supervised Speech Representations in Spontaneous Speech Synthesis" (SSW23).

All audio samples were created by the authors.

These samples are for academic research purposes only.

Redistribution or reuse of any material shown on this website or in the paper is prohibited.