On the Use of Self-Supervised Speech Representations in Spontaneous Speech Synthesis

Speech Synthesis Workshop 2023 (SSW23)

Siyang Wang, Gustav Eje Henter, Joakim Gustafson, Éva Székely

Audio samples from listening tests are presented below.

--------------------------------------------------------------------------

Spontaneous corpus 1: TSGD

	data2vec-base	data2vec-base-asr	wav2vec2.0-base	wav2vec2.0-base-asr	whisper-small	wavlm-base-plus
layer 6
layer 9
layer 12

--------------------------------------------------------------------------

Spontaneous corpus 2: TCC

	data2vec-base	data2vec-base-asr	wav2vec2.0-base	wav2vec2.0-base-asr	whisper-small	wavlm-base-plus
layer 6
layer 9
layer 12

--------------------------------------------------------------------------

All rights reserved by the authors of the paper "On the Use of Self-Supervised Speech Representations in Spontaneous Speech Synthesis" (SSW23).

All audio samples were created by the authors.

These samples are for academic research purposes only.

Redistribution or reuse of any material shown on this website or in the paper is prohibited.