COMPARING SELF-SUPERVISED SPEECH REPRESENTATIONS FOR READ AND SPONTANEOUS TTS (in submission)

Siyang Wang, Gustav Eje Henter, Joakim Gustafson and Éva Székely

--------------------------------------------------------------------------

mel-spectrogram	HuBERT	wav2vec2.0 L12	wav2vec2.0 L9

mel-spectrogram	HuBERT	wav2vec2.0 L9

All audio samples are created by the authors.

These samples are for academic research purposes only.

Redistribution or reuse of any material shown on this website or in the paper is prohibited.