Siyang Wang, Gustav Eje Henter, Joakim Gustafson and Éva Székely
mel-spectrogram | HuBERT | wav2vec2.0 L12 | wav2vec2.0 L9
mel-spectrogram | HuBERT | wav2vec2.0 L9
All rights reserved by authors of the paper "COMPARING SELF-SUPERVISED SPEECH REPRESENTATIONS FOR READ AND SPONTANEOUS TTS" (in submission).
All audio samples are created by the authors.
These samples are for academic research purposes only.
Redistribution or reuse of any material shown on this website or in the paper is prohibited.