Resources for Automatic Speech Recognition

KTH Royal Institute of Technology,
School of Computer Science and Communication,
Dept. of Speech, Music and Hearing

This page collects legacy resources for Automatic Speech Recognition (ASR) that the group has developed and made freely available to other researchers.

PI: Giampiero Salvi

Main contributors: Giampiero Salvi, Niklas Vanhainen

The WaveSurfer Automatic Speech Recognition Plugin

The plugin is described in the following paper:

Salvi, G., & Vanhainen, N. (2014). The WaveSurfer Automatic Speech Recognition Plugin. In Proceedings of LREC. Reykjavik, Iceland. [pdf]

You can find the source code and installation instructions at https://github.com/giampierosalvi/ASRWavesurferPlugin
The latest release is at https://github.com/giampierosalvi/ASRWavesurferPlugin/releases
Language pack for Swedish
Language pack for English

HTK based WaveSurfer Automatic Speech Recognition Plugin

This is an earlier version of the plugin above; it differs in that it is based on HVite (HTK). It is still used for teaching in the DT2112 Speech Technology course at KTH.

hvite.plug
To install, run WaveSurfer once to create the configuration directory, then copy the file into ~/.wavesurfer/1.8/plugins/
No language or acoustic models are included, so the plugin is of little use outside the scope of the course.
Note: you will need a working version of HTK installed on your system for this to work.
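
The installation step above can be sketched in Python as follows. This is only an illustration under stated assumptions: the helper name install_plugin is hypothetical, the script assumes hvite.plug has already been downloaded, and it assumes that creating the plugins/ subdirectory when it is missing is harmless. It does not check that HTK is installed.

```python
import shutil
from pathlib import Path

# Configuration directory named in the instructions above (WaveSurfer 1.8).
DEFAULT_CONFIG_DIR = Path.home() / ".wavesurfer" / "1.8"


def install_plugin(plugin_file, config_dir=DEFAULT_CONFIG_DIR):
    """Copy a plugin file into <config_dir>/plugins/ and return the new path.

    Hypothetical helper, not part of WaveSurfer or of the plugin itself.
    """
    plugin_file = Path(plugin_file)
    config_dir = Path(config_dir)

    if not plugin_file.is_file():
        raise FileNotFoundError(f"plugin file not found: {plugin_file}")

    # The configuration directory only exists after WaveSurfer has been
    # started at least once, so its absence is reported rather than fixed.
    if not config_dir.is_dir():
        raise FileNotFoundError(
            f"{config_dir} does not exist; run WaveSurfer once first"
        )

    # Assumption: creating plugins/ here is safe if it is not yet present.
    plugins_dir = config_dir / "plugins"
    plugins_dir.mkdir(exist_ok=True)

    return Path(shutil.copy(plugin_file, plugins_dir))
```

Calling install_plugin("hvite.plug") from the directory that holds the downloaded file would place it in ~/.wavesurfer/1.8/plugins/. Passing a different config_dir makes it possible to try the function on a scratch directory first.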

ASR models for Swedish (Version 0.1)

This is a collection of acoustic and language models that were used in the paper:

Vanhainen, N., & Salvi, G. (2014). Free Acoustic and Language Models for Large Vocabulary Continuous Speech Recognition in Swedish. In Proceedings of LREC. Reykjavik, Iceland. [pdf]

package: asr-models-swedish-0.1.tgz (2.1GiB, MD5: a9ddfce26772ee5590f3650a1b2341ae)
language: Swedish
format: HTK, Julius
sampling rates: 8/16kHz
features: MFCC/PLP
acknowledgements: this research has been supported by CLT (Gothenburg Center for Language Technology)
Test set definitions for the LREC 2014 experiments: testlistsLREC2014.tgz (109KiB, MD5: 2fa941ef10a4d1d36f3fb015ec2110b7)
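
Since the models archive is large, it may be worth checking a download against the MD5 sums listed above before unpacking it. The following is a minimal sketch using Python's standard hashlib module; the function names md5_of_file and matches_checksum are illustrative, and the script assumes the archives were saved under their original file names in the current directory.

```python
import hashlib
from pathlib import Path

# File names and MD5 sums as listed on this page.
EXPECTED_MD5 = {
    "asr-models-swedish-0.1.tgz": "a9ddfce26772ee5590f3650a1b2341ae",
    "testlistsLREC2014.tgz": "2fa941ef10a4d1d36f3fb015ec2110b7",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks.

    Chunked reading keeps memory use low for the 2.1GiB archive.
    """
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected_md5):
    """Compare a file's MD5 digest with an expected hex string."""
    return md5_of_file(path) == expected_md5.strip().lower()


if __name__ == "__main__":
    # Assumption: the archives sit in the current working directory.
    for name, expected in EXPECTED_MD5.items():
        archive = Path(name)
        if not archive.exists():
            print(f"{name}: not found, skipping")
        elif matches_checksum(archive, expected):
            print(f"{name}: OK")
        else:
            print(f"{name}: checksum mismatch, consider downloading again")
```

MD5 is adequate here for detecting a corrupted or truncated download; it is not a safeguard against deliberate tampering.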

This work is licensed under a Creative Commons Attribution 3.0 Unported License.
