Preben Wik - GSLT courses


Speech Technology - Fall 2004 - Preben Wik, Anna Hjalmarsson, Gunilla Svanfeldt

Speech recognition

To learn more about speech recognition, we installed the HTK toolkit and followed its tutorial as well as we could. Our initial goal was to build a simple recognizer trained on a single user. All three of us worked through the tutorial individually with the minimal vocabulary "one", "two" and "three". Two of us modeled words as single units, while one of us used models on the phoneme level. Both types of model seemed to work, but only after a lot of time and considerable help from colleagues with previous experience of HTK. We found that we could have saved time by writing many of the files by hand instead of generating them with HTK scripts. We would probably also have grasped the concepts behind HTK faster had we not been forced to follow the numerous and confusing steps of the HTK tutorial.

Before installing HTK, we already had plans to build a calculator as an application for both speech recognition and speech synthesis. With this in mind, one of us expanded the vocabulary to include the numbers zero to nine as well as operators like "plus", "minus" and "equals". The idea was to record the words in the vocabulary with a specific recording application [reference] that allows manipulation of the recordings. The recordings can then be used both to train the acoustic word models for the recognizer and for concatenation synthesis, so that users can make their own personalized calculator.
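
The concatenation synthesis idea can be sketched roughly as follows. This is only an illustration, not the code of the actual application: it assumes one WAV recording per vocabulary word, named after the word (for example one.wav) and all stored in the same audio format. The function names and the file layout are hypothetical.

```python
import os
import wave

# Spoken names of the digits 0-9, matching the calculator vocabulary.
DIGIT_NAMES = ["zero", "one", "two", "three", "four",
               "five", "six", "seven", "eight", "nine"]


def number_to_words(n):
    """Spell an integer digit by digit, e.g. -12 -> ['minus', 'one', 'two']."""
    words = ["minus"] if n < 0 else []
    words += [DIGIT_NAMES[int(d)] for d in str(abs(n))]
    return words


def concatenate_words(words, rec_dir, out_path):
    """Join one recording per word into a single WAV file.

    Assumes rec_dir contains <word>.wav for every word (hypothetical layout)
    and that all recordings share channels, sample width and sample rate.
    """
    if not words:
        raise ValueError("nothing to synthesize")
    fmt = None
    chunks = []
    for word in words:
        with wave.open(os.path.join(rec_dir, word + ".wav"), "rb") as rec:
            this_fmt = (rec.getnchannels(), rec.getsampwidth(),
                        rec.getframerate())
            if fmt is None:
                fmt = this_fmt
            elif this_fmt != fmt:
                raise ValueError("recording format mismatch: " + word)
            chunks.append(rec.readframes(rec.getnframes()))
    with wave.open(out_path, "wb") as out:
        out.setnchannels(fmt[0])
        out.setsampwidth(fmt[1])
        out.setframerate(fmt[2])
        for chunk in chunks:
            out.writeframes(chunk)
```

With such a layout, a result could be spoken by calling concatenate_words(number_to_words(result), "rec", "answer.wav") and playing the resulting file. Simply joining whole-word recordings gives no smoothing at the word boundaries, which is acceptable for a vocabulary this small.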

It was interesting to get some hands-on experience, but unfortunately we did not have hands-on labs with access to a teacher, which would probably have made things a lot smoother.

A Speech Recognition Calculator...

Only recognizes the numbers 0-9, + - * and =
and it is only trained on one person's voice...
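
How a recognized word sequence could be turned into a result is sketched below. This is a guess at the general approach, not the code inside the program: it assumes that consecutive digits form a multi-digit number, that operators are applied strictly left to right like on a simple pocket calculator, and that "times" is the spoken form of "*" (the word is our assumption here).

```python
# Vocabulary of the recognizer: digit words and operator words.
DIGITS = {"zero": 0, "one": 1, "two": 2, "three": 3, "four": 4,
          "five": 5, "six": 6, "seven": 7, "eight": 8, "nine": 9}
OPERATORS = {
    "plus": lambda a, b: a + b,
    "minus": lambda a, b: a - b,
    "times": lambda a, b: a * b,  # assumed spoken form of "*"
}


def evaluate(words):
    """Evaluate a recognized utterance such as two plus three equals."""
    total = None     # running result so far
    current = None   # number currently being spoken, digit by digit
    pending = None   # operator waiting for its right-hand operand
    for word in words:
        if word in DIGITS:
            current = (current or 0) * 10 + DIGITS[word]
        elif word in OPERATORS or word == "equals":
            if current is None:
                raise ValueError("expected a digit before " + repr(word))
            # Apply the waiting operator, or start the running result.
            total = current if pending is None else pending(total, current)
            current = None
            if word == "equals":
                return total
            pending = OPERATORS[word]
        else:
            raise ValueError("word not in vocabulary: " + repr(word))
    raise ValueError("utterance did not end with 'equals'")


print(evaluate(["two", "plus", "three", "equals"]))  # prints 5
```

Because evaluation is left to right, "two plus three times four equals" gives 20 rather than 14 in this sketch.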

miniCalc

An exe file for Windows can be downloaded here
miniCalculator.exe

A Talking Calculator...

A Talking Calculator

An exe file for Windows can be downloaded here
mini.exe
A rec folder with recordings for the talking calculator rec.zip
Uses the same recordings as the synthesis


