COST250 Speaker Recognition Reference System

6. The cost250 Tcl package

This chapter describes the commands provided by the cost250 Tcl package and how to use them. The command package require cost250 gives you access to all commands in the package.


6.1 Using the package

For an example of how to use the commands in this package, see the file work/runtest.tcl. It contains a program that runs a complete experiment with the Reference System. The commands contained in the cost250 package are printed in red, bold characters in the example; the remainder of the example program is Tcl code and comments. The program starts by creating a database object and an engine object and then prints the properties of both. It then executes the operations defined in three separate experiment definition files. Those files are expected to define 1) training of a non-client model, 2) enrollment of clients, and 3) verification attempts. Finally, the program deletes the created database and engine objects and terminates.

The example program assumes that the experiment definition files are organized according to the following convention. A set of three experiment definition files is identified by a tag and is expected to be stored in the directory ../experiment/tag. The three files should have names

The tag has the format database/experiment. For instance, the calibration test on Polycost is identified by the tag polycost/test, and the first baseline experiment has the tag polycost/be1. The example program takes the experiment tag as its only command line argument, so it can be used with any new experiment as long as the experiment definition files follow this convention.
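The mapping from an experiment tag to its directory can be sketched in plain Tcl. This is a minimal illustration, not code taken from work/runtest.tcl; the procedure name experimentDir is hypothetical, and the sketch only assumes the database/experiment tag format and the ../experiment/tag directory convention described above.

```tcl
# Hypothetical helper (not part of the cost250 package): map an experiment
# tag of the form database/experiment to the directory ../experiment/tag.
proc experimentDir {tag} {
    set parts [split $tag /]
    if {[llength $parts] != 2} {
        error "experiment tag must have the form database/experiment: $tag"
    }
    lassign $parts database experiment
    if {$database eq "" || $experiment eq ""} {
        error "experiment tag must have the form database/experiment: $tag"
    }
    # file join builds the path with the platform's separator conventions.
    return [file join .. experiment $database $experiment]
}

puts [experimentDir polycost/be1]
```

On a Unix system the call above yields ../experiment/polycost/be1. The names of the three files inside that directory are deliberately left out of the sketch.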

The commands are described below from an object-oriented point of view. A description of the object-oriented programming style we have tried to follow with Tcl is available separately.


6.2 Experiment class

The single command in the Experiment class has the following syntax:

Experiment::run experimentFileName -database databaseName -system engineName ?-outdir modelDir? ?-resultfile outfile?

This command performs all operations listed in the file experimentFileName, using the given database and engine objects to implement them. Enrolled models are saved in the directory modelDir and results are saved in outfile. The default value for modelDir is the current directory, and the default value for outfile is results.llk in the current directory.

The required format for the experiment definition file is described in section File Format Description.

The source code for the Experiment class can be found in tcl/runexp.tcl in the source distribution.


6.3 Database object

The purpose of the database class is to provide the recognition engine with speech data. The current database class is very simple and is suitable for databases where all speech files are stored in any subdirectory of a single base directory, such that each speech file in the database has a filename

datadir / fileTag . alwExt

where datadir is the base directory, fileTag uniquely identifies a single speech file, and alwExt is the filename extension. Polycost satisfies this condition if all subject directories are copied to disk into the same directory. It is not satisfied when the speech data is stored on two separate CD-ROMs. Furthermore, the Reference System recognition engine currently assumes that the speech file supplied by the database object is a headerless, 8 kHz A-law file. The current task of the database class is to map a file tag onto a complete speech filename. This should be changed in future versions of the Reference System to enable the use of other speech file formats and of databases other than Polycost; the database object should then return the actual, decoded speech data instead of the name of a file.
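The filename convention can be illustrated with a few lines of plain Tcl. This is a sketch of the mapping only, not the Database class implementation from tcl/polycost.tcl; the procedure name speechFileName, the base directory, and the file tag used in the example are hypothetical.

```tcl
# Hypothetical helper: build datadir / fileTag . alwExt.
# The default extension .alw matches the default of the alwExt property.
proc speechFileName {dataDir fileTag {alwExt .alw}} {
    # The tag may contain subdirectory components, since speech files can
    # be stored in any subdirectory of the base directory.
    return [file join $dataDir $fileTag$alwExt]
}

# Example with an invented base directory and file tag.
puts [speechFileName /data/polycost M042/S01/utt03]
```

With a real database object, the same mapping is obtained from the object's filename method.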

A database object is created with the command

Database::new datadir

It returns the name of the created object, databaseName, defined in the global namespace. The name will be something like "::db01". Multiple database objects can be created and they will all have unique names.

This is the typical life cycle of a database object:

  1. The object is created by the Database::new command.
  2. Default property values are optionally overridden with the set method.
  3. The init method is called after all properties have been set.
  4. The database object is passed as an argument to one or more Experiment::run commands and subsequently to a recognition engine object. The engine object will call the database object's filename method to map a speech file tag to a filename.
  5. The destroy method is called when the object is no longer needed.

The following commands can be used to operate on a database object. The command name is the name of the object itself (as returned by Database::new) and the first argument is the method name. Any remaining arguments are arguments to the method.

databaseName set propertyName value
databaseName init
databaseName print ?channel?
databaseName getdir
databaseName filename fileTag
databaseName destroy

The source code for the Database class can be found in tcl/polycost.tcl in the source distribution.

property    default    description
alwExt      .alw       the filename extension for speech files in the database
trace       0          currently not used

Table 1. Properties in a Database object.


6.4 Recognition engine object

The purpose of the recognition engine is to perform enrollment and verification operations. The engine class included in this package is called ReferenceSystem. It is the heart of the COST250 Speaker Recognition Reference System. With the default setup of the engine (note that the system is truly a reference system only when the default setup is used!) it implements a recognizer with the following characteristics:

A reference recognizer engine object is created with the command

ReferenceSystem::new databaseName

where databaseName is the name of a database object from which speech files can be retrieved. The command returns the name of the created object, engineName, defined in the global namespace. The name will be something like "::rec01". Multiple engine objects can be created and they will all have unique names.

This is the typical life cycle of an engine object:

  1. The object is created by the ReferenceSystem::new command.
  2. Default property values are optionally overridden with the set method.
  3. The init method is called after all properties have been set.
  4. The engine object is passed as an argument to the Experiment::run command. This command will call the engine object's enrollment and verification methods to perform a list of operations.
  5. The destroy method is called when the object is no longer needed.

The following commands can be used to operate on an engine object. The command name is the name of the object itself (as returned by ReferenceSystem::new) and the first argument is the method name. Any remaining arguments are arguments to the method.

engineName set propertyName value
engineName init
engineName print ?channel?
engineName enrollment identity fileTags outDir
engineName verification speaker identity fileTags resultChannel
engineName destroy

The source code for the ReferenceSystem class can be found in tcl/vqst.tcl in the source distribution.
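The life cycles of the database and engine objects can be combined into one short script. The following is a hedged sketch, not a copy of work/runtest.tcl: the procedure name runExperiments and its parameters are hypothetical, and it assumes that the cost250 package is installed where Tcl can find it. The package is loaded inside the procedure, so defining the procedure does not itself require the package. Whether several Experiment::run calls append to or overwrite the same result file is not stated in this chapter and is left open here.

```tcl
# Hypothetical wrapper around the documented cost250 commands.
#   dataDir          base directory of the speech database
#   experimentFiles  list of experiment definition files, run in order
proc runExperiments {dataDir experimentFiles {modelDir .} {resultFile results.llk}} {
    package require cost250

    # 1. Create the objects; the engine reads speech files via the database.
    set db  [Database::new $dataDir]
    set eng [ReferenceSystem::new $db]

    # 2. Optionally override defaults. Only the trace property is changed,
    #    which is not among the properties marked with * in Table 2.
    $eng set trace 1

    # 3. Initialize after all properties have been set, then print them.
    $db init
    $eng init
    $db print
    $eng print

    # 4. Perform the operations listed in each experiment definition file.
    foreach expFile $experimentFiles {
        Experiment::run $expFile -database $db -system $eng \
            -outdir $modelDir -resultfile $resultFile
    }

    # 5. Destroy the objects when they are no longer needed.
    $eng destroy
    $db destroy
}
```

A call would pass the base directory and the list of experiment definition files, for example the three files belonging to one experiment tag.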

property*           default                     description
lpcConfig*          -f200 -o60 -p12 -a0.97 -s1  configuration options to lin2parProgram; the available options are described in Table 3
cbSize*             64                          size of the codebook to create during an enrollment operation
normalization*      1                           1 = use score normalization; 0 = no score normalization
nonClientModels*    W                           a list of non-client model names separated by spaces (for example: "W", or "F M")
binDir              ""                          where the executable VQST files reside; may be empty ("") if they are in the current search path
alw2linProgram*     alw2lin                     binary program that decodes A-law samples to linear scale
lin2parProgram*     lpccep                      binary program that parameterizes a speech file
concatProgram*      concat                      binary program that concatenates parameterized files
gencbProgram*       gencb                       binary program that trains a speaker model (codebook)
vqtestProgram*      vqtest                      binary program that computes a score for a test utterance against a speaker model
tmpDir              /tmp                        where to create temporary files
cliDir              cli                         where to find client model files (codebooks)
refDir              ref                         where to find non-client model files (codebooks)
linearExt           .lin                        filename extension for sample files with linear scale
paramExt            .lpc                        filename extension for parameter files
modelExt            .cb                         filename extension for client and non-client model files (codebooks)
uniqTag                                         set automatically by the new command
lpcConfigFileName                               set automatically by the new command
trace               0                           set to 1 for trace output; 0 for no trace output

Table 2. Properties in a ReferenceSystem object. *These properties must have their default values for the system to perform as a reference system in the sense that it produces the same results at all sites.

option  default  description
-f      256      the speech frame size in samples
-o      50 %     the overlap between neighbouring speech frames, in percent
-p      12       the LPC-cepstrum vector size
-t      0.01     the absolute energy threshold: any speech frame whose absolute average energy is less than the given value is discarded
-a      0.95     the pre-emphasis coefficient
-h      0        header size (in number of integers): the default value 0 means that all speech samples in the input file are used. If it is set to a non-zero value, that many samples at the top of the speech file are ignored. Note that when the alw2lin program is used to produce the input to the LPC-cepstrum program, the header size must be 0.
-s      0        trace information parameter: if set to 0, information on various parameter values is printed; if set to 1, no parameter values are printed (silent mode)

Table 3. Configuration options for the LPC-cepstrum program.
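
To see what the -f and -o options mean in terms of time, the frame length and frame step can be worked out from the default lpcConfig value in Table 2 and the 8 kHz sampling rate of the speech files. The sketch below is an illustration in plain Tcl (8.5 or later); the procedure name frameTiming is hypothetical, and it assumes that the overlap percentage is relative to the frame size and that consecutive frames advance by the frame size minus the overlap. That interpretation is not confirmed by this chapter.

```tcl
# Hypothetical helper: derive frame timing from the -f and -o options.
proc frameTiming {frameSize overlapPercent {sampleRate 8000}} {
    # Integer arithmetic; a fractional overlap is truncated.
    set overlap [expr {$frameSize * $overlapPercent / 100}]
    set step    [expr {$frameSize - $overlap}]
    return [dict create \
        frameMs        [expr {1000.0 * $frameSize / $sampleRate}] \
        overlapSamples $overlap \
        stepSamples    $step \
        stepMs         [expr {1000.0 * $step / $sampleRate}]]
}

# Extract -f and -o from the default lpcConfig value given in Table 2.
set lpcConfig "-f200 -o60 -p12 -a0.97 -s1"
regexp -- {-f\s*(\d+)} $lpcConfig -> frameSize
regexp -- {-o\s*(\d+)} $lpcConfig -> overlapPercent
puts [frameTiming $frameSize $overlapPercent]
```

Under these assumptions, the default engine setup corresponds to 25 ms frames (200 samples) that overlap by 120 samples and advance in steps of 80 samples, that is, 10 ms.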