Guidelines for experiments on the POLYCOST database

Håkan Melin, Johan Lindberg
KTH/Centre for Speech Technology (CTT)

Version 2.0; December 20th, 1999

(View version 1.01 of this document)

The most recent version of this document can be found at http://www.speech.kth.se/cost250/polycost/be/latest.


Abstract

The purpose of this document is to define a common ground for speaker recognition experiments on the POLYCOST database. This is done by defining a set of baseline experiments for which results should always be included when presenting evaluations made on this database. By including these results and by presenting the differences introduced in new experiments, a comparison between systems tested at different sites is made possible.

Four baseline experiments are defined: text-dependent speaker verification (SV) on a fixed password sentence in English, text-prompted SV on a digit sequence spoken in English, text-independent SV on free speech in the subject's mother tongue, and finally text-independent speaker identification on the same free speech. The definition of a baseline experiment includes the definition of client and impostor speakers and speakers for training a non-client model; sessions for enrollment and test; which speech items to use and how to compute and present results.

The present version (2.0) of this document is a revision of version 1.0. Some major changes have been made to make the experiments more difficult and to reduce the number of impostor attempts. The specification of how to compute and present results has also been changed. The experiments were made more difficult by decreasing the amount of enrollment data; the number of impostor attempts was reduced by removing less important tests, such as cross-sex and cross-language impostor attempts; and speaker verification results with a posteriori thresholds are now to be computed with speaker-independent rather than speaker-dependent thresholds.

A consequence of the introduced changes is that results from version 2.0 experiments are not directly comparable to results produced with the previous version. However, since the number of results published with the previous version is not large, the changes are believed to be important enough to justify this loss of comparability.



1. Introduction

The purpose of this document is to define a common ground for speaker recognition experiments on the POLYCOST database. This is done by defining a set of baseline experiments for which results should always be included when presenting evaluations made on this database. By including these results and by presenting the differences introduced in new experiments, a comparison between systems tested at different sites is made possible.

This document and the guidelines it presents are a result of cooperative work by many researchers within various working groups (WG) in the COST250 action. The specification of standardized baseline experiments for public speech databases is central to good assessment of speaker recognition technology, especially in the context of cooperative and competitive research. This document therefore has its natural home in WG4 [7], which dealt with assessment topics. The task of WG3 [8] was to research parameters and algorithms for speaker recognition, and this is where the actual experiments were run. POLYCOST and these specifications were used in several of the presented experiments. The POLYCOST database itself is a result of work in WG2 [9], which dealt with databases for speaker recognition.

The first version of this document was published in January 1997. A number of experimental results have been presented within COST250 for POLYCOST and the first version of these guidelines. Some of them have also been published at open conferences, for instance [1][2]. The results from those experiments looked very good, but they turned out to be overly optimistic. In [2], results were presented for POLYCOST and two other databases (Gandalf and SESP) on similar recognition tasks, and the POLYCOST error rates were much lower. In [1] some of the particularities of POLYCOST were investigated, especially the importance of the fact that subjects in the database have different mother tongues. Error rates were roughly doubled if only same-language impostor attempts were used, compared to when cross-language impostors were also included. This effect was observed even for BE1 and BE2, where all subjects speak English. There was also a large difference in error rate between the three speaker verification baselines. BE1 and BE2 (the text-dependent and text-prompted tasks) were much easier than BE3, the text-independent task, and generated very few errors. When the enrollment sets were reduced for the two easier tasks, the absolute number of errors observed in these tasks increased accordingly. Note that for a reliable comparison between two recognition systems, the number of observed errors must be large enough.

The above-mentioned results led to a suggestion to modify the specification of baseline experiments [1]. The suggestions were later accepted within COST250 and the current document describes the new version of the baseline experiments. These are the changes from version 1.0 to 2.0:

  1. only same-sex impostor attempts are now included
  2. only same-language impostor attempts are now included
  3. two sessions and one file per session are now used for enrollment in all experiments
  4. client-independent thresholds should now be used when calculating error rates based on a posteriori thresholds.

The two latter changes increase the absolute number of errors. The two former are not likely to increase the number of errors; however, they remove the easier tests that should not cause many errors anyway, so the proportion of errors will increase. Another positive effect of the two former changes is a smaller test (without a large decrease in the number of errors): the number of impostor attempts defined for the verification tasks is reduced from 11990 to 824, a reduction by a factor of roughly 14. The number of true-speaker attempts is unchanged at 664.

Experiments with the new baseline specification confirm that the resulting error rates are higher and comparable to those from similar tasks on other databases [3].

The organization of this paper is as follows: section 2 briefly summarizes some features of the POLYCOST database. Section 3 defines the conditions for four baseline experiments, including the choice of task, speakers, enrollment material and test material. Some comments on the implementation of baseline experiments are then made in section 4. Finally, the computation and presentation of results are defined in section 5.


2. The database

The POLYCOST database was recorded as a common initiative within the COST 250 action during January-March 1996. It contains around 10 sessions recorded by each of 134 subjects from 14 countries. Each session contains 14 items: 4 repetitions of a seven-digit client code (CLI), 5 ten-digit sequences (DIG), 2 fixed sentences (SEN), 1 international phone number (PHO), and 2 items with speech in the subject's mother tongue (MOT). The language in all items except the last two is English.

The baseline experiments defined in this document shall be applied to version 2.0 of POLYCOST (or, equivalently, version 1.0 (from July 96) with all 4 known bugs fixed). Version 2.0 of POLYCOST will soon be available through ELRA.

Go to the home page of POLYCOST for more information on the database.


3. Definition of experiments

A set of four baseline experiments is defined. Three of the experiments are on speaker verification tasks and the fourth on a closed-set speaker identification task. The experimental conditions were chosen to keep experiments realistic, well-defined and easy to implement. This section first covers some issues common to all baseline experiments and then defines the four experiments one by one. Table 3 summarizes details of the baseline experiments.

3.1 General guidelines

As far as possible, the four baseline experiments have been defined with equal conditions. For instance, the same speakers and sessions are always used in the test phase, and the same sessions and the same number of files are used for enrollment.

3.1.1 Choice of data for off-line training

A set of 22 speakers, as shown in Table 1, have been set aside for use as an off-line database. These speakers can be used, for instance, to build non-client models and to simulate impostor attempts when setting a threshold during enrollment of a new client speaker.

When building a non-client model, exactly sessions 01 to 05 for all off-line speakers shall be used (except for M045 where there is only one session). Which speech items to use is defined for the respective experiment in sections 3.2 to 3.5.

Off-line speakers were chosen according to the following criterion:

"Pick the male and the female speaker with the least number of recording sessions from each country".

Some exceptions were made to this rule.


Subject  PIN      Country  #sessions  Language    Code
M045     6592714  BE       1          French      fr
F017     4172956  BE       7          French      fr
M058     7925416  CH       5          French      fr
M023     4762195  DK       6          Danish      da
F049     7962451  DK       7          Danish      da
M057     7691542  ES       9          Catalan     ca
F011     2541679  ES       10         Spanish     es
F050     9154276  FR       8          French      fr
M005     1724695  FR       6          French      fr
F030     5612497  IE       10         English     en
M037     5972641  IE       10         English     en
M059     7946215  IT       9          Italian     it
F006     1956274  IT       10         Italian     it
F039     6751942  NL       8          Dutch       nl
M016     2941765  NL       9          Dutch       nl
M018     2975461  PT       9          Portuguese  pt
F044     7561249  SE       9          Swedish     sv
M073     9671524  SE       9          Swedish     sv
M010     2475916  TR       9          Turkish     tr
F018     4295716  TR       8          Turkish     tr
F058     9745216  GB       7          English     en
M056     7625149  GB       7          English     en
Table 1. The 22 speakers set aside for the off-line database. There are 12 male and 10 female speakers. The language entry was derived from listening to what the subject says in the MOT01 speech items.

3.1.2 Client speakers

All speakers in the database who have not been set aside as part of the off-line database and who have at least 5 sessions recorded shall be used as client speakers in the experiments. This amounts to 110 speakers; speakers F035 and M042 have been excluded because they have too few sessions.

True-identity tests shall be made on session 05 and later sessions. With existing sessions for the 110 client speakers, this gives 664 true-identity tests.

There are two reasons for excluding speakers with fewer than 5 sessions and for using only session 05 and later sessions for tests in all experiments: first, to leave room for comparative experiments that use more than two enrollment sessions while keeping the test material invariant; and second, to keep similarities with version 1.0 of the experiment specifications.

3.1.3 Impostor speakers

To simulate impostor attempts against speaker X in the speaker verification experiments, recordings from all speakers in the database except speaker X and the off-line speakers shall be used. Furthermore, only such impostor attempts where the impostor and the client have the same sex and the same mother tongue shall be used. For instance, only Swedish women shall be tried as impostors against another Swedish woman.

For the purpose of making an unambiguous definition of what impostor tests to include, a file specifying the mother tongue of each subject has been provided. The contents of this file are based on what the subject claims to be his or her mother tongue in the MOT01 items of the database. Should an entry in that file turn out to be wrong, the (unchanged) contents of the file are still to be used. The file has three columns: the subject identifier; a code for the country of origin of the subject's calls (ISO 3166 country codes are used); and a code for the language, that is, the mother tongue (ISO 639 language codes are used). To be precise, the country is taken from the MOT01 statements, so in some cases it may be the subject's home country rather than the country the calls were placed from.

Recordings from session 05 only shall be used for impostor tests. If a certain speaker has not recorded a session 05, this speaker will also not be used for impostor tests. This excludes speakers F035 and M042 and gives a total of 110 impostor speakers, the same set of speakers as those used as clients.

There are two reasons for choosing session 05 for impostor tests rather than, for example, session 01, which would give more available impostor speakers. Firstly, it is assumed that later sessions will contain fewer speaking errors because subjects are learning the recording protocol. Secondly, since session 05 is used for true-identity tests and as a test session in the speaker identification task, it is possible to compare the outcome of these with the outcome of impostor tests.
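
The selection rules of this section can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the official trial list: the function name impostor_trials, its arguments and the toy speaker identifiers are hypothetical, sex is read from the first letter of the subject identifier (M or F), and the language codes are assumed to have been read from the provided mother-tongue file. For simplicity the sketch uses one pool of speakers for both clients and impostors, which matches the statement above that the two sets coincide.

```python
def impostor_trials(language, offline, has_session_05):
    """Return (impostor, client) pairs that satisfy the rules of section 3.1.3.

    language:        dict mapping subject id (e.g. "F031") to a language code
    offline:         set of subject ids reserved for the off-line database
    has_session_05:  set of subject ids that have recorded a session 05
    """
    # Exclude off-line speakers and speakers without a session 05.
    pool = sorted(s for s in language
                  if s not in offline and s in has_session_05)
    trials = []
    for client in pool:
        for impostor in pool:
            if impostor == client:
                continue
            same_sex = impostor[0] == client[0]          # "M" or "F" prefix
            same_language = language[impostor] == language[client]
            if same_sex and same_language:
                trials.append((impostor, client))
    return trials


# Toy data (hypothetical identifiers, not taken from POLYCOST).
toy_language = {"F001": "sv", "F002": "sv", "F003": "fr",
                "M001": "sv", "M002": "sv", "M003": "sv"}
toy_trials = impostor_trials(toy_language,
                             offline={"M003"},
                             has_session_05=set(toy_language))
print(toy_trials)
```

In the toy data, F003 takes part in no impostor trial because no other non-off-line woman shares her mother tongue, and M003 is excluded because it belongs to the off-line set.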

3.1.4 Number of tests

Given the available data in version 2 of the database (or version 1 with bug fixes) and the choice of client and impostor speakers as defined in the previous sections, the total number of true identity tests is 664 and the number of impostor tests is 824. These numbers apply to all speaker verification experiments (1-3), while only the true identity tests are relevant to the closed-set speaker identification task in experiment 4.

3.1.5 The use of annotation files

For all baseline experiments defined here, a test shall be made independently of the actual contents of the test file, that is, even if the manuscript utterance is not there or if the file contains only silence. This way, annotations for the test files are not needed, nor may they be used, to produce a result on a given baseline experiment. The main reasons for this choice are to make experiments easy to implement and to not rely on annotations of test data. The fact that a portion of the errors produced in an experiment will be due to speaking errors rather than errors made by the classifier should instead be taken into account during the analysis of test results. This is not yet done by the scoring software presented in section 5, however.

For enrollment, annotation and segmentation information may be used in BE2 as defined in section 3.3.1, but not in the other experiments. The reason for the choice of using annotation information in experiment 2 at enrollment time is again easy implementation of the experiments. The manually verified annotation files provided with the POLYCOST database should be used.

3.2 Experiment 1: Text-dependent speaker verification on SEN-files

The task in this experiment is speaker verification on a fixed password phrase which is common to all speakers. There are two such phrases in POLYCOST: SEN01 with text "Joe took father's green shoe bench out" and SEN02 with text "He eats several light tacos". This baseline experiment shall be done on the first phrase, SEN01, only.

A client model for speaker X shall be built from the first 2 sessions for that speaker, namely from files X/0{1,2}/SEN01. One verification test shall then be performed on each of the remaining recordings, namely on files X/{05,...}/SEN01. Models may not be adapted to test files.

To simulate impostor attempts, the SEN01-file from session 05 from the impostor speakers shall be used.

The choice of training material for a non-client model in this experiment is not obvious. In the rather unlikely case of a system where all users have the same password phrase, training on SEN01 would be the natural choice. It is not realistic in the case of user-individual password phrases, however, since a non-client model can not be trained for each existing password phrase in a system. The closest alternative would perhaps be to build a non-client model from sub-word components. This is done for instance in [4] where Parthasarathy & Rosenberg conclude that it is important that the non-client model captures the text contents of the spoken utterance. Choosing SEN01 for non-client model training can be seen as the ideal case of such a synthesized non-client model. This approach has therefore been chosen for this baseline experiment.

An alternative approach, which has been abandoned, is to train a non-client model from the MOT02-item. This is more realistic in the sense that the password sentence is not represented in the training material. However, it is not realistic to train phoneme models on this small off-line data and the performance of a global fully text-independent non-client model can be questioned. The speech in MOT02 is also in mother tongue while SEN01 is in English.

3.3 Experiment 2: Digit-prompted speaker verification on DIG-files

The task in this experiment is speaker verification on a sequence of digits which was not represented in the enrollment material. Hence, it is a simulation of a verification system where a sequence of digits is prompted to the client at the time of the test. Prompting in this case is done by means of text display as opposed to audio prompting.

Each session contains recordings of five ten-digit sequences, shown in Table 2. In this experiment, two sequences taken from the first two sessions shall be used when building a speaker model for speaker X, namely files X/01/DIG01 and X/02/DIG02. This gives two occurrences of each digit for enrollment.

One verification test shall then be done on sequence 5 in each of the sessions 05 and later, namely on files X/{05,...}/DIG05. Models may not be adapted to test files.

To simulate impostor attempts, the DIG05-file from session 05 from the impostor speakers shall be used, and a non-client model, if used, shall be trained from files DIG0{1,2,3,4} in sessions 01 to 05 for the off-line speakers.


Item   Contents
DIG01  0 1 2 3 4 5 6 7 8 9
DIG02  8 3 9 4 6 1 7 2 0 5
DIG03  5 0 6 9 2 8 1 3 7 4
DIG04  9 8 7 6 5 4 3 2 1 0
DIG05  1 0 2 9 3 8 4 7 5 6
Table 2. Prescribed contents of the DIG-items.

3.3.1 The use of annotation files

In this experiment, models for individual digits should be trained from sequences of digits. In a real situation the recognition system would of course have to produce segmentations on its own (if segmentations are required). In the experiment defined here, however, the goal is to test the speaker verification part of the system. Therefore, we make the assumption that before the enrollment is started the speech has been put through an "ideal" digit segmenter. With this approach a system under test does not need to have the segmenter component. We also eliminate the influence on the built model from differences in segmenting modules. To simulate the ideal segmentation component, the manually verified annotation files provided with POLYCOST (version 2) are to be used.

During the test phase, on the other hand, segmentation can usually be done implicitly as part of the decoder operation in the speaker verification module. Hence the choice of using segmentation information in the enrollment but not in the test phase. This strategy was used for instance within the CAVE project for experiments on YOHO and SESP databases [5].

3.4 Experiment 3: Text-independent speaker verification on MOT-files

The task in this experiment is speaker verification in a text-independent manner on text spoken in the speaker's mother tongue.

A speaker model for speaker X shall be built from the unconstrained speech item from the first two sessions, namely files X/0{1,2}/MOT02. Each of these items contains up to 20 seconds of free speech. One verification test shall then be done on the somewhat constrained speech item in each of the sessions 05 and later, namely on files X/{05,...}/MOT01. Models may not be adapted to test files. To simulate impostor attempts, the MOT01-file from session 05 from the impostor speakers shall be used. A non-client model, if used, shall be trained from the MOT02-files.

For item MOT01, subjects were asked to speak their name, Christian name, sex (female/male), town, country and mother tongue. This constraint means that subjects will say roughly the same thing in each test, which normalizes test utterances on text contents. The task is still text-independent since enrollment is made on unrelated text and models are not updated.

The knowledge of what the subject is saying in the test files may not be used a priori in the test.

3.5 Experiment 4: Text-independent speaker identification on MOT-files

This experiment is defined in all applicable aspects exactly the same as experiment 3, but with the task of closed-set speaker identification. Hence, speaker model X shall be built on files X/0{1,2}/MOT02 and each of files X/{05,...}/MOT01 shall be used for speaker identification tests. Adaptation on test utterances is not allowed. All speakers in the database shall be registered as clients and, thus, the task is closed-set speaker identification and there is no need for impostor tests.


BE  Task  Speech               Non-client model    Enrollment           Tests (FR)      Tests (FA)
1   ver   fixed sentence       0{1-5}/SEN01 *      0{1,2}/SEN01         {05,...}/SEN01  05/SEN01
2   ver   prompted digits      0{1-5}/DIG0{1-4} *  01/DIG01,02/DIG02 *  {05,...}/DIG05  05/DIG05
3   ver   free, mother tongue  0{1-5}/MOT02 *      0{1,2}/MOT02         {05,...}/MOT01  05/MOT01
4   id    free, mother tongue  0{1-5}/MOT02 *      0{1,2}/MOT02         {05,...}/MOT01  -
Table 3. Summary of test conditions for the four baseline experiments.
*) annotation information may be used.
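
As an illustration of how the enrollment and true-identity test entries of Table 3 expand into file tags for one client, a minimal Python sketch follows. The names BASELINES and file_tags are hypothetical, and the list of sessions that a speaker has actually recorded is assumed to be supplied by the caller as two-digit strings; the provided experiment specification files remain the authoritative file lists.

```python
# Items per baseline experiment (BE), transcribed from Table 3.
# Enrollment entries are (session, item) pairs; the test item is taken from
# session 05 and from every later session that the speaker has recorded.
BASELINES = {
    1: {"enroll": [("01", "SEN01"), ("02", "SEN01")], "test": "SEN01"},
    2: {"enroll": [("01", "DIG01"), ("02", "DIG02")], "test": "DIG05"},
    3: {"enroll": [("01", "MOT02"), ("02", "MOT02")], "test": "MOT01"},
    4: {"enroll": [("01", "MOT02"), ("02", "MOT02")], "test": "MOT01"},
}


def file_tags(be, speaker, sessions):
    """Return (enrollment tags, true-identity test tags) for one client.

    be:        baseline experiment number, 1 to 4
    speaker:   subject identifier such as "F031"
    sessions:  recorded sessions as two-digit strings, e.g. ["01", "02", "05"]
    """
    spec = BASELINES[be]
    enroll = [f"{speaker}/{session}/{item}" for session, item in spec["enroll"]]
    tests = [f"{speaker}/{session}/{spec['test']}"
             for session in sorted(sessions) if int(session) >= 5]
    return enroll, tests


enroll_tags, test_tags = file_tags(2, "F031", ["01", "02", "03", "05", "06"])
print(enroll_tags)
print(test_tags)
```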


4. Implementation

The implementation of the baseline experiments as defined in this document is in principle left to the user. However, to avoid misinterpretation of the above specifications, and to help the user in implementing the specifications, sets of experiment specification files are provided. The experiment specification files contain explicit lists of files to use in enrollment and access test operations, and are provided for each speaker verification experiment (currently not for the speaker identification experiment, BE4). For each experiment (BE1-3), four files are provided as listed in Table 4: the first two files list sample files to use when enrolling speakers and when running access tests, and the remaining two files list sample files to use when training a non-client model.

The format of the experiment specification files is described in section 4.1. It is the same format used by The COST250 Speaker Recognition Reference System [12]. Identical experiment specification files are provided as part of this reference system.

All experiment specification files listed in Table 4 can also be downloaded in one package (unix-type experiment.tar.gz or WinZip-type experiment.zip).


File        Description
es.exp      (BE1, 2, 3) lists files to use in each enrollment operation
ts.exp      (BE1, 2, 3) lists files to use in each access test operation
os_wld.exp  (BE1, 2, 3) lists files to use when training a single "world" non-client model
os_gen.exp  (BE1, 2, 3) lists files to use when training a gender-dependent non-client model. This file lists exactly the same files as os_wld.exp, but male and female speakers are listed separately.
Table 4. Experiment specification files provided for each speaker verification experiment.

4.1 Experiment specification file format

The purpose of an experiment specification file is to define a set of operations to perform. An operation is either an enrollment operation or a verification operation (in the future, identification operations will also be included). One operation is defined on exactly one line in the file.

An enrollment operation involves training a speaker model for a certain identity from a set of files. A line that defines an enrollment operation has the following format:

     enroll identity file1 ... fileN

A verification operation involves a speaker claiming an identity, using a set of files to support the claim. A line that defines a verification operation has the following format:

     speaker identity file1 ... fileN

Here speaker and identity are strings like M010 or F031, where M indicates a male and F a female speaker. Each of file1 ... fileN is a file tag rather than a complete file name. To synthesize a complete file name, the file tag must be prefixed by the name of the database's base directory and suffixed by a file name extension. For POLYCOST, the file tag has the format

     speaker/session/filename

where filename is a string like DIG05.

The case of letters in the directory names and file tags may or may not correspond to case in file names in your file system. This depends on your operating system, especially if you copied sample files directly from CD-ROM. This should not cause any ambiguity problems with POLYCOST, but may cause practical problems.

Empty lines and lines beginning with a hash mark (#) shall be ignored. They are used for comments in the file and to make the file more readable for a human reader.
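
A reader for this format can be sketched as follows. This is a minimal Python illustration, not the parser of the reference system: the function names and the tuple layout are hypothetical, the word enroll is assumed to be a literal keyword that distinguishes enrollment lines from verification lines, and the file name extension ".wav" and the base directory in the usage lines are placeholders only.

```python
def parse_experiment_spec(lines):
    """Parse experiment specification lines into a list of operations.

    Each operation is a tuple (kind, speaker, identity, file_tags), where
    kind is "enroll" or "verify" and speaker is None for enrollment.
    """
    operations = []
    for raw in lines:
        line = raw.strip()
        # Empty lines and comment lines are ignored.
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        first, identity, tags = fields[0], fields[1], fields[2:]
        if first == "enroll":        # assumption: literal keyword
            operations.append(("enroll", None, identity, tags))
        else:                        # otherwise the first field is the speaker
            operations.append(("verify", first, identity, tags))
    return operations


def full_file_name(base_dir, tag, extension):
    """Prefix a file tag with the base directory and append the extension."""
    return base_dir.rstrip("/") + "/" + tag + extension


example = [
    "# enrollment followed by one impostor attempt",
    "",
    "enroll M010 M010/01/DIG01 M010/02/DIG02",
    "M016 M010 M016/05/DIG05",
]
ops = parse_experiment_spec(example)
print(ops)
print(full_file_name("/data/polycost", "M016/05/DIG05", ".wav"))
```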

4.2 List files

Three pairs of supplementary list files are also provided as shown in Table 5. They list client, impostor and off-line speakers included on each of the POLYCOST CD-ROMs. Files are ordinary text files with one speaker per line.


File                       Speaker set
be_cli1.lst / be_cli2.lst  client speakers on disc 1 and 2
be_imp1.lst / be_imp2.lst  impostor speakers on disc 1 and 2
be_off1.lst / be_off2.lst  off-line speakers on disc 1 and 2
Table 5. List files that define three different speaker sets for the baseline experiments. Disc numbers refer to the v 1.0 distribution CD-ROM discs.


5. Scoring

In order to be able to compare different experiments done on the POLYCOST database, all results must be presented in a similar way. Otherwise, two quite comparable experiments might end up with incomparable results due to the way the error rates are calculated and the results are presented.

5.1 Equal Error Rate

To provide a simple means for a first-order comparison of results, a single equal error rate (EER) measure is defined here. This measure is denoted a test set EER, and is based on a speaker-independent a posteriori threshold. This measure is defined in the "EAGLES" handbook [10] (section 11.4) as the EER based on a test set Receiver Operating Characteristic (ROC) curve. Note that the definition of the test set EER in the EAGLES book makes no assumptions or restrictions on the mix of same-sex and cross-sex impostor attempts. In these guidelines only same-sex attempts are considered, however, and the suggested measure is therefore effectively a same-sex EER measure.

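With a speaker-independent threshold, many of the equations in [10] can be simplified. From equation (11.37), the test set false rejection rate is calculated as

     \alpha(\theta) = \frac{1}{c} \sum_{k=1}^{c} H(\theta - z_k),        (5.1)

where \theta is the threshold, c is the total number of genuine test utterances, z_k is the score value resulting from test utterance k, and H(t) is a step function:

     H(t) = \begin{cases} 1, & t > 0 \\ 0, & t \le 0 \end{cases}        (5.2)

Similarly, from equation (11.51) in [10], the test set false acceptance rate is calculated as

     \beta(\theta) = \frac{1}{d} \sum_{k=1}^{d} \left[ 1 - H(\theta - z_k) \right],        (5.3)

where d is the total number of impostor trials, and z_k is the score value resulting from the k'th trial. Note that a trial involves the combination of a test utterance and a claimed identity. Equations (5.1) to (5.3) are written here under the assumption that a higher score indicates a better match, so that a claim is accepted when z_k \ge \theta; see [10] for the exact formulation.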

The test set EER is generally defined as the error rate at the point where the test set false rejection rate \alpha(\theta) and the test set false acceptance rate \beta(\theta) are equal,

     EER = \alpha(\theta_{EER}) = \beta(\theta_{EER}).        (5.4)

Since \alpha and \beta are sampled only at discrete values of \theta, namely those score values occurring in the test series, the above definition of EER is not fully unambiguous. For the purpose of these guidelines, we therefore make the following definition of EER:

     \theta_{EER} = \arg\min_{\theta} \left| \alpha(\theta) - \beta(\theta) \right|,
     EER = \frac{\alpha(\theta_{EER}) + \beta(\theta_{EER})}{2}.        (5.5)

If more than one value of \theta satisfies the first part of (5.5), the smallest value shall be chosen. Equations (5.5) prescribe that the EER is the average of the false rejection and false acceptance rates at the point where the distance between the two is minimal.
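
The definition in (5.5) can be illustrated with a short Python sketch. It is not the scoring software referred to in section 5.3; the function names are hypothetical, and the sketch assumes that a higher score indicates a better match, so that a trial is accepted when its score is greater than or equal to the threshold. Candidate thresholds are the score values that occur in the test series, and ties in the distance between the two error rates are resolved in favour of the smallest threshold.

```python
def error_rates(theta, genuine, impostor):
    """False rejection and false acceptance rates at threshold theta.

    Assumes higher score = better match: a trial is accepted if score >= theta.
    """
    frr = sum(z < theta for z in genuine) / len(genuine)
    far = sum(z >= theta for z in impostor) / len(impostor)
    return frr, far


def eer_from_scores(genuine, impostor):
    """Return (threshold, EER) following the definition in (5.5)."""
    best_gap, best_theta, best_eer = None, None, None
    # Thresholds are visited in increasing order, and only a strictly
    # smaller gap replaces the current choice, so the smallest threshold
    # wins when several thresholds give the same minimal gap.
    for theta in sorted(set(genuine) | set(impostor)):
        frr, far = error_rates(theta, genuine, impostor)
        gap = abs(frr - far)
        if best_gap is None or gap < best_gap:
            best_gap, best_theta, best_eer = gap, theta, (frr + far) / 2
    return best_theta, best_eer


# Toy scores (hypothetical values, for illustration only).
genuine_scores = [0.9, 0.8, 0.7, 0.3]
impostor_scores = [0.1, 0.2, 0.4, 0.6]
theta_eer, eer = eer_from_scores(genuine_scores, impostor_scores)
print(theta_eer, eer)
```

With these toy scores, one genuine trial is rejected and one impostor trial is accepted at the threshold 0.6, so both error rates are 0.25 and the sketch reports an EER of 0.25.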

5.2 DET-curve

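To show the trade-off between false rejection and false acceptance rates, a DET-plot is recommended [11]. More specifically, this DET-plot should visualize the test set ROC. Combining equations (5.1) and (5.3) above, the test set ROC is defined as the set of operating points

     ROC = \{ (\alpha(\theta), \beta(\theta)) \},        (5.6)

where \alpha(\theta) and \beta(\theta) are the test set false rejection and false acceptance rates, and \theta ranges over all threshold values.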
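
The operating points of the test set ROC, and their mapping to the normal deviate scale used on the axes of a DET-plot, can be sketched in Python as follows. The function names are hypothetical and this is not the NIST plotting software; as an assumption of the sketch, a higher score indicates a better match and a trial is accepted when its score is greater than or equal to the threshold. Points where either rate is exactly 0 or 1 are skipped, since the normal deviate is not finite there.

```python
from statistics import NormalDist


def roc_points(genuine, impostor):
    """Return (threshold, false rejection rate, false acceptance rate)
    for every score value that occurs in the test series."""
    points = []
    for theta in sorted(set(genuine) | set(impostor)):
        frr = sum(z < theta for z in genuine) / len(genuine)
        far = sum(z >= theta for z in impostor) / len(impostor)
        points.append((theta, frr, far))
    return points


def det_coordinates(points):
    """Map ROC points to (x, y) = (probit(FAR), probit(FRR)).

    The probit is the inverse of the standard normal distribution function.
    """
    probit = NormalDist().inv_cdf
    return [(probit(far), probit(frr))
            for _, frr, far in points
            if 0 < frr < 1 and 0 < far < 1]


# Toy scores (hypothetical values, for illustration only).
roc = roc_points([0.9, 0.8, 0.7, 0.3], [0.1, 0.2, 0.4, 0.6])
det = det_coordinates(roc)
print(roc)
print(det)
```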

5.3 Implementation

A piece of software that computes a test set ROC and the test set EER as defined in equations (5.6) and (5.5) is provided with the COST250 Speaker Recognition Reference System [12]. It can be used stand-alone, however, to score results from any speaker verification system.

Software for plotting a DET-curve from the ROC data can be retrieved from the NIST web site.


6. Conclusions

A common ground for experiments on the POLYCOST database has been established through the definition of a set of four baseline experiments and procedures for computing and presenting test results. The purpose of the guidelines is to standardize testing on this database and thus enable comparison between experiments made at different test sites.


References

[1] Nordström T., Melin H., Lindberg J. (1998). "A Comparative Study of Speaker Verification Systems using the Polycost Database", ICSLP'98, Sydney, Australia, Nov. 30 - Dec. 4, pp. 1359-1362.

[2] Melin H., Koolwaaij J.W., Lindberg J., Bimbot F. (1998). "A Comparative Evaluation of Variance Flooring Techniques in HMM-based Speaker Verification", ICSLP'98, Sydney, Australia, Nov. 30 - Dec. 4, pp. 1903-1906.

[3] Melin H., Lindberg J. (1999). "Variance Flooring, Scaling and Tying for Text-Dependent Speaker Verification", EUROSPEECH'99, Budapest, Hungary, September 5-9, pp. 1975-1978.

[4] Parthasarathy S., Rosenberg A.E. (1996). "General Phrase Speaker Verification Using Sub-word Background Models and Likelihood-ratio Scoring", ICSLP-96, Philadelphia, USA, pp. 2403-2406.

[5] Bimbot F., Hutter H.P., Jaboulet C., Koolwaaij J., Lindberg J., Pierrot J.B. (1998). "An Overview of The CAVE Project Research Activities in Speaker Verification", RLA2C, Avignon, France, April 20-23, pp. 215-220.

[6] Petrovska D., Hennebert J., Melin H., Genoud D. (1998). "POLYCOST: a Telephone-Speech Database for Speaker Recognition", RLA2C, Avignon, France, April 20-23, pp. 211-214.

[7] Falcone M. (1999). "COST250 Working Group 4: Speaker Recognition Assessment and Dissemination", In: COST250 Final Report.

[8] Olsen J., Lindberg B. (1999). "Algorithms & Parameters for Speaker Recognition: Activities in COST250 Working Group 3", In: COST250 Final Report.

[9] Melin H. (1999). "Databases for Speaker Recognition: Activities in COST250 Working Group 2", In: COST250 Final Report.

[10] Bimbot F., Chollet G. (1997). "Assessment of speaker verification systems", In: Handbook of Standards and Resources for Spoken Language Systems, Gibbon D., Moore R., Winski R. (Eds.), Mouton de Gruyter, ISBN 3-11-015366-1.

[11] Martin A., Doddington G., Kamm T., Ordowski M., Przybocki M. (1997). "The DET Curve in Assessment of Detection Task Performance", Proc. Eurospeech-97, Rhodes, Greece, September, pp. 1895-1898.

[12] Melin H., Ariyaeeinia A., Falcone M. (1999). "The COST250 Speaker Recognition Reference System", In: COST250 Final Report.


Appendix A: Revision history

Version 2.0, 20/12/1999
Version 2.0b4, 30/11/1999
Version 2.0b3, 8/11/1999
Version 2.0b2, 25/10/1999
Version 2.0b1, 24/06/1999
Version 1.01, 14/10/1997
Version 1.0, 8/1/1997
Version 1.0b1, 25/11/1996


Håkan Melin, Johan Lindberg
Centre for Speech Technology (CTT),
Department of Speech, Music and Hearing (TMH),
KTH, Stockholm