Higgins Annotation Tool
Copyright © 2009-2011, Gabriel Skantze.
The Higgins Annotation Tool can be used to transcribe and annotate speech with one or more audio tracks (such as dialogue).
For each audio track, a number of audio segments are defined. Each audio segment can then be transcribed. Within each transcription, text segments (such as syntactic phrases) may be defined. A set of feature-value pairs may then be annotated for the tracks, audio segments and text segments.
The annotation is saved in an XML format (schema). To start a new annotation, you can either open a wav-file (in stereo or mono) directly in the tool (choosing "New..." form the File menu), or precompile the annotation XML file and open it. To open and annotate more than one audio file simultaneously, you must choose the latter option.
Binaries for Windows (a single .exe file):
Download Higgins Annotation Tool (updated 2011-03-10)
||Play selection, current track
||Play selection/segment, all tracks
|Play selection/segment from 10%, 20% ... 90%, current track.
|Play selection/segment from 10%, 20% ... 90%, all tracks.
||Create audio segment from waveform selection
||Create text segment from text selection
||Move to next audio segment
||Move to previous audio segment
||Move to next audio segment in the same track
||Move to previous audio segment in the same track
||Move to next non-segment
||Move to previous non-segment
||Move cursor from feature annotation to transcription
Automatically segmenting audio
As part of the IrisTK dialogue system framework, there is a tool for automatically creating the audio segments, using a voice activity detector. If you install IrisTK, you should be able to run (from the command line):
iristk makehat -i speaker1.wav speaker2.wav -o annotation.xml -e 20
You may have to change the energy threshold (20), depending on the volume used when recording. You can also change the end silence threshold (to for example 1000 milliseconds) by adding "-s 1000". All options are listed if you just run "iristk makehat".
You can also do automatic transcription using the Nuance cloud-based recognizer. To do this, you first have to register for a free developer account. Then insert your app-id and app-key in the IrisTK/addon/Dragon folder, according to the instructions in the readme file located there. If everything is done correctly, you should be able to do recognition by adding "-r en-us" to the makehat command.
If you have any questions regarding the tool, please contact Gabriel Skantze: