Best practices for speech and multimodal databases
CLARIN and FLaReNet workshop at KTH Stockholm 25 and 26 Nov 2009
Our research community is large and varied, and there is a great need to
define requirements for infrastructures that can be fruitfully and
easily shared. Annotation practices for corpora in this field are,
however, divergent and would benefit from harmonization and
interoperability. We therefore feel it is important to examine standards
and best practices for encoding the relevant features of these corpora,
such as intonation, facial expressions, gestures, turn-taking, and
emotions, in order to facilitate their use by a wide range of
researchers.
In addition, we would like to hold a general discussion about how best
to get the speech community more actively involved in the issues discussed
within the CLARIN project. Speech researchers have already broken
new ground in spoken data collection and distribution under the LDC and ELRA
umbrellas, driven largely by the need for large databases for training
automatic speech recognition systems. However, the problem of providing
efficient and robust access to a multitude of spoken corpora for general
research within the humanities is far from solved.
We would like to hear about your experiences, needs, and ideas for
standardization within the scope of the workshop.
Rolf Carlson, Kjell Elenius and David House