Letter to invited experts

CLARIN and FLaReNet workshop at KTH Stockholm, 25 and 26 Nov 2009

Best practices for speech and multimodal databases

CLARIN in brief

The CLARIN project is a large-scale pan-European collaborative effort to create, coordinate and make language resources and technology available and readily usable. CLARIN offers scholars the tools to allow computer-aided language processing, addressing one or more of the multiple roles language plays (i.e. carrier of cultural content and knowledge, instrument of communication, component of identity and object of study) in the Humanities and Social Sciences. More info at

FLaReNet in brief

A major condition for the take-off of the field of Language Resources and Language Technologies is the creation of a shared policy for the coming years. FLaReNet aims at developing a common vision of the area and fostering a European strategy for consolidating the sector, thus enhancing competitiveness at EU level and worldwide. By creating a consensus among major players in the field, FLaReNet's mission is to identify priorities as well as short-, medium- and long-term strategic objectives, and to provide consensual recommendations in the form of a plan of action for the EC, national organisations and industry. More info at

Background for the workshop

The current advances in, and rapid development of, human-computer interfaces in particular have to a large degree benefited from the successful collection and collaborative use of large-scale speech and multimodal databases. To further improve these interfaces and make them more intelligent and human-like, a major challenge is to understand human interaction and communication. To this end, we need not only new databases covering more complex interaction and unexplored areas, but also new, efficient means of exploiting these databases and facilitating their joint use by research groups engaged in a wide variety of disciplines.

Our research community is large and varied, and there is a great need to define the requirements for infrastructures that can be shared fruitfully and easily. The annotation of interesting corpora in this field is, however, divergent and would benefit from ways of achieving harmonization, interoperability and the like. We therefore feel that it is important to look into standards and best practices for encoding the relevant features of these corpora, so that a wide diversity of researchers can use them. Such features include intonation, facial expressions, gestures, turn-taking and emotions.

For speech and multimodal research topics in the humanities and social sciences, such as child-directed speech, accent shifts, dialects and sociolects, dialogs in different environments, and second language learners, there is no strong tradition of collecting and distributing relevant databases. Currently it is very much the researchers' own task to collect the data, or to try to do research on a corpus collected for some other purpose. Being able to easily locate, access and share data oriented to the humanities and social sciences would be a big step forward for many researchers. Thus, an important goal in increasing our understanding of human communication and making our applications more intelligent is to make language resources and technology available and readily usable for all kinds of scholars. Common standards are essential for achieving this.