

Bridging the semantic gap in Music Information Retrieval: Modelling perceptual-based features in music audio

This project aims at developing new computer tools for characterizing and indexing music audio.

Today, computers and the Internet are used in all aspects of music culture, from production to listening. When a typical online music database contains 5 million songs, new computer tools are needed for characterizing and indexing music audio. This is the focus of the new research field Music Information Retrieval (MIR). In an earlier project, music analysis tools were developed that were based on symbolic input, i.e. notes. In the proposed project, we will extend these tools to work on audio input instead. This poses additional challenges in terms of basic audio feature extraction, so a completely new approach is suggested. The idea is to identify and model new descriptions of music corresponding to overall musical features such as rhythmic clarity, harmonic complexity or emotional character, rather than relying on traditional music analysis in terms of notes, rhythm and harmony. These features will be modeled by combining low-level features using data-mining techniques based on perceptual criteria. The complete analysis will thus use a new intermediate layer of features, resulting in a four-layered approach: audio, low-level features, mid-level perceptual features, and semantic descriptions. A successful result will bridge the gap between earlier studies in music psychology and contemporary data-mining projects within the MIR field, potentially leading to new ways of understanding and modeling music audio.
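The four-layered approach described above can be illustrated with a minimal sketch. It is not the project's implementation: the two low-level features (RMS level and zero-crossing rate), the mid-level feature names, the weights and the threshold are all hypothetical placeholders chosen for illustration. In the project, the mapping from low-level to perceptual features would instead be fitted to listener ratings using data-mining techniques.

```python
import math

# Layer 1: audio, here simply a list of samples in the range [-1, 1].

def low_level_features(samples):
    """Layer 2: simple descriptors computed directly from the audio samples."""
    n = len(samples)
    rms = math.sqrt(sum(x * x for x in samples) / n)
    # Fraction of adjacent sample pairs whose sign differs.
    crossings = sum(1 for a, b in zip(samples, samples[1:]) if (a < 0) != (b < 0))
    return {"rms": rms, "zcr": crossings / (n - 1)}

# Hypothetical weights (assumption): in practice these would be learned
# from perceptual ratings, not set by hand.
PERCEPTUAL_WEIGHTS = {
    "energy": {"rms": 1.0, "zcr": 0.5},
    "brightness": {"rms": 0.0, "zcr": 1.0},
}

def perceptual_features(low):
    """Layer 3: mid-level features as weighted sums of low-level features."""
    return {
        name: sum(weight * low[key] for key, weight in weights.items())
        for name, weights in PERCEPTUAL_WEIGHTS.items()
    }

def semantic_description(mid, threshold=0.5):
    """Layer 4: a coarse label derived from the mid-level features.

    The threshold is an arbitrary illustrative value.
    """
    return "energetic" if mid["energy"] > threshold else "calm"

def analyse(samples):
    """Run all layers and return the output of layers 2, 3 and 4."""
    low = low_level_features(samples)
    mid = perceptual_features(low)
    return low, mid, semantic_description(mid)

if __name__ == "__main__":
    # One second of a 440 Hz sine tone sampled at 8 kHz.
    tone = [0.8 * math.sin(2 * math.pi * 440 * t / 8000) for t in range(8000)]
    print(analyse(tone))
```

The point of the intermediate layer is that the final label depends only on a small set of perceptually meaningful features, so the mapping from audio to semantics can be inspected and evaluated one layer at a time.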

Group: Sound and Music Computing

Anders Friberg (Project leader)
Erwin Schoonderwaldt

Funding: VR (621-2009-4285)

Duration: 2010 - 2013

Keywords: music analysis, feature extraction, perception, music information retrieval

Related publications:


(2017). Cross-cultural aspects of perceptual features in K-Pop: A pilot study comparing Chinese and Swedish listeners. In International Computer Music Conference (ICMC), Shanghai, China. [abstract] [pdf]


Friberg, A., Schoonderwaldt, E., Hedblad, A., Fabiani, M., & Elowsson, A. (2014). Using listener-based perceptual features as intermediate representations in music information retrieval. Journal of the Acoustical Society of America, 136(4), 1951-1963. [pdf]

Friberg, A., Schoonderwaldt, E., Hedblad, A., Fabiani, M., & Elowsson, A. (2014). Using perceptually defined music features in music information retrieval. arXiv:1403.7923 [cs.IR]. [abstract] [link]


Friberg, A. (2012). Music Listening from an Ecological Perspective. In Poster presented at the 12th International Conference on Music Perception and Cognition and the 8th Triennial Conference of the European Society for the Cognitive Sciences of Music. [pdf]


Bresin, R., & Friberg, A. (2011). Emotion rendering in music: Range and characteristic values of seven musical variables. Cortex, 47(9), 1068-1081. [link]

Fabiani, M., & Friberg, A. (2011). Influence of pitch, loudness, and timbre on the perception of instrument dynamics. Journal of the Acoustical Society of America - Express Letters, EL193-EL199. [abstract] [pdf]

Friberg, A., & Hedblad, A. (2011). A Comparison of Perceptual Ratings and Computed Audio Features. In 8th Sound and Music Computing Conference, Padova, Italy. [pdf]

Friberg, A., Schoonderwaldt, E., & Hedblad, A. (2011). Perceptual ratings of musical parameters. Mainz: Schott 2011, (Klang und Begriff 4). [pdf]

Hedblad, A. (2011). Evaluation of Musical Feature Extraction Tools using Perceptual Ratings. Master's thesis. [pdf]


Bresin, R., & Friberg, A. (2009). How fast is the tempo in a happy music performance? In Front. Hum. Neurosci. Conference Abstract: Tuning the Brain for Music. Helsinki, Finland.

Friberg, A., Bresin, R., Hansen, K. F., & Fabiani, M. (2009). Enabling emotional expression and interaction with new expressive interfaces. In Front. Hum. Neurosci. Conference Abstract: Tuning the Brain for Music. Helsinki, Finland. [link]


Friberg, A. (2008). In Proc. of the 11th Int. Conference on Digital Audio Effects (DAFx-08), Espoo, Finland (pp. 1-6). [pdf]

Published by: TMH, Speech, Music and Hearing

Last updated: 2012-11-09