Bridging the semantic gap in Music Information Retrieval: Modelling perceptual-based features in music audio
This project aims at developing new computer tools for characterizing and indexing music audio.
Today, computers and the Internet are commonly used for all aspects of music culture from production to listening. When a typical
online music database contains 5 million songs, new computer tools are needed for characterizing and indexing music audio. This
is the focus of the new research field Music Information Retrieval (MIR). In an earlier project, music analysis tools were developed
that were based on symbolic input, i.e. notes. In the proposed project, we will extend these tools to work with audio input instead.
This poses additional challenges in terms of basic audio feature extraction. We therefore propose a new approach.
The idea is to identify and model new descriptions of music corresponding to overall musical features such as rhythmic
clarity, harmonic complexity or emotional character, rather than relying on traditional music analysis in terms of notes, rhythm and
harmony. These features will be modeled by combining low-level features using data-mining techniques based on perceptual
criteria. The complete analysis will use a new intermediate layer of features, resulting in a four-layered approach with audio,
low-level features, mid-level perceptual features, and semantic descriptions. A successful result will bridge the gap between earlier
studies in music psychology and contemporary data-mining projects within the MIR field, potentially leading to new ways of
understanding and modeling music audio.
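
As a rough illustration of the four-layered idea, the sketch below passes synthetic audio frames through two low-level features, fits a mid-level perceptual feature to listener ratings, and then fits a semantic description to that mid-level feature. Everything specific here is an assumption made for the example and not taken from the project: the choice of RMS level and zero-crossing rate as low-level features, ordinary least-squares regression as the stand-in for "data-mining techniques", the function names, and all rating values, which are made up.

```python
import math

def sine_frame(amplitude, freq_hz, sr=8000, n=400):
    """Layer 1 (audio): a synthetic sine frame standing in for real music audio."""
    return [amplitude * math.sin(2 * math.pi * freq_hz * i / sr) for i in range(n)]

def low_level_features(samples):
    """Layer 2: two simple low-level descriptors (assumed for this example)."""
    n = len(samples)
    rms = math.sqrt(sum(x * x for x in samples) / n)
    # Fraction of adjacent sample pairs whose sign differs.
    zcr = sum(1 for a, b in zip(samples, samples[1:]) if (a < 0) != (b < 0)) / (n - 1)
    return [rms, zcr]

def solve(a, b):
    """Solve a small linear system a x = b by Gauss-Jordan elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def fit_linear(rows, targets):
    """Least-squares weights (intercept first) via the normal equations."""
    x = [[1.0] + list(r) for r in rows]
    k = len(x[0])
    xtx = [[sum(r[i] * r[j] for r in x) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(x, targets)) for i in range(k)]
    return solve(xtx, xty)

def predict(weights, row):
    """Apply fitted weights (intercept first) to one feature row."""
    return weights[0] + sum(w * v for w, v in zip(weights[1:], row))

def demo():
    settings = [(0.2, 200), (0.8, 200), (0.2, 1000), (0.8, 1000), (0.5, 600)]
    frames = [sine_frame(a, f) for a, f in settings]
    low = [low_level_features(fr) for fr in frames]

    # Hypothetical listener ratings; real values would come from listening tests.
    mid_ratings = [2.0, 6.0, 3.0, 7.5, 5.0]
    semantic_ratings = [1.5, 5.0, 2.5, 6.5, 4.0]

    # Layer 3: mid-level perceptual feature predicted from low-level features.
    w_mid = fit_linear(low, mid_ratings)
    mid = [[predict(w_mid, row)] for row in low]

    # Layer 4: semantic description predicted from the mid-level feature.
    w_sem = fit_linear(mid, semantic_ratings)
    return [predict(w_sem, row) for row in mid]
```

Calling `demo()` returns one predicted semantic rating per frame. The point of the intermediate layer in this sketch is that the final model sees only a small number of perceptually meaningful predictors instead of the raw low-level features.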
Group: Sound and Music Computing
Anders Friberg (Project leader)
Funding: VR (621-2009-4285)
Duration: 2010 - 2013
Keywords: music analysis, feature extraction, perception, music information retrieval
Publications:
(2017). Cross-cultural aspects of perceptual features in K-Pop: A pilot study comparing Chinese and Swedish listeners. In International Computer Music Conference (ICMC), Shanghai, China.
(2014). Using listener-based perceptual features as intermediate representations in music information retrieval. Journal of the Acoustical Society of America, 136(4), 1951-1963.
(2014). Using perceptually defined music features in music information retrieval. arXiv:1403.7923 [cs.IR].
(2012). Music Listening from an Ecological Perspective. Poster presented at the 12th International Conference on Music Perception and Cognition and the 8th Triennial Conference of the European Society for the Cognitive Sciences of Music.
(2011). Emotion rendering in music: Range and characteristic values of seven musical variables. Cortex, 47(9), 1068-1081.
(2011). Influence of pitch, loudness, and timbre on the perception of instrument dynamics. Journal of the Acoustical Society of America - Express Letters, EL193-EL199.
(2011). A Comparison of Perceptual Ratings and Computed Audio Features. In 8th Sound and Music Computing Conference, Padova, Italy.
(2011). Perceptual ratings of musical parameters. Mainz: Schott, 2011 (Klang und Begriff 4).
(2011). Evaluation of Musical Feature Extraction Tools using Perceptual Ratings. Master's thesis.
(2009). How fast is the tempo in a happy music performance? In Front. Hum. Neurosci. Conference Abstract: Tuning the Brain for Music. Helsinki, Finland.
(2009). Enabling emotional expression and interaction with new expressive interfaces. In Front. Hum. Neurosci. Conference Abstract: Tuning the Brain for Music. Helsinki, Finland.
(2008). In Proc. of the 11th Int. Conference on Digital Audio Effects (DAFx-08), Espoo, Finland (pp. 1-6).