Philippe Lalitte

Audio Descriptors: New Computer Tools for Stream and Segmentation Analysis

Audio descriptors are properties extracted from an audio signal by signal reduction. Originally designed for speech and timbre analysis in psychoacoustics, they are now widely used in Music Information Retrieval (MIR) and in music perception research. Some descriptors model low-level processing (e.g., intensity, timbral brightness, noise level, roughness), while others relate to cognitive processing (e.g., tonal strength or tonal clarity, harmonic changes, temporal segmentation, novelty profile). Several readily available software packages implement audio descriptors, such as Lucerne Audio Recording Analyzer (LARA), MIR Toolbox, Psysound, and Sonic Visualiser.
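As an illustration of what signal reduction means in practice, the following minimal Python sketch computes two low-level descriptors frame by frame: RMS intensity and the spectral centroid, a common correlate of timbral brightness. It does not reproduce the implementation of any of the packages named above. The function name `frame_descriptors`, the frame and hop sizes, and the synthetic test tones are assumptions chosen for the example.

```python
import numpy as np

def frame_descriptors(signal, sr, frame_len=1024, hop=512):
    """Return per-frame RMS intensity and spectral centroid (in Hz).

    Hypothetical helper for illustration; frame_len and hop are arbitrary choices.
    """
    window = np.hanning(frame_len)
    freqs = np.fft.rfftfreq(frame_len, d=1.0 / sr)  # centre frequency of each FFT bin
    rms, centroid = [], []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len]
        # Intensity: root mean square of the raw samples in the frame.
        rms.append(np.sqrt(np.mean(frame ** 2)))
        # Brightness: magnitude-weighted mean frequency of the windowed spectrum.
        mag = np.abs(np.fft.rfft(frame * window))
        total = mag.sum()
        centroid.append((freqs * mag).sum() / total if total > 0 else 0.0)
    return np.array(rms), np.array(centroid)

# Two synthetic one-second tones (assumed test signals, not real recordings).
sr = 8000
t = np.arange(sr) / sr
low = 0.5 * np.sin(2 * np.pi * 440 * t)
high = 0.5 * np.sin(2 * np.pi * 2000 * t)

rms_low, cent_low = frame_descriptors(low, sr)
rms_high, cent_high = frame_descriptors(high, sr)
print(cent_low.mean(), cent_high.mean())
```

Each one-second signal is reduced to two short curves that can be plotted against time. Both tones have the same intensity, but the higher tone yields a higher centroid, which is the sense in which the descriptor tracks brightness.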
This new approach to computer-assisted music analysis, called sub-symbolic, makes it possible to observe, measure, and analyse sound phenomena from audio recordings. Recordings provide access to information that is inaccessible to the symbolic approach (which recodes the score using various types of encoding), including information from performance (tempo, pitch and duration micro-variations, timbre, texture, acoustic space...) that may be critical for the perception of continuities and discretizations. This contribution examines the relevance of some audio descriptors for stream and segmentation analysis, which lead to a cognitive representation of shape and structure. First, I will propose a classification of audio descriptors according to their function in analysis; second, I will sketch a reliable methodology for using these computer tools.