Matevž Pesek/Aleš Leonardis/Matija Marolt

Compositional Hierarchical Model for Pattern Discovery in Music

We present a biologically inspired hierarchical model for music information retrieval. The model can be treated as a deep learning architecture and offers an alternative to deep architectures based on neural networks. Its main features are generativeness and transparency, which allow insight into the musical concepts learned from a set of input signals. The model consists of multiple layers, each composed of a number of parts. This hierarchical organization corresponds well to the hierarchical structures found in music.

If the model is trained on time-frequency representations of music signals, parts in lower layers correspond to low-level concepts (e.g. tone partials), while parts in higher layers combine lower-level representations into more complex concepts (tones, chords). The layers are learned one by one, without supervision. Parts in each layer are compositions of parts from previous layers, and the learning process is driven by statistics of co-occurrences of their activations. We show how the same principle of compositional hierarchical modeling can be used for event-based modeling by applying statistically driven unsupervised learning to time-domain patterns. Compositions are formed from event progressions in a music piece. Learning exposes frequently co-occurring progressions and binds them into more complex compositions on higher layers. We show how the time-domain model can be applied to symbolic data as well as to audio recordings.
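
The co-occurrence-driven, layer-by-layer learning described above can be illustrated with a small toy sketch on symbolic data. This is not the authors' implementation. The function names (`learn_compositions`, `activate`), the use of pitch intervals as the lowest-layer parts, the restriction to directly adjacent events, and the `min_count` threshold are all assumptions made for illustration only.

```python
from collections import Counter

def learn_compositions(activations, min_count=2):
    """Learn one layer: keep pairs of parts that frequently activate
    one after the other (hypothetical simplification of co-occurrence
    statistics; only directly adjacent activations are considered)."""
    counts = Counter(
        (part_a, part_b)
        for (_, part_a), (_, part_b) in zip(activations, activations[1:])
    )
    return {pair for pair, n in counts.items() if n >= min_count}

def activate(activations, compositions):
    """Re-describe the input in terms of the learned compositions.
    Each activation is (position, part), where the new part is the
    pair of lower-layer parts it is composed of."""
    return [
        (pos_a, (part_a, part_b))
        for (pos_a, part_a), (_, part_b) in zip(activations, activations[1:])
        if (part_a, part_b) in compositions
    ]

# Toy melody as MIDI pitches (assumed example data).
pitches = [60, 62, 64, 60, 62, 64, 67, 65]

# Lowest layer (assumption): each part is a pitch interval between
# consecutive events, which makes patterns transposition-invariant.
layer0 = [(i, b - a) for i, (a, b) in enumerate(zip(pitches, pitches[1:]))]

# Learn the next layer from co-occurrence counts, then infer its activations.
layer1_parts = learn_compositions(layer0, min_count=2)
layer1 = activate(layer0, layer1_parts)

# The same two functions can be reapplied to build a higher layer.
layer2_parts = learn_compositions(layer1, min_count=2)
```

Because every learned part is simply a pair of lower-layer parts, each composition can be unfolded back into the events it covers, which is a simplified view of the transparency the model aims for. A realistic model would also need to handle gaps between events, overlapping activations, and hypotheses that compete for the same events, all of which this sketch omits.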