Music signal processing for automatic extraction of harmonic and rhythmic information

Khadkevich, Maksim (2011) Music signal processing for automatic extraction of harmonic and rhythmic information. PhD thesis, University of Trento, Fondazione Bruno Kessler.

PDF (Music signal processing for automatic extraction of harmonic and rhythmic information) - Doctoral Thesis
Available under License Creative Commons Attribution Non-commercial.



This thesis addresses the automatic extraction of harmonic and rhythmic information from music audio signals using a statistical framework and advanced signal processing methods. Among the various research directions, automatic extraction of chords and key has always been of great interest to the Music Information Retrieval (MIR) community. Chord progressions and key information can serve as a robust mid-level representation for a variety of MIR tasks. We propose statistical approaches to the automatic extraction of chord progressions within a framework based on Hidden Markov Models (HMMs). The general ideas we rely on have already proved effective in speech recognition. We propose novel probabilistic approaches that comprise an acoustic modeling layer and a language modeling layer, and we investigate the use of standard N-grams and Factored Language Models (FLMs) for automatic chord recognition. Another central topic of this work is feature extraction. We develop a set of new features belonging to the chroma family, including novel chroma features based on a Pseudo-Quadrature Mirror Filter (PQMF) bank, and we show the advantage of using the Time-Frequency Reassignment (TFR) technique to derive better acoustic features. Tempo estimation and beat structure extraction are among the most challenging tasks in MIR. We develop a novel method for beat/downbeat estimation from audio. It is based on the same statistical approach and consists of two hierarchical levels: acoustic modeling and beat sequence modeling. We define a specific beat duration model that exploits an HMM structure without self-transitions, and we introduce a new feature set that exploits a harmonic-impulsive component separation technique. The proposed methods are compared with numerous state-of-the-art approaches through participation in the MIREX competition, which is currently the best impartial assessment of MIR systems.
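As a rough illustration of the HMM-based chord recognition idea described in the abstract, the following sketch decodes a chord sequence from chroma frames with the Viterbi algorithm. It is not the method developed in the thesis: the binary triad templates, the cosine-similarity emission score (a stand-in for a trained acoustic model), the single `p_stay` transition parameter (a stand-in for an N-gram or factored language model), and all function names are assumptions made for this example.

```python
import math

NOTES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def chord_templates():
    """Binary 12-bin chroma templates for the 12 major and 12 minor triads."""
    templates = {}
    for root in range(12):
        for suffix, third in (("", 4), ("m", 3)):
            bins = [0.0] * 12
            for interval in (0, third, 7):
                bins[(root + interval) % 12] = 1.0
            templates[NOTES[root] + suffix] = bins
    return templates


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0 else 0.0


def viterbi_chords(chroma_frames, p_stay=0.9, sharpness=10.0):
    """Return the most likely chord label per frame.

    Assumptions (hypothetical, for illustration only):
    - emission score = sharpness * cosine(frame, template), an unnormalized
      log-score rather than a trained acoustic likelihood;
    - transitions: probability p_stay of keeping the chord, with the
      remainder spread uniformly over the other chords.
    """
    if not chroma_frames:
        return []
    templates = chord_templates()
    labels = list(templates)
    n = len(labels)
    log_stay = math.log(p_stay)
    log_move = math.log((1.0 - p_stay) / (n - 1))

    def emission(frame):
        return [sharpness * cosine(frame, templates[label]) for label in labels]

    # Initialization with a uniform prior over chords.
    score = [-math.log(n) + e for e in emission(chroma_frames[0])]
    backpointers = []

    # Recursion: best predecessor for every chord at every frame.
    for frame in chroma_frames[1:]:
        emit = emission(frame)
        new_score, pointers = [], []
        for j in range(n):
            best_i = max(
                range(n),
                key=lambda i: score[i] + (log_stay if i == j else log_move),
            )
            best = score[best_i] + (log_stay if best_i == j else log_move)
            new_score.append(best + emit[j])
            pointers.append(best_i)
        score = new_score
        backpointers.append(pointers)

    # Backtracking from the best final state.
    state = max(range(n), key=lambda i: score[i])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return [labels[s] for s in path]


if __name__ == "__main__":
    t = chord_templates()
    # A single A-minor-like frame inside a run of C major is smoothed away.
    print(viterbi_chords([t["C"], t["C"], t["Am"], t["C"], t["C"]]))
```

With these assumed parameters, the transition penalty outweighs the gain from following a single outlier frame, so the decoder keeps the surrounding chord. This is the smoothing role that the sequence-modeling layer plays on top of the frame-wise acoustic scores.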

Item Type: Doctoral Thesis (PhD)
Doctoral School: Information and Communication Technology
Subjects: Area 01 - Scienze matematiche e informatiche > INF/01 INFORMATICA
Repository Staff approval on: 20 Feb 2012 09:40
