Hybrid ML for Music Analysis

Hybrid machine-learning (ML) approaches that combine deep learning with model-based methods promise the "best of both worlds." While some methods can be combined in a common framework, e.g., mean-field variational message passing and variational autoencoders [1], realizing such hybrid methods is not trivial. One challenge is the computational complexity of model-based algorithms, which slows down the training of the ML part. The aim of this thesis is therefore to investigate hybrid inference methods that combine deep learning with model-based approaches, e.g., in the context of multi-pitch estimation [2]. There, the signals from tonal instruments can be modeled well as periodic signals (e.g., using a Fourier series), whereas non-tonal instruments such as a drum kit or other percussive instruments cannot be modeled in the same way.
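As a rough illustration of the model-based side, a single tonal note can be sketched as a finite Fourier series, and its fundamental frequency can be found by checking which candidate explains the most energy at its harmonics. This is a minimal single-pitch sketch and not the method of [2]; the function names, the sampling rate, the number of harmonics, and the candidate grid are all assumptions made for illustration.

```python
import math

def harmonic_signal(f0, amplitudes, phases, fs, n_samples):
    """Synthesize x[n] = sum_k a_k * cos(2*pi*k*f0*n/fs + phi_k), k = 1..K."""
    return [
        sum(a * math.cos(2 * math.pi * (k + 1) * f0 * n / fs + p)
            for k, (a, p) in enumerate(zip(amplitudes, phases)))
        for n in range(n_samples)
    ]

def harmonic_energy(x, f0, n_harmonics, fs):
    """Energy of x found at the first n_harmonics integer multiples of f0."""
    total = 0.0
    for k in range(1, n_harmonics + 1):
        w = 2 * math.pi * k * f0 / fs
        # Correlate x with a cosine and a sine at the k-th harmonic.
        re = sum(v * math.cos(w * n) for n, v in enumerate(x))
        im = sum(v * math.sin(w * n) for n, v in enumerate(x))
        total += (re * re + im * im) / len(x) ** 2
    return total

def estimate_f0(x, candidates, n_harmonics, fs):
    """Pick the candidate fundamental whose harmonics capture the most energy."""
    return max(candidates, key=lambda f: harmonic_energy(x, f, n_harmonics, fs))

# Hypothetical example: a 220 Hz note with three harmonics, 0.1 s at 8 kHz.
x = harmonic_signal(220.0, [1.0, 0.5, 0.25], [0.0, 0.3, 0.6], fs=8000, n_samples=800)
f0_hat = estimate_f0(x, [200.0, 210.0, 220.0, 230.0, 240.0], n_harmonics=3, fs=8000)
```

A percussive signal has no such harmonic structure, so no candidate fundamental would stand out in this way. That gap is where a learned component could complement the model.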

Your Tasks

Literature research on hybrid ML methods.
Discuss/propose an architecture of a "neural enhanced" multi-pitch detection algorithm.
Implement and train the proposed algorithm/architecture.
Evaluate the performance and compare it against other methods.
Write your thesis.

Your Profile

Familiar with (statistical) signal processing.
Experience with training ML methods (particularly variational autoencoders) is beneficial.
Experience with TensorFlow, PyTorch, or similar ML toolchains is also beneficial.

References

[1] M. J. Johnson, D. K. Duvenaud, A. Wiltschko, R. P. Adams, and S. R. Datta, "Composing graphical models with neural networks for structured representations and fast inference," in Advances in Neural Information Processing Systems 29, 2016.
[2] J. Möderl, F. Pernkopf, K. Witrisal, and E. Leitinger, "Variational Inference of Structured Line Spectra Exploiting Group-Sparsity," arXiv preprint, 2023, doi: 10.48550/arXiv.2303.03017.

Contact

Jakob Möderl

Erik Leitinger (SPSC)