Complex and Quaternionic Principal Component Pursuit and Its Application to Audio Separation
This work addresses a domain-specific problem in audio signal processing by providing a more accurate method for source separation, though it is incremental as it builds on existing principal component pursuit frameworks.
The authors address the lack of phase information in existing real-valued principal component pursuit methods by extending them to the complex and quaternionic cases. They apply the extended methods to singing voice separation and report improved results on the iKala and MSD100 datasets.
Recently, principal component pursuit has received increasing attention in signal processing research, with applications ranging from source separation to video surveillance. So far, all existing formulations are real-valued and lack the concept of phase, which is inherent in inputs such as complex spectrograms or color images. Thus, in this letter, we extend principal component pursuit to the complex and quaternionic cases to account for the missing phase information. Specifically, we present both complex and quaternionic proximity operators for the $\ell_1$- and trace-norm regularizers. These operators can be used in conjunction with proximal minimization methods such as the inexact augmented Lagrange multiplier algorithm. The new algorithms are then applied to the singing voice separation problem, which aims to separate the singing voice from the instrumental accompaniment. Results on the iKala and MSD100 datasets confirm the usefulness of phase information in principal component pursuit.
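To make the role of the two proximity operators concrete, the following is a minimal NumPy sketch of the complex case only. It is an illustration under stated assumptions, not the authors' implementation: the function names (`soft_threshold`, `singular_value_threshold`, `complex_pcp`), the default $\lambda = 1/\sqrt{\max(m, n)}$, and the initialization and growth of the penalty parameter $\mu$ are borrowed from common practice for real-valued principal component pursuit and are not taken from this letter. The quaternionic case would additionally require a quaternion singular value decomposition and is not shown.

```python
import numpy as np


def soft_threshold(Z, tau):
    """Entrywise prox of tau * ||.||_1 for complex input (assumes tau > 0).

    Each entry keeps its phase; only its magnitude is shrunk by tau.
    """
    mag = np.abs(Z)
    # Equals max(1 - tau/|z|, 0) but avoids dividing by zero when |z| = 0.
    scale = np.maximum(mag - tau, 0.0) / np.maximum(mag, tau)
    return scale * Z


def singular_value_threshold(Z, tau):
    """Prox of tau * trace norm: shrink the singular values of Z by tau."""
    U, s, Vh = np.linalg.svd(Z, full_matrices=False)
    # U * s_shrunk scales the columns of U, i.e. U @ diag(s_shrunk).
    return (U * np.maximum(s - tau, 0.0)) @ Vh


def complex_pcp(X, lam=None, rho=1.5, tol=1e-7, max_iter=200):
    """Sketch of inexact-ALM principal component pursuit for a complex matrix.

    Approximately solves  min ||L||_* + lam * ||S||_1  s.t.  X = L + S.
    The defaults for lam, mu, and rho are assumptions taken from common
    real-valued PCP practice, not values stated in the source text.
    """
    m, n = X.shape
    if lam is None:
        lam = 1.0 / np.sqrt(max(m, n))
    spectral_norm = np.linalg.norm(X, 2)
    x_fro = np.linalg.norm(X, "fro")
    # Dual variable and penalty initialization (assumed heuristics).
    Y = X / max(spectral_norm, np.abs(X).max() / lam)
    mu = 1.25 / spectral_norm
    S = np.zeros_like(X)
    L = np.zeros_like(X)
    for _ in range(max_iter):
        # Low-rank update via the trace-norm proximity operator.
        L = singular_value_threshold(X - S + Y / mu, 1.0 / mu)
        # Sparse update via the complex l1 proximity operator.
        S = soft_threshold(X - L + Y / mu, lam / mu)
        residual = X - L - S
        Y = Y + mu * residual
        mu *= rho
        if np.linalg.norm(residual, "fro") <= tol * x_fro:
            break
    return L, S
```

In a singing voice separation setting, `X` would typically be a complex spectrogram (an assumption about usage here, consistent with the inputs the abstract mentions), with `L` modeling the repetitive accompaniment and `S` the sparser vocal component. Because both operators act on complex values directly, the phase of each retained entry is preserved rather than discarded, which is the property the abstract argues real-valued formulations lack.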