AS Apr 17, 2018
The 2018 Signal Separation Evaluation Campaign
Fabian-Robert Stöter, Antoine Liutkus, Nobutaka Ito
This paper reports the organization and results of the 2018 community-based Signal Separation Evaluation Campaign (SiSEC 2018). This year's edition focused on audio and pursued the effort towards scaling up and making it easier to prototype audio separation software in an era of machine-learning-based systems. For this purpose, we prepared a new music separation database, MUSDB18, featuring close to 10 h of audio. Additionally, open-source software was released to automatically load, process, and report performance on MUSDB18. Furthermore, a new official Python version of the BSSEval toolbox was released, along with reference implementations of three oracle separation methods: ideal binary mask, ideal ratio mask, and multichannel Wiener filter. Finally, we report the results obtained by the participants.
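As a rough illustration of two of the oracle methods named above, the sketch below computes an ideal binary mask and an ideal ratio mask from known source magnitude spectrograms. It is a minimal NumPy sketch, not the campaign's reference implementation: the function names, the winner-take-all rule for the binary mask, and the `power` and `eps` parameters are assumptions made for illustration.

```python
import numpy as np


def ideal_binary_mask(source_mags):
    """Winner-take-all binary mask (illustrative variant, assumed here).

    source_mags: array of shape (n_sources, n_freq, n_frames) holding the
    non-negative magnitude spectrogram of each true source.
    Returns an array of the same shape that is 1.0 where a source has the
    largest magnitude in a time-frequency bin and 0.0 elsewhere.
    """
    winner = np.argmax(source_mags, axis=0)  # (n_freq, n_frames)
    source_ids = np.arange(source_mags.shape[0])[:, None, None]
    return (source_ids == winner[None]).astype(float)


def ideal_ratio_mask(source_mags, power=2.0, eps=1e-12):
    """Soft mask: each source's share of the total energy in every bin.

    `power` and `eps` are hypothetical parameters for this sketch.
    """
    p = source_mags ** power
    return p / (p.sum(axis=0, keepdims=True) + eps)


# Usage: two sources, one frequency bin, two frames.
mags = np.array([[[3.0, 1.0]], [[1.0, 2.0]]])
ibm = ideal_binary_mask(mags)
irm = ideal_ratio_mask(mags)
```

Multiplying either mask with the mixture spectrogram gives an oracle estimate of each source, which serves as an upper reference point when reporting the performance of real separation systems.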
SD Jan 21, 2021
A Joint Diagonalization Based Efficient Approach to Underdetermined Blind Audio Source Separation Using the Multichannel Wiener Filter
Nobutaka Ito, Rintaro Ikeshita, Hiroshi Sawada et al.
This paper presents a computationally efficient approach to blind source separation (BSS) of audio signals, applicable even when there are more sources than microphones (i.e., the underdetermined case). When there are as many sources as microphones (i.e., the determined case), BSS can be performed computationally efficiently by independent component analysis (ICA). However, ICA is essentially inapplicable to the underdetermined case. Another BSS approach using the multichannel Wiener filter (MWF) is applicable even to this case, and encompasses full-rank spatial covariance analysis (FCA) and multichannel non-negative matrix factorization (MNMF). However, these methods require massive numbers of matrix inversions to design the MWF, and are thus computationally inefficient. To overcome this drawback, we exploit the well-known property of diagonal matrices that matrix inversion amounts to mere inversion of the diagonal elements and can thus be performed computationally efficiently. This makes it possible to drastically reduce the computational cost of the above matrix inversions based on a joint diagonalization (JD) idea, leading to computationally efficient BSS. Specifically, we restrict the N spatial covariance matrices (SCMs) of all N sources to a class of (exactly) jointly diagonalizable matrices. Based on this approach, we present FastFCA, a computationally efficient extension of FCA. We also present a unified framework for underdetermined and determined audio BSS, which highlights a theoretical connection between FastFCA and other methods. Moreover, we reveal that FastFCA can be regarded as a regularized version of approximate joint diagonalization (AJD).
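The diagonal-inversion property that the abstract relies on can be checked numerically. The sketch below is not the paper's algorithm. It assumes a hypothetical parameterization in which every SCM has the form R_n = P^{-H} diag(lam_n) P^{-1} for a shared invertible matrix P, so the inverse of the mixture covariance sum_n v_n R_n reduces to an element-wise reciprocal. All variable names and the random test data are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
M, N = 3, 4  # microphones, sources (N > M: underdetermined)

# Shared diagonalizer P (assumed well-conditioned for this demo).
P = np.eye(M) + 0.1 * (rng.standard_normal((M, M))
                       + 1j * rng.standard_normal((M, M)))
lam = rng.uniform(0.5, 2.0, size=(N, M))  # diagonal entries per source
v = rng.uniform(0.1, 1.0, size=N)         # source powers at one TF point

# Jointly diagonalizable SCMs: P^H R_n P = diag(lam[n]) for every n.
P_inv = np.linalg.inv(P)
R = np.stack([P_inv.conj().T @ np.diag(lam[n]) @ P_inv for n in range(N)])

# Direct route: build the mixture covariance and invert a full matrix.
X = np.einsum("n,nij->ij", v, R)
X_inv_direct = np.linalg.inv(X)

# Diagonal route: X = P^{-H} diag(d) P^{-1}, hence X^{-1} = P diag(1/d) P^H.
d = v @ lam                        # diagonal of sum_n v_n diag(lam[n])
X_inv_fast = (P / d) @ P.conj().T  # P / d scales column j of P by 1 / d[j]

print(np.allclose(X_inv_direct, X_inv_fast))
```

In the diagonal route, the only inversion that depends on the source powers is the element-wise reciprocal `1 / d`, which is where the saving over a full matrix inversion comes from.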
SD May 24, 2018
FastFCA-AS: Joint Diagonalization Based Acceleration of Full-Rank Spatial Covariance Analysis for Separating Any Number of Sources
Nobutaka Ito, Tomohiro Nakatani
Here we propose FastFCA-AS, an accelerated algorithm for Full-rank spatial Covariance Analysis (FCA), which is a robust audio source separation method proposed by Duong et al. ["Under-determined reverberant audio source separation using a full-rank spatial covariance model," IEEE Trans. ASLP, vol. 18, no. 7, pp. 1830-1840, Sept. 2010]. In the conventional FCA, matrix inversion and matrix multiplication are required at each time-frequency point in each iteration of an iterative parameter estimation algorithm. This causes a heavy computational load, thereby rendering the FCA infeasible in many applications. To overcome this drawback, we take a joint diagonalization approach, whereby matrix inversion and matrix multiplication are reduced to mere inversion and multiplication of diagonal entries. This makes the FastFCA-AS significantly faster than the FCA and even applicable to observed data of long duration or a situation with restricted computational resources. Although we have already proposed another acceleration of the FCA for two sources, the proposed FastFCA-AS is applicable to an arbitrary number of sources. In an experiment with three sources and three microphones, the FastFCA-AS was over 420 times faster than the FCA with a slightly better source separation performance.
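To make the cost difference concrete, the sketch below applies a multichannel Wiener filter at a single time-frequency point in two ways: with a full linear solve, and with per-entry gains under an assumed shared diagonalizer P satisfying P^H R_n P = diag(lam[n]). It only illustrates the general idea and does not reproduce the FastFCA-AS parameter estimation; the function names and argument layout are hypothetical.

```python
import numpy as np


def mwf_direct(x, v, R):
    """Conventional filter: s_n = v_n R_n (sum_k v_k R_k)^{-1} x.

    x: (M,) observed vector, v: (N,) source powers, R: (N, M, M) SCMs.
    Returns an (N, M) array of source image estimates.
    """
    X = np.einsum("n,nij->ij", v, R)   # mixture covariance
    X_inv_x = np.linalg.solve(X, x)    # full matrix solve at this point
    return np.stack([v[n] * R[n] @ X_inv_x for n in range(len(v))])


def mwf_diagonal(x, v, lam, P):
    """Same filter when P^H R_n P = diag(lam[n]) holds for all n.

    In the transformed domain y = P^H x, the filter is a vector of
    per-entry gains v_n lam_n / sum_k v_k lam_k.
    """
    y = P.conj().T @ x                      # (M,)
    gains = (v[:, None] * lam) / (v @ lam)  # (N, M), sums to 1 over n
    # Map back with P^{-H}; in this sketch P does not depend on v.
    return np.linalg.solve(P.conj().T, (gains * y).T).T
```

Because the gains sum to one across sources, the estimates returned by `mwf_diagonal` add up to the observed vector `x`, as expected of a Wiener filter under this model.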
SD May 17, 2018
FastFCA: A Joint Diagonalization Based Fast Algorithm for Audio Source Separation Using A Full-Rank Spatial Covariance Model
Nobutaka Ito, Shoko Araki, Tomohiro Nakatani
A source separation method using a full-rank spatial covariance model has been proposed by Duong et al. ["Under-determined Reverberant Audio Source Separation Using a Full-rank Spatial Covariance Model," IEEE Trans. ASLP, vol. 18, no. 7, pp. 1830-1840, Sep. 2010], which is referred to as full-rank spatial covariance analysis (FCA) in this paper. Here we propose FastFCA, a fast algorithm for estimating the model parameters of the FCA that is applicable to the two-source case. Though quite effective in source separation, the conventional FCA has a major drawback of expensive computation. Indeed, the conventional algorithm for estimating the model parameters of the FCA requires frame-wise matrix inversion and matrix multiplication. Therefore, the conventional FCA may be infeasible in applications with restricted computational resources. In contrast, the proposed FastFCA bypasses matrix inversion and matrix multiplication owing to joint diagonalization based on the generalized eigenvalue problem. Furthermore, the FastFCA is strictly equivalent to the conventional algorithm. An experiment has shown that the FastFCA was over 250 times faster than the conventional algorithm with virtually the same source separation performance.
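The two-source case works because any pair of Hermitian positive definite matrices can be exactly jointly diagonalized by solving a generalized eigenvalue problem. The sketch below shows that standard linear-algebra fact using a Cholesky reduction in NumPy. It is not the authors' implementation, and the function name `joint_diagonalize_pair` is hypothetical.

```python
import numpy as np


def joint_diagonalize_pair(R1, R2):
    """Solve R1 V = R2 V diag(w) for Hermitian positive definite R1, R2.

    Returns (w, V) such that V^H R1 V = diag(w) and V^H R2 V = I.
    """
    L = np.linalg.cholesky(R2)                  # R2 = L L^H
    Linv_R1 = np.linalg.solve(L, R1)            # L^{-1} R1
    C = np.linalg.solve(L, Linv_R1.conj().T)    # L^{-1} R1 L^{-H}
    C = 0.5 * (C + C.conj().T)                  # remove round-off asymmetry
    w, U = np.linalg.eigh(C)                    # C = U diag(w) U^H
    V = np.linalg.solve(L.conj().T, U)          # V = L^{-H} U
    return w, V


# Usage with two random Hermitian positive definite matrices.
rng = np.random.default_rng(0)
A1 = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
A2 = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
R1 = A1 @ A1.conj().T + 3 * np.eye(3)
R2 = A2 @ A2.conj().T + 3 * np.eye(3)
w, V = joint_diagonalize_pair(R1, R2)
```

With `V` in hand, any weighted sum `a * R1 + b * R2` equals `V^{-H} diag(a * w + b) V^{-1}`, so inverting it needs only reciprocals of `a * w + b`. Three or more matrices generally admit no exact joint diagonalizer.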