22.4 SD Mar 16
Cepstral Smoothing of Binary Masks for Convolutive Blind Separation of Speech Mixtures
Ibrahim Missaoui, Zied Lachiri
In this paper, we propose a novel separation system for extracting two speech signals from two microphone recordings. Our system combines a blind source separation (BSS) technique with cepstral smoothing of binary time-frequency masks. The latter consists of two steps. First, the two binary masks are estimated from the separated output signals of the BSS algorithm. Second, cepstral smoothing is applied to these spectral masks in order to reduce the musical noise typically produced by time-frequency masking. Experiments were carried out with both artificially mixed speech signals, using a simulated room model, and two real recordings. The evaluation results are promising and show the effectiveness of our system.
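The two steps described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the smoothing constants `beta_env` and `beta_fine`, the split index `n_env`, the floor value, and the final clipping to at most 1 are all assumptions made for illustration, and each mask frame is treated as a full symmetric spectrum.

```python
import cmath
import math

def binary_masks(mag1, mag2):
    """Step 1: per-bin binary masks from two separated-output magnitudes."""
    m1 = [1.0 if a > b else 0.0 for a, b in zip(mag1, mag2)]
    m2 = [1.0 - v for v in m1]
    return m1, m2

def _dft(x, inverse=False):
    """Plain O(n^2) DFT, enough for a small illustration."""
    n = len(x)
    sign = 1.0 if inverse else -1.0
    out = [
        sum(x[k] * cmath.exp(sign * 2j * math.pi * k * m / n) for k in range(n))
        for m in range(n)
    ]
    return [v / n for v in out] if inverse else out

def cepstral_smooth(mask_frames, beta_env=0.2, beta_fine=0.8, n_env=1, floor=1e-3):
    """Step 2: recursive temporal smoothing of the mask cepstrum.

    Low quefrencies (below n_env, an assumed split) are smoothed lightly,
    higher quefrencies strongly. All constants are hypothetical.
    """
    smoothed, prev = [], None
    for mask in mask_frames:
        n = len(mask)
        # Floor the mask so that the logarithm is defined for zero entries.
        log_mask = [math.log(max(v, floor)) for v in mask]
        ceps = [c.real for c in _dft(log_mask, inverse=True)]
        if prev is not None:
            for q in range(n):
                beta = beta_env if min(q, n - q) < n_env else beta_fine
                ceps[q] = beta * prev[q] + (1.0 - beta) * ceps[q]
        prev = ceps
        out = [math.exp(v.real) for v in _dft(ceps)]
        # Clip back into [floor, 1]; this clipping is an assumption of the sketch.
        smoothed.append([min(1.0, max(floor, v)) for v in out])
    return smoothed
```

In this sketch, a bin that flips from 0 to 1 in a single frame is only partially opened, which is the kind of isolated spectral peak that tends to be heard as musical noise.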
LG Jan 23, 2019
Deep Clustering with a Dynamic Autoencoder: From Reconstruction towards Centroids Construction
Nairouz Mrabah, Naimul Mefraz Khan, Riadh Ksantini et al.
In unsupervised learning, there is no apparent straightforward cost function that can capture the significant factors of variation and similarity. Since natural systems have smooth dynamics, an opportunity is lost if an unsupervised objective function remains static during the training process. The absence of concrete supervision suggests that smooth dynamics should be integrated. Compared to classical static cost functions, dynamic objective functions make better use of the gradual and uncertain knowledge acquired through pseudo-supervision. In this paper, we propose Dynamic Autoencoder (DynAE), a novel model for deep clustering that overcomes a clustering-reconstruction trade-off by gradually and smoothly eliminating the reconstruction objective function in favor of a construction one. Experimental evaluations on benchmark datasets show that our approach achieves state-of-the-art results compared to the most relevant deep clustering methods.
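The idea of a dynamic objective that gradually trades a reconstruction term for a construction term can be sketched as a time-dependent blend of two losses. This is only an illustration: the cosine ramp, the function names, and the global (rather than per-sample) weighting are assumptions, and the abstract does not specify how DynAE actually schedules the transition.

```python
import math

def blend_weight(step, total_steps):
    """Smooth ramp from 0 to 1 over training (hypothetical cosine schedule)."""
    t = min(max(step / total_steps, 0.0), 1.0)
    return 0.5 * (1.0 - math.cos(math.pi * t))

def dynamic_loss(recon_loss, construct_loss, step, total_steps):
    """Blend the two objectives; the reconstruction term fades out over time."""
    a = blend_weight(step, total_steps)
    return (1.0 - a) * recon_loss + a * construct_loss
```

At step 0 the objective is pure reconstruction, and at the final step it is pure construction, with a smooth transition in between.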
SD Oct 14, 2012
Blind speech separation based on undecimated wavelet packet-perceptual filterbanks and independent component analysis
Ibrahim Missaoui, Zied Lachiri
In this paper, we address the problem of blind separation of speech mixtures. We propose a new blind speech separation system that integrates a perceptual filterbank and independent component analysis (ICA) using a kurtosis criterion. The perceptual filterbank was designed by adjusting the undecimated wavelet packet decomposition (UWPD) tree to match the critical band characteristics of the psychoacoustic model. Our proposed technique consists of transforming the observation signals into an adequate representation using UWPD and a kurtosis maximization criterion in a new preprocessing step, in order to increase the non-Gaussianity, which is a prerequisite for ICA. Experiments were carried out with instantaneous mixtures of two speech sources using two sensors. The results show that the proposed method gives a considerable improvement compared with FastICA and other techniques.
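The kurtosis criterion mentioned in this abstract can be illustrated with a short sketch. It computes the excess kurtosis of a signal and picks, among candidate representations, the one that is most super-Gaussian. The helper `most_non_gaussian` and the idea of passing candidates as plain lists are assumptions for illustration; the UWPD stage itself is not shown.

```python
def excess_kurtosis(x):
    """Excess kurtosis: 0 for a Gaussian, positive for peaky (super-Gaussian) data."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x) / n
    m4 = sum((v - mean) ** 4 for v in x) / n
    return m4 / (var ** 2) - 3.0

def most_non_gaussian(candidates):
    """Pick the candidate signal with the largest excess kurtosis (hypothetical helper)."""
    return max(candidates, key=excess_kurtosis)
```

A sparse signal with a few large samples scores higher than a two-level signal of constant magnitude, matching the intuition that speech-like representations are strongly super-Gaussian.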
CV Sep 25, 2012
Environmental Sounds Spectrogram Classification using Log-Gabor Filters and Multiclass Support Vector Machines
Sameh Souli, Zied Lachiri
This paper presents novel approaches for efficient feature extraction from environmental sound magnitude spectrograms. We propose an approach based on the visual domain, comprising three methods. The first method applies a single log-Gabor filter to each spectrogram, followed by a mutual information procedure. In the second method, the spectrogram goes through the same steps as in the first method, but with an averaged bank of 12 log-Gabor filters. The third method segments the spectrogram into three patches and then applies the second method to each patch. The classification results show that the second method is the most efficient in our environmental sound classification system.
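For readers unfamiliar with log-Gabor filters, the sketch below evaluates the standard one-dimensional (radial) log-Gabor transfer function and the average response of a small bank. It covers only the radial frequency component; the orientation component, the filter parameters, and the centre frequencies used in the paper are not given in the abstract, so `sigma_ratio` and the list of centres are assumptions.

```python
import math

def log_gabor(f, f0, sigma_ratio=0.65):
    """Radial log-Gabor response at frequency f for centre frequency f0.

    sigma_ratio is the bandwidth parameter sigma/f0 (assumed value).
    The response is zero at f = 0, so the filter has no DC component.
    """
    if f <= 0.0:
        return 0.0
    return math.exp(-(math.log(f / f0) ** 2) / (2.0 * math.log(sigma_ratio) ** 2))

def averaged_bank(f, centres, sigma_ratio=0.65):
    """Average response of a bank of filters at frequency f (hypothetical helper)."""
    return sum(log_gabor(f, c, sigma_ratio) for c in centres) / len(centres)
```

The response peaks at 1 at the centre frequency and is symmetric on a logarithmic frequency axis, so `log_gabor(2 * f0, f0)` equals `log_gabor(f0 / 2, f0)`.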