R. Coelho

AS
5 papers
26 citations
Novelty 50%
AI Score 23

5 Papers

AS · Dec 9, 2021
Harmonic and non-Harmonic Based Noisy Reverberant Speech Enhancement in Time Domain

G. Zucatelli, R. Coelho

This paper introduces a single-step time-domain method named HnH-NRSE, which is designed to improve speech intelligibility and quality simultaneously under noisy-reverberant conditions. In this solution, the harmonic and non-harmonic elements of speech are separated by applying zero-crossing and energy criteria. An objective evaluation of the non-stationarity degree is further used to define an adaptive gain that treats masking components. No prior knowledge of speech statistics or room information is required for this technique. Additionally, two combined solutions, IRMO and IRMN, are proposed as composite methods for improving noisy-reverberant speech signals. The proposed and baseline methods are evaluated using two intelligibility measures and three quality measures for objective prediction. The results show that the proposed scheme leads to higher intelligibility and quality improvement than competing methods in most scenarios. Additionally, a perceptual intelligibility listening test is performed, which corroborates these results. Furthermore, the proposed HnH-NRSE solution attains SRMR quality scores similar to those of the composite IRMO and IRMN techniques.
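
As an illustration of the zero-crossing and energy criteria mentioned in this abstract, the following minimal sketch labels fixed-length frames by their zero-crossing rate and mean energy. It is not the HnH-NRSE algorithm: the function names, the frame length, and both thresholds (`zcr_max`, `energy_min`) are hypothetical choices made only for this example.

```python
import math


def frame_features(signal, frame_len):
    """Return (zero_crossing_rate, mean_energy) for each non-overlapping frame."""
    features = []
    for start in range(0, len(signal) - frame_len + 1, frame_len):
        frame = signal[start:start + frame_len]
        # Count sign changes between consecutive samples.
        crossings = sum(
            1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
        )
        zcr = crossings / (frame_len - 1)
        energy = sum(x * x for x in frame) / frame_len
        features.append((zcr, energy))
    return features


def label_frames(signal, frame_len=160, zcr_max=0.1, energy_min=0.01):
    """Label a frame 'harmonic' when its ZCR is low and its energy is high.

    The thresholds are hypothetical; the abstract does not state the criteria.
    """
    return [
        "harmonic" if zcr <= zcr_max and energy >= energy_min else "non-harmonic"
        for zcr, energy in frame_features(signal, frame_len)
    ]


if __name__ == "__main__":
    fs = 8000  # assumed sampling rate in Hz
    # A 200 Hz tone stands in for a voiced (harmonic) frame.
    tone = [0.5 * math.sin(2 * math.pi * 200 * n / fs) for n in range(160)]
    # A low-amplitude alternating sequence stands in for a noise-like frame.
    hiss = [0.01 * (1 if n % 2 == 0 else -1) for n in range(160)]
    print(label_frames(tone + hiss))
```

Low zero-crossing rate combined with high energy is a common heuristic for voiced speech; a practical system would typically use overlapping frames and thresholds adapted to the signal.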

AS · Aug 20, 2020
Blind Mask to Improve Intelligibility of Non-Stationary Noisy Speech

F. Farias, R. Coelho

This letter proposes a novel blind acoustic mask (BAM) designed to adaptively detect noise components and preserve target speech segments in the time domain. A robust standard deviation estimator is applied to the non-stationary noisy speech to identify noise masking elements. The main contribution of the proposed solution is the use of these noise statistics to derive adaptive information that defines and selects samples with a lower noise proportion, thus preserving speech intelligibility. Additionally, this non-ideal mask requires no prior information about the statistics of the target speech or noise signals. The BAM and three competing methods, Ideal Binary Mask (IBM), Target Binary Mask (TBM), and Non-stationary Noise Estimation for Speech Enhancement (NNESE), are evaluated on speech signals corrupted by three non-stationary acoustic noises at six values of signal-to-noise ratio (SNR). Results demonstrate that the BAM technique achieves intelligibility gains comparable to ideal masks while maintaining good speech quality.
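
To make the idea of a mask driven by a robust standard deviation estimate concrete, here is a minimal sketch. It is not the BAM method: the abstract does not name its estimator or selection rule, so this example assumes the median absolute deviation (MAD) scaled by 1.4826 as the robust estimator and a hypothetical rule that keeps samples whose magnitude exceeds `k` times that estimate.

```python
import statistics


def robust_std(samples):
    """Estimate the standard deviation via the median absolute deviation.

    The factor 1.4826 makes the MAD consistent with the standard deviation
    for Gaussian data. Using MAD here is an assumption of this sketch.
    """
    med = statistics.median(samples)
    mad = statistics.median([abs(x - med) for x in samples])
    return 1.4826 * mad


def binary_mask(samples, k=3.0):
    """Return 1 for samples whose magnitude exceeds k * robust_std, else 0.

    The threshold rule and the value of k are hypothetical.
    """
    threshold = k * robust_std(samples)
    return [1 if abs(x) > threshold else 0 for x in samples]


def apply_mask(samples, mask):
    """Zero out the samples rejected by the mask."""
    return [x * m for x, m in zip(samples, mask)]


if __name__ == "__main__":
    # Small-amplitude values stand in for noise; 0.9 and -0.8 for speech peaks.
    noisy = [0.01, -0.02, 0.015, -0.01, 0.9, -0.8, 0.02, -0.015]
    mask = binary_mask(noisy)
    print(mask)
    print(apply_mask(noisy, mask))
```

Because the median is insensitive to a few large values, the two high-amplitude samples barely affect the noise-level estimate, which is why a robust estimator suits this kind of selection better than the ordinary sample standard deviation.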

AS · Oct 7, 2019
Adaptive Reverberation Absorption using Non-stationary Masking Components Detection for Intelligibility Improvement

G. Zucatelli, R. Coelho

This letter proposes a new time-domain absorption approach designed to reduce masking components of speech signals under noisy-reverberant conditions. In this method, the non-stationarity of corrupted signal segments is used to detect masking distortions based on a defined threshold. The non-stationarity is objectively measured and is also adopted to determine the absorption procedure. Additionally, no prior knowledge of speech statistics or room information is required for this technique. Three intelligibility measures (ESII, ASIIST, SRMRnorm) and a perceptual listening test are used for evaluation. The experimental results show that the proposed scheme leads to a higher intelligibility improvement than competing methods.

AS · Oct 7, 2019
Impulsive Noise Detection for Intelligibility and Quality Improvement of Speech Enhancement Methods Applied in Time-Domain

C. Medina, R. Coelho

This letter introduces a novel speech enhancement method in the Hilbert-Huang Transform domain to mitigate the effects of acoustic impulsive noises. The estimation and selection of noise components are based on the impulsiveness index of the decomposition modes. Speech enhancement experiments are conducted with five acoustic noises that differ in impulsiveness index and non-stationarity degree, under various signal-to-noise ratios. Three speech enhancement algorithms, covering the spectral and time domains, are adopted as baselines in the evaluation. The proposed solution achieves the best results in terms of objective quality measures and speech intelligibility rates similar to those of the competing methods.

AS · Oct 7, 2019
Effective Acoustic Energy Sensing Exploitation for Target Sources Localization in Urban Acoustic Scenes

M. Alves, R. Coelho, E. Dranka

This letter proposes a new approach to improve the accuracy of energy-based source localization methods in urban acoustic scenes. The proposed acoustic energy sensing flow estimation (ESFE) uses the non-stationarity degree of the sensor signals to determine the area with the highest energy concentration in the scene. The ESFE is applied to different acoustic scenes and yields improved source localization accuracy with reduced computational complexity. The experimental results show that the proposed scheme leads to a significant improvement in source localization accuracy.