Sarah Verhulst

AS
4 papers
59 citations
Novelty: 29%
AI Score: 18

4 Papers

AS · Jul 5, 2021
A comparative study of eight human auditory models of monaural processing

Alejandro Osses Vecchi, Léo Varnet, Laurel H. Carney et al.

A number of auditory models have been developed using diverging approaches, either physiological or perceptual, but they share comparable stages of signal processing, as they are inspired by the same constitutive parts of the auditory system. We compare eight monaural models that are openly accessible in the Auditory Modelling Toolbox. We discuss the considerations required to make the model outputs comparable to each other, as well as the results for the following model processing stages or their equivalents: Outer and middle ear, cochlear filter bank, inner hair cell, auditory nerve synapse, cochlear nucleus, and inferior colliculus. The discussion includes a list of recommendations for future applications of auditory models.

AS · Apr 30, 2020
A convolutional neural-network model of human cochlear mechanics and filter tuning for real-time applications

Deepak Baby, Arthur Van Den Broucke, Sarah Verhulst

Auditory models are commonly used as feature extractors for automatic speech-recognition systems or as front-ends for robotics, machine-hearing and hearing-aid applications. Although auditory models can capture the biophysical and nonlinear properties of human hearing in great detail, these biophysical models are computationally expensive and cannot be used in real-time applications. We present a hybrid approach where convolutional neural networks are combined with computational neuroscience to yield a real-time end-to-end model for human cochlear mechanics, including level-dependent filter tuning (CoNNear). The CoNNear model was trained on acoustic speech material and its performance and applicability were evaluated using (unseen) sound stimuli commonly employed in cochlear mechanics research. The CoNNear model accurately simulates human cochlear frequency selectivity and its dependence on sound intensity, an essential quality for robust speech intelligibility at negative speech-to-background-noise ratios. The CoNNear architecture is based on parallel and differentiable computations and has the power to achieve real-time human performance. These unique CoNNear features will enable the next generation of human-like machine-hearing applications.

AS · Dec 21, 2019
Calibration and reference simulations for the auditory periphery model of Verhulst et al. 2018 version 1.2

Alejandro Osses Vecchi, Sarah Verhulst

This document describes a comprehensive procedure for calibrating the biophysical model published by Verhulst et al. (2018) against reference auditory brainstem responses (ABRs). Additionally, the filter design used in two of the model stages, cochlear nucleus (CN) and inferior colliculus (IC), is described in detail. These descriptions are valid for a new release of the Verhulst et al. model, version 1.2, as well as for previous versions of the model (version 1.1 or earlier). The differences between the model versions are explicitly mentioned and simulated responses to basic auditory stimuli are shown for model versions 1.1 and 1.2. In short, version 1.2 of the model includes a new implementation of the CN and IC stages (Stages 5 and 6). All previous model stages (Stages 1-4: outer and middle ear, transmission-line cochlear filter bank, inner hair cell model, and auditory nerve model) remained unchanged. In the new release (model version 1.2), in addition to the updated CN and IC stages, we employed a different calibration procedure to match human reference ABR amplitudes of waves I, III, and V more faithfully. This release note shows the implications of these model adjustments on the simulations presented in the original 2018 model paper. For this purpose, results from two model versions are reported: (1) New model release (version 1.2), labelled as model v1.2; and (2) Previous model release as used by Verhulst et al., labelled as model v1.1. The main difference between IC model stages relates to the degree of IC inhibition that was applied, with more inhibition in v1.2 than in v1.1. The time domain simulations presented in this document show that this change in inhibition strength does not drastically change the results presented in the original paper. However, v1.2 more correctly captures the physiologically derived CN and IC inhibition/excitation strengths.

SD · Jun 1, 2018
Machines hear better when they have ears

Deepak Baby, Sarah Verhulst

Deep-neural-network (DNN) based noise suppression systems yield significant improvements over conventional approaches such as spectral subtraction and non-negative matrix factorization, but do not generalize well to noise conditions they were not trained for. In comparison to DNNs, humans show remarkable noise suppression capabilities that yield successful speech intelligibility under various adverse listening conditions and negative signal-to-noise ratios (SNRs). Motivated by the excellent human performance, this paper explores whether numerical models that simulate human cochlear signal processing can be combined with DNNs to improve the robustness of DNN based noise suppression systems. Five cochlear models were coupled to fully-connected and recurrent NN-based noise suppression systems and were trained and evaluated for a variety of noise conditions using objective metrics: perceptual speech quality (PESQ), segmental SNR and cepstral distance. The simulations show that biophysically-inspired cochlear models improve the generalizability of DNN-based noise suppression systems for unseen noise and negative SNRs. This approach thus leads to robust noise suppression systems that are less sensitive to the noise type and noise level. Because cochlear models capture the intrinsic nonlinearities and dynamics of peripheral auditory processing, it is shown here that accounting for their deterministic signal processing improves machine hearing and avoids overtraining of multi-layer DNNs. We hence conclude that machines hear better when realistic cochlear models are used at the input of DNNs.
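The abstract above names segmental SNR as one of its objective metrics. As a rough illustration of what that metric measures, the following is a minimal sketch of one common formulation: per-frame SNR in dB, clamped to a fixed range and averaged over frames. The function name `segmental_snr`, the frame length of 256 samples, the non-overlapping framing, and the clamp bounds of -10 dB and 35 dB are assumptions made for this example; the abstract does not state which variant the paper used.

```python
import math

def segmental_snr(clean, enhanced, frame_len=256, min_db=-10.0, max_db=35.0):
    """Mean of per-frame SNRs in dB, each clamped to [min_db, max_db].

    Illustrative only: frame length, framing, and clamp bounds are
    assumed conventions, not values taken from the paper.
    """
    if len(clean) != len(enhanced):
        raise ValueError("signals must have the same length")
    eps = 1e-12  # guards against division by zero and log10(0)
    frame_snrs = []
    # Non-overlapping frames; a trailing partial frame is dropped.
    for start in range(0, len(clean) - frame_len + 1, frame_len):
        c = clean[start:start + frame_len]
        e = enhanced[start:start + frame_len]
        signal_energy = sum(x * x for x in c)
        # The residual (clean minus enhanced) is treated as noise.
        noise_energy = sum((x - y) ** 2 for x, y in zip(c, e))
        snr_db = 10.0 * math.log10((signal_energy + eps) / (noise_energy + eps))
        frame_snrs.append(min(max(snr_db, min_db), max_db))
    if not frame_snrs:
        raise ValueError("signal is shorter than one frame")
    return sum(frame_snrs) / len(frame_snrs)
```

Clamping each frame keeps silent or near-perfect frames from dominating the average, which is the usual reason this metric is preferred over a single whole-signal SNR.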