AS · Mar 3, 2021 · Code
Open community platform for hearing aid algorithm research: open Master Hearing Aid (openMHA)
Hendrik Kayser, Tobias Herzke, Paul Maanen et al.
The open Master Hearing Aid (openMHA) was developed and provided to the hearing aid research community as an open-source software platform to support sustainable and reproducible research towards improved and new types of assistive hearing systems that are not limited by proprietary software. The software offers a flexible framework that allows users to conduct hearing aid research using the tools and signal processing plugins provided with the software, as well as to implement their own methods. The openMHA software is independent of specific hardware and supports Linux, macOS and Windows operating systems as well as 32-bit and 64-bit ARM-based architectures such as those used in small portable integrated systems. www.openmha.org
SD · Jul 14, 2021
The Period-Modulated Harmonic Locked Loop (PM-HLL): A low-effort algorithm for rapid time-domain multi-periodicity estimation
Volker Hohmann
Many speech and music analysis and processing schemes rely on an estimate of the fundamental frequency $f_0$ of periodic signal components. Most established schemes apply rather unspecific signal models, such as sinusoidal models, to the estimation problem, which may limit time resolution and estimation accuracy. This study proposes a novel time-domain locked-loop algorithm with low computational effort and low memory footprint for $f_0$ estimation. The loop control signal is directly derived from the input time signal, using a harmonic signal model. Theoretically, this allows for a noise-robust and rapid $f_0$ estimation for periodic signals of arbitrary waveform, without requiring a prior frequency analysis. Several simulations with short signals employing different types of periodicity and with added wide-band noise were performed to demonstrate and evaluate the basic properties of the proposed algorithm. Depending on the signal-to-noise ratio (SNR), the estimator was found to converge within 3-4 signal repetitions, even at SNRs close to or below 0 dB. Furthermore, it was found to follow fundamental frequency sweeps with a delay of less than one period and to track all tones of a three-tone musical chord signal simultaneously. Quasi-periodic sounds with shifted harmonics as well as signals with stochastic periodicity were robustly tracked. Mean and standard deviation of the estimation error, i.e., the difference between true and estimated $f_0$, were at or below 1 Hz in most cases. The results suggest that the proposed algorithm may be applicable to low-delay speech and music analysis and processing.
HC · Apr 3, 2020
Comparison of a Head-Mounted Display and a Curved Screen in a Multi-Talker Audiovisual Listening Task
Gerard Llorach, Maartje M. E. Hendrikse, Giso Grimm et al.
Introduction: Virtual audiovisual technology and its methodology have yet to be established for psychoacoustic research. This study examined the effects of different audiovisual conditions on preference when listening to multi-talker conversations. The goal of the study was to explore and assess audiovisual technologies in the context of hearing research. Methods: The participants listened to audiovisual conversations between four talkers. Two displays were tested and compared: a curved screen (CS) and a head-mounted display (HMD). Using three visual conditions (audio-only, virtual characters and video recordings), three groups of participants were tested: seventeen young normal-hearing, ten older normal-hearing, and ten older hearing-impaired listeners. Results: Open interviews showed that the CS was preferred over the HMD by older normal-hearing participants and that video recordings were the preferred visual condition. Young and older hearing-impaired participants did not show a preference between the CS and the HMD. Conclusions: CSs and video recordings should be the preferred audiovisual setup of laboratories and clinics, although HMDs and virtual characters can be used for hearing research when necessary and suitable.
MED-PH · Nov 16, 2018
Influence of visual cues on head and eye movements during listening tasks in multi-talker audiovisual environments with animated characters
Maartje M. E. Hendrikse, Gerard Llorach, Giso Grimm et al.
Recent studies of hearing aid benefits indicate that head movement behavior influences performance. To systematically assess these effects, movement behavior must be measured in realistic communication conditions. For this, the use of virtual audiovisual environments with animated characters as visual stimuli has been proposed. It is unclear, however, how these animations influence the head- and eye-movement behavior of subjects. Here, two listening tasks were carried out with a group of 14 young normal-hearing subjects to investigate the influence of visual cues on head- and eye-movement behavior; on combined localization and speech intelligibility task performance; as well as on perceived speech intelligibility, perceived listening effort and the general impression of the audiovisual environments. Animated characters with different lip-syncing and gaze patterns were compared to an audio-only condition and to a video of real persons. Results show that movement behavior, task performance, and perception were all influenced by visual cues. The movement behavior of young normal-hearing listeners in animation conditions with lip-syncing was similar to that in the video condition. These results in young normal-hearing listeners are a first step towards using the animated characters to assess the influence of head movement behavior on hearing aid performance.
SD · Apr 30, 2018
A toolbox for rendering virtual acoustic environments in the context of audiology
Giso Grimm, Joanna Luberadzka, Volker Hohmann
A toolbox for the creation and rendering of dynamic virtual acoustic environments (TASCAR) that allows direct user interaction was developed for application in hearing aid research and audiology. This technical paper describes the general software structure and the time-domain simulation methods, i.e., transmission model, image source model, and render formats, used to produce virtual acoustic environments with moving objects. Implementation-specific properties are described, and the computational performance of the system was measured as a function of simulation complexity. Results show that the time-domain simulation of several hundred virtual sound sources is possible on commonly used, commercially available hardware.
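To illustrate the image source model named in the abstract above, here is a minimal Python sketch of the textbook first-order case: a source is mirrored at a planar reflector, and each path is reduced to a propagation delay and a 1/r gain. This is not TASCAR code and does not reproduce its transmission model. The function names, the speed of sound, the `min_distance` clamp, and the frequency-independent `reflection_gain` are all assumptions made for illustration.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at roughly 20 degrees C


def mirror_source(source, plane_point, plane_normal):
    """Mirror a source position at a plane (hypothetical helper)."""
    source = np.asarray(source, dtype=float)
    plane_point = np.asarray(plane_point, dtype=float)
    normal = np.asarray(plane_normal, dtype=float)
    normal = normal / np.linalg.norm(normal)
    # Signed distance from the plane to the source along the normal.
    signed_distance = np.dot(source - plane_point, normal)
    return source - 2.0 * signed_distance * normal


def delay_and_gain(source, receiver, min_distance=0.1):
    """Propagation delay in seconds and 1/r gain for one path.

    min_distance is an assumed clamp that avoids division by zero.
    """
    offset = np.asarray(receiver, dtype=float) - np.asarray(source, dtype=float)
    distance = np.linalg.norm(offset)
    return distance / SPEED_OF_SOUND, 1.0 / max(distance, min_distance)


def first_order_paths(source, receiver, plane_point, plane_normal,
                      reflection_gain=0.8):
    """Return (delay, gain) for the direct path and one reflection.

    reflection_gain is an assumed frequency-independent reflection factor.
    """
    direct = delay_and_gain(source, receiver)
    image = mirror_source(source, plane_point, plane_normal)
    delay, gain = delay_and_gain(image, receiver)
    return direct, (delay, reflection_gain * gain)


if __name__ == "__main__":
    # Source and receiver 1 m above a floor at z = 0, 4 m apart.
    direct, reflected = first_order_paths(
        [0, 0, 1], [4, 0, 1], plane_point=[0, 0, 0], plane_normal=[0, 0, 1])
    print("direct:", direct)
    print("reflected:", reflected)
```

For moving objects, such delays and gains would have to be recomputed as positions change; this sketch evaluates a single static snapshot only.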
SD · Nov 11, 2015
Combination of binaural and harmonic masking release effects in the detection of a single component in complex tones
Martin Klein-Hennig, Mathias Dietz, Volker Hohmann
Both harmonic and binaural signal properties are relevant for auditory processing. To investigate how these cues combine in the auditory system, detection thresholds for an 800-Hz tone masked by a diotic (i.e., identical between the ears) harmonic complex tone were measured in six normal-hearing subjects. The target tone was presented either diotically or with an interaural phase difference (IPD) of 180 degrees, and in either a harmonic or a "mistuned" relationship to the diotic masker. Three different maskers were used: a resolved and an unresolved complex tone (fundamental frequency: 160 and 40 Hz) with four components below and above the target frequency, and a broadband unresolved complex tone with 12 additional components. The target IPD provided release from masking in most masker conditions, whereas mistuning led to a significant release from masking only in the diotic conditions with the resolved and the narrowband unresolved maskers. No significant effect of mistuning was found in the diotic condition with the wideband unresolved masker or in any of the dichotic conditions. An auditory model with a single analysis frequency band and different binaural processing schemes was employed to predict the data of the unresolved masker conditions. Sensitivity to modulation cues was achieved by including an auditory-motivated modulation filter in the processing pathway. The predictions of the diotic data were in line with the experimental results and literature data in the narrowband condition, but not in the broadband condition, suggesting that across-frequency processing is involved in processing modulation information. The experimental and model results in the dichotic conditions show that the binaural processor cannot exploit modulation information in binaurally unmasked conditions.
SD · Mar 24, 2015
Online Monaural Speech Enhancement Based on Periodicity Analysis and A Priori SNR Estimation
Zhangli Chen, Volker Hohmann
This paper describes an online algorithm for enhancing monaural noisy speech. Firstly, a novel phase-corrected low-delay gammatone filterbank is derived for signal subband decomposition and resynthesis; the subband signals are then analyzed frame by frame. Secondly, a novel feature named periodicity degree (PD) is proposed for detecting and estimating the fundamental period (P0) in each frame and for estimating the signal-to-noise ratio (SNR) in each frame-subband signal unit. The PD is calculated in each unit as the product of the normalized autocorrelation and the comb filter ratio, and is shown to be robust in various low-SNR conditions. Thirdly, the noise energy level in each signal unit is estimated recursively, based on the estimated SNR for units with high PD and on the noisy signal energy level for units with low PD. Then the a priori SNR is estimated using a decision-directed approach with the estimated noise level. Finally, a revised Wiener gain is calculated, smoothed, and applied to each unit; the processed units are summed across subbands and frames to form the enhanced signal. The P0 detection accuracy of the algorithm was evaluated on two corpora; compared to a recently published pitch detection algorithm, it showed comparable performance on one corpus and better performance on the other. The speech enhancement effect of the algorithm was evaluated on one corpus with two objective criteria; compared to a state-of-the-art statistical-model-based algorithm, it showed better performance in one highly non-stationary noise and comparable performance in two other noises.
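The decision-directed a priori SNR estimate and the Wiener gain mentioned in the abstract above can be sketched in their generic textbook form. This is not the authors' implementation: the gain here is the plain Wiener gain rather than the "revised" one, the noise power is taken as given instead of being estimated from the periodicity degree, and the smoothing constant `alpha`, the `gain_floor`, and the function name are assumptions for illustration.

```python
import numpy as np


def decision_directed_wiener(noisy_power, noise_power, alpha=0.98,
                             gain_floor=0.1):
    """Per-unit Wiener gains from a decision-directed a priori SNR.

    noisy_power, noise_power: arrays of shape (frames, subbands) holding
    the noisy-signal power and an externally supplied noise power estimate.
    alpha and gain_floor are assumed tuning values, not taken from the paper.
    """
    noisy_power = np.asarray(noisy_power, dtype=float)
    noise_power = np.asarray(noise_power, dtype=float)
    n_frames, n_bands = noisy_power.shape
    gains = np.empty((n_frames, n_bands))
    prev_clean_power = np.zeros(n_bands)  # clean-speech power estimate, frame t-1

    for t in range(n_frames):
        noise = np.maximum(noise_power[t], 1e-12)  # guard against division by zero
        post_snr = noisy_power[t] / noise          # a posteriori SNR
        # Decision-directed rule: mix the previous clean-speech estimate with
        # the instantaneous (maximum-likelihood) SNR estimate of this frame.
        prio_snr = (alpha * prev_clean_power / noise
                    + (1.0 - alpha) * np.maximum(post_snr - 1.0, 0.0))
        gain = np.maximum(prio_snr / (1.0 + prio_snr), gain_floor)
        gains[t] = gain
        prev_clean_power = gain ** 2 * noisy_power[t]
    return gains


if __name__ == "__main__":
    frames, bands = 10, 2
    noise = np.ones((frames, bands))
    noisy = np.ones((frames, bands))
    noisy[:, 1] = 100.0  # second subband: 20 dB above the noise level
    print(decision_directed_wiener(noisy, noise)[-1])
```

In this form, a subband that contains only noise stays at the gain floor, while a subband well above the noise level approaches unity gain within a few frames.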
SD · Mar 2, 2015
Evaluation of spatial audio reproduction schemes for application in hearing aid research
Giso Grimm, Stephan Ewert, Volker Hohmann
Loudspeaker-based spatial audio reproduction schemes are increasingly used for evaluating hearing aids in complex acoustic conditions. To further establish the feasibility of this approach, this study investigated the interaction between spatial resolution of different reproduction methods and technical and perceptual hearing aid performance measures using computer simulations. Three spatial audio reproduction methods -- discrete speakers, vector base amplitude panning and higher order ambisonics -- were compared in regular circular loudspeaker arrays with 4 to 72 channels. The influence of reproduction method and array size on performance measures of representative multi-microphone hearing aid algorithm classes with spatially distributed microphones and a representative single channel noise-reduction algorithm was analyzed. Algorithm classes differed in their way of analyzing and exploiting spatial properties of the sound field, requiring different accuracy of sound field reproduction. Performance measures included beam pattern analysis, signal-to-noise ratio analysis, perceptual localization prediction, and quality modeling. The results show performance differences and interaction effects between reproduction method and algorithm class that may be used for guidance when selecting the appropriate method and number of speakers for specific tasks in hearing aid research.
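As an illustration of one of the three reproduction methods compared above, the following Python sketch computes two-dimensional vector base amplitude panning (VBAP) gains for a regular circular loudspeaker array. It is the generic pairwise formulation, not the simulation code used in the study. The function name, the convention that speaker 0 sits at 0 degrees azimuth, and the unit-energy normalization are assumptions.

```python
import numpy as np


def vbap_2d_gains(source_az_deg, n_speakers):
    """Gains for a regular circular array using pairwise 2-D VBAP.

    Speakers are assumed to sit at azimuths k * 360 / n_speakers degrees.
    Only the two speakers adjacent to the source direction receive signal.
    """
    if n_speakers < 3:
        raise ValueError("pairwise 2-D VBAP needs at least 3 speakers")
    spacing = 360.0 / n_speakers
    az = source_az_deg % 360.0
    lower = int(az // spacing) % n_speakers  # speaker at or just below the source
    upper = (lower + 1) % n_speakers         # next speaker counter-clockwise

    pair_angles = np.radians([lower * spacing, (lower + 1) * spacing])
    # Columns of `base` are the unit vectors pointing at the two speakers.
    base = np.array([np.cos(pair_angles), np.sin(pair_angles)])
    direction = np.array([np.cos(np.radians(az)), np.sin(np.radians(az))])

    # Solve base @ g = direction, then normalize to unit energy.
    pair_gains = np.clip(np.linalg.solve(base, direction), 0.0, None)
    pair_gains /= np.linalg.norm(pair_gains)

    gains = np.zeros(n_speakers)
    gains[lower] = pair_gains[0]
    gains[upper] = pair_gains[1]
    return gains


if __name__ == "__main__":
    print(vbap_2d_gains(45.0, 4))    # source midway between speakers 0 and 1
    print(vbap_2d_gains(123.4, 72))  # dense array: two adjacent non-zero gains
```

With more channels, the active speaker pair spans a narrower angle, which is one way to picture how array size relates to spatial resolution in this method.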