Dan Barry

1.2AS Nov 17, 2025

Systematic Evaluation of Time-Frequency Features for Binaural Sound Source Localization

Davoud Shariat Panah, Alessandro Ragano, Dan Barry et al.

This study presents a systematic evaluation of time-frequency feature design for binaural sound source localization (SSL), focusing on how feature selection influences model performance across diverse conditions. We investigate the performance of a convolutional neural network (CNN) model using various combinations of amplitude-based features (magnitude spectrogram and interaural level difference, ILD) and phase-based features (phase spectrogram and interaural phase difference, IPD). Evaluations on in-domain and out-of-domain data with mismatched head-related transfer functions (HRTFs) reveal that carefully chosen feature combinations often improve performance more than increases in model complexity do. While two-feature sets such as ILD + IPD are sufficient for in-domain SSL, generalization to diverse content requires richer inputs that combine channel spectrograms with both ILD and IPD. Using the optimal feature sets, our low-complexity CNN model achieves competitive performance. Our findings underscore the importance of feature design in binaural SSL and provide practical guidance for both domain-specific and general-purpose localization.
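The four feature types named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' pipeline: the function names `stft` and `binaural_features`, the Hann window, the frame size of 512 samples, the hop of 256 samples, and the `eps` stabilizer are all assumptions chosen for illustration. ILD is taken here as the log-magnitude ratio between channels in dB, and IPD as the wrapped phase difference.

```python
import numpy as np


def stft(x, n_fft=512, hop=256):
    """Hann-windowed STFT; returns a complex array of shape (frames, n_fft // 2 + 1).

    Window type, n_fft and hop are illustrative assumptions, not values from the paper.
    """
    window = np.hanning(n_fft)
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack(
        [x[i * hop:i * hop + n_fft] * window for i in range(n_frames)]
    )
    return np.fft.rfft(frames, axis=1)


def binaural_features(left, right, eps=1e-8):
    """Compute per-channel spectrograms plus ILD and IPD for a binaural pair.

    Hypothetical helper: eps only guards against division by zero and log(0).
    """
    spec_l, spec_r = stft(left), stft(right)
    mag_l, mag_r = np.abs(spec_l), np.abs(spec_r)
    return {
        "mag_left": mag_l,
        "mag_right": mag_r,
        "phase_left": np.angle(spec_l),
        "phase_right": np.angle(spec_r),
        # Interaural level difference in dB (left relative to right).
        "ild": 20.0 * np.log10((mag_l + eps) / (mag_r + eps)),
        # Interaural phase difference, wrapped to (-pi, pi].
        "ipd": np.angle(spec_l * np.conj(spec_r)),
    }


# Usage: a right channel of noise and a left channel twice as loud.
rng = np.random.default_rng(0)
right = rng.standard_normal(4096)
left = 2.0 * right
features = binaural_features(left, right)
```

A feature set such as ILD + IPD would then be formed by stacking the chosen arrays along a channel axis before passing them to a CNN; the stacking order and any normalization are likewise left open here.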

1.2AS Apr 14, 2021

Audio-based cough counting using independent subspace analysis

Paul Leamy, Ted Burke, Dan Barry et al.

This paper presents an algorithm that detects characteristic cough events in audio recordings, significantly reducing the time required for manual counting. Using time-frequency representations and independent subspace analysis (ISA), the algorithm automatically detects sound events that exhibit characteristics of coughs and produces a summary of the detected events. Using a dataset created from publicly available audio recordings, the algorithm was tested on a variety of synthesized audio scenarios representative of those likely to be encountered by subjects undergoing an ambulatory cough recording, achieving a true positive rate of 76% with an average of 2.85 false positives per minute.
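The two reported figures, true positive rate and false positives per minute, can be computed from event timestamps roughly as follows. The abstract does not state how detections are matched to annotated coughs, so the matching rule below (a detection counts as a hit if it lies within a tolerance of an unmatched annotated event) is an assumption, as are the function name `cough_detection_metrics` and the default tolerance of 0.5 s.

```python
def cough_detection_metrics(reference, detected, duration_s, tol_s=0.5):
    """Return (true positive rate, false positives per minute).

    reference, detected: event times in seconds.
    tol_s: assumed matching tolerance; the paper's criterion is not given here.
    """
    matched = set()  # indices of detections already paired with a reference event
    true_pos = 0
    for ref in reference:
        best = None
        for j, det in enumerate(detected):
            if j in matched or abs(det - ref) > tol_s:
                continue
            # Keep the closest unmatched detection within tolerance.
            if best is None or abs(det - ref) < abs(detected[best] - ref):
                best = j
        if best is not None:
            matched.add(best)
            true_pos += 1
    false_pos = len(detected) - len(matched)
    tpr = true_pos / len(reference) if reference else 0.0
    fp_per_min = false_pos / (duration_s / 60.0)
    return tpr, fp_per_min


# Usage with made-up timestamps for a two-minute recording.
tpr, fp_per_min = cough_detection_metrics(
    reference=[1.0, 5.0, 9.0, 20.0],
    detected=[1.1, 5.3, 12.0, 30.0, 45.0],
    duration_s=120.0,
)
```

In this made-up example, two of the four annotated events are matched and three detections are left unmatched over two minutes.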
