SD AI AS
Aug 2, 2025

GeHirNet: A Gender-Aware Hierarchical Model for Voice Pathology Classification

Fan Wu, Kaicheng Zhao, Elgar Fleisch, Filipe Barata

arXiv:2508.01172v1
4.0
h-index: 55

Originality: Incremental advance

AI Analysis

This work advances voice pathology classification for medical diagnostics while reducing gender bias, though it appears incremental as it builds on existing ResNet-50 and hierarchical modeling approaches.

The paper tackles the problem of voice pathology classification by addressing gender-related acoustic variations and data scarcity, achieving state-of-the-art performance with 97.63% accuracy and 95.25% MCC, a 5% MCC improvement over baselines.
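For readers unfamiliar with the MCC figure quoted above, the following is a minimal sketch of the standard multiclass Matthews correlation coefficient, computed from label counts. It is not taken from the paper: the function name `multiclass_mcc` and the example labels are illustrative assumptions, and the paper presumably reports the value on a percentage scale (0.9525 written as 95.25%).

```python
import math
from collections import Counter
from typing import Hashable, Sequence


def multiclass_mcc(y_true: Sequence[Hashable], y_pred: Sequence[Hashable]) -> float:
    """Multiclass Matthews correlation coefficient (hypothetical helper).

    Uses the count-based form:
        (c*s - sum_k p_k*t_k) / sqrt((s^2 - sum_k p_k^2) * (s^2 - sum_k t_k^2))
    where s is the sample count, c the number of correct predictions,
    t_k the number of samples whose true label is k, and p_k the number
    of samples predicted as k.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    s = len(y_true)
    c = sum(t == p for t, p in zip(y_true, y_pred))
    t_counts = Counter(y_true)
    p_counts = Counter(y_pred)

    labels = set(t_counts) | set(p_counts)
    cross = sum(t_counts[k] * p_counts[k] for k in labels)

    numerator = c * s - cross
    denominator = math.sqrt(
        (s * s - sum(v * v for v in p_counts.values()))
        * (s * s - sum(v * v for v in t_counts.values()))
    )
    # By convention, return 0.0 when the score is undefined
    # (for example, when every sample has the same label).
    return numerator / denominator if denominator else 0.0


# Illustrative labels only; these are not data from the paper.
example_true = ["healthy", "healthy", "nodule", "nodule"]
example_pred = ["healthy", "nodule", "nodule", "nodule"]
example_score = multiclass_mcc(example_true, example_pred)
```

Unlike plain accuracy, MCC takes the full confusion matrix into account, which is one reason it is often preferred for imbalanced class distributions such as those described here.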

AI-based voice analysis shows promise for disease diagnostics, but existing classifiers often fail to accurately identify specific pathologies because of gender-related acoustic variations and the scarcity of data for rare diseases. We propose a novel two-stage framework that first identifies gender-specific pathological patterns using ResNet-50 on Mel spectrograms, then performs gender-conditioned disease classification. We address class imbalance through multi-scale resampling and time warping augmentation. Evaluated on a merged dataset from four public repositories, our two-stage architecture with time warping achieves state-of-the-art performance (97.63% accuracy, 95.25% MCC), with a 5% MCC improvement over the single-stage baseline. This work advances voice pathology classification while reducing gender bias through hierarchical modeling of vocal characteristics.
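The two-stage routing described in the abstract might be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: it assumes stage 1 returns a gender label together with a healthy/pathological flag, and that stage 2 is one disease classifier per gender. The names `classify_two_stage`, `Stage1`, and `Stage2` are hypothetical, and the classifiers are plain callables standing in for trained ResNet-50 models.

```python
from typing import Callable, Dict, Sequence, Tuple

# A Mel spectrogram as nested sequences (mel bins x time frames).
Spectrogram = Sequence[Sequence[float]]

# Assumed stage-1 interface: returns (gender, is_pathological).
Stage1 = Callable[[Spectrogram], Tuple[str, bool]]
# Assumed stage-2 interface: returns a disease label.
Stage2 = Callable[[Spectrogram], str]


def classify_two_stage(
    spec: Spectrogram,
    stage1: Stage1,
    stage2_by_gender: Dict[str, Stage2],
) -> Tuple[str, str]:
    """Route a spectrogram through a hypothetical two-stage hierarchy.

    Stage 1 decides gender and whether the voice looks pathological.
    Only pathological samples reach stage 2, where the classifier
    matching the predicted gender assigns the disease label.
    """
    gender, is_pathological = stage1(spec)
    if not is_pathological:
        return gender, "healthy"
    if gender not in stage2_by_gender:
        raise KeyError(f"no stage-2 classifier registered for {gender!r}")
    return gender, stage2_by_gender[gender](spec)
```

One consequence of this design is that each stage-2 model only ever sees samples of one gender, so it does not have to separate gender-related acoustic differences from disease-related ones.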
