CV · SD · AS · Jan 8, 2025

Open-Source Manually Annotated Vocal Tract Database for Automatic Segmentation from 3D MRI Using Deep Learning: Benchmarking 2D and 3D Convolutional and Transformer Networks

arXiv:2501.06229v2 · 5 citations · h-index: 21 · J Voice
AI Analysis

This work addresses the need for efficient and accurate vocal tract segmentation for voice and speech applications, but appears incremental as it benchmarks existing methods on a new dataset.

The study addresses the time-intensive, error-prone manual segmentation of the vocal tract from 3D MRI data by benchmarking 2D and 3D convolutional and transformer networks for automatic segmentation.

Accurate segmentation of the vocal tract from magnetic resonance imaging (MRI) data is essential for various voice and speech applications. Manual segmentation is time intensive and susceptible to errors. This study aimed to evaluate the efficacy of deep learning algorithms for automatic vocal tract segmentation from 3D MRI.

