CV AIJan 13, 2025

UNetVL: Enhancing 3D Medical Image Segmentation with Chebyshev KAN Powered Vision-LSTM

Xuhui Guo, Tanmoy Dam, Rohan Dhamdhere, Gourav Modanwal, Anant Madabhushi

Georgia Tech

arXiv:2501.07017v26.22 citationsh-index: 7Has CodeISBI

Originality Incremental advance

AI Analysis

This addresses a key bottleneck in medical imaging for researchers and practitioners, offering incremental improvements over existing methods.

The paper tackles the challenge of balancing long-range dependency acquisition with computational efficiency in 3D medical image segmentation by proposing UNETVL, which improves mean Dice scores by 7.3% on ACDC and 15.6% on AMOS compared to UNETR.

3D medical image segmentation has progressed considerably due to Convolutional Neural Networks (CNNs) and Vision Transformers (ViTs), yet these methods struggle to balance long-range dependency acquisition with computational efficiency. To address this challenge, we propose UNETVL (U-Net Vision-LSTM), a novel architecture that leverages recent advancements in temporal information processing. UNETVL incorporates Vision-LSTM (ViL) for improved scalability and memory functions, alongside an efficient Chebyshev Kolmogorov-Arnold Networks (KAN) to handle complex and long-range dependency patterns more effectively. We validated our method on the ACDC and AMOS2022 (post challenge Task 2) benchmark datasets, showing a significant improvement in mean Dice score compared to recent state-of-the-art approaches, especially over its predecessor, UNETR, with increases of 7.3% on ACDC and 15.6% on AMOS, respectively. Extensive ablation studies were conducted to demonstrate the impact of each component in UNETVL, providing a comprehensive understanding of its architecture. Our code is available at https://github.com/tgrex6/UNETVL, facilitating further research and applications in this domain.

View on arXiv PDF Code

Similar