CV · AI · Jun 5

When is 3D Worth It? A Resource-Performance Frontier for CNNs and Transformers in Lung CT

Md Enamul Hoq, Sharafat Hossain, Imraul Emmaka, Linda Larson-Prior, Lawrence Tarbox, Jonathan Bona, Donald Johann Jr. and Fred Prior

arXiv:2606.06950 · 11.4

Originality: Synthesis-oriented

AI Analysis

For practitioners in volumetric medical imaging, this work provides a resource-performance frontier and failure taxonomy, but the results are preliminary with wide confidence intervals and no definitive superiority claims.

This paper compares input dimensionalities (2D, 2.5D, 3D) for CNNs and Vision Transformers in lung CT classification under a fixed training protocol. In its comparison, the 2.5D CNN offers the most favorable balance of discrimination and stability (ROC-AUC 0.682, 95% CI [0.546, 0.799]), while 3D CNNs show threshold instability and transformers produce degenerate predictions such as all-positive outputs.

Three-dimensional models are widely assumed preferable for volumetric medical imaging, yet their practical value depends on whether performance gains justify added computational cost and complexity. Rather than proposing a new architecture, we study how input dimensionality (2D, 2.5D, 3D) affects model behavior across convolutional neural networks (CNNs) and Vision Transformers (ViTs) under a fixed training protocol. Using a leakage-free NLST cohort (n = 1,977) with supporting LIDC-IDRI data, we find that the 2.5D CNN offers the most favorable discrimination-stability trade-off in our comparison (ROC-AUC 0.682, 95% CI [0.546, 0.799]) with a stable operating point. In contrast, 3D CNNs show threshold instability, and transformers exhibit degenerate predictions, such as all-positive predictions. Confidence intervals are wide and overlapping, so we present these results as a controlled resource-performance frontier and a failure-mode taxonomy rather than as definitive superiority claims. For class-imbalanced lung cancer screening classification, 2D and 2.5D inputs provide a more reliable trade-off between performance, stability, and computational efficiency than full 3D representations.
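To make the three input regimes concrete, here is a minimal NumPy sketch of how 2D, 2.5D, and 3D inputs might be cut from one CT volume, plus a simple check for the degenerate (single-class) predictions the abstract mentions. The abstract does not specify slice counts, sub-volume depth, or preprocessing, so the names `make_input` and `is_degenerate` and the parameters `k` and `depth` are hypothetical choices for illustration, not the authors' pipeline.

```python
import numpy as np

def make_input(volume, center, mode, k=1, depth=16):
    """Build a model input from a CT volume of shape (D, H, W).

    center: index of the axial slice of interest.
    k, depth: illustrative values, assumed here rather than taken from the paper.
    """
    d = volume.shape[0]
    if mode == "2d":
        # One axial slice as a single channel: (1, H, W)
        return volume[center][None]
    if mode == "2.5d":
        # Center slice plus k neighbors on each side, stacked as channels:
        # (2k + 1, H, W). Indices are clamped at the volume boundary.
        idx = np.clip(np.arange(center - k, center + k + 1), 0, d - 1)
        return volume[idx]
    if mode == "3d":
        # Contiguous sub-volume with a single channel: (1, depth, H, W).
        # If the volume has fewer than `depth` slices, all slices are used.
        start = int(np.clip(center - depth // 2, 0, max(d - depth, 0)))
        return volume[start:start + depth][None]
    raise ValueError(f"unknown mode: {mode}")

def is_degenerate(pred_labels):
    """True when every predicted label is the same (e.g. all-positive)."""
    return np.unique(np.asarray(pred_labels)).size <= 1

# Synthetic volume in which every voxel stores its own slice index.
volume = np.arange(32, dtype=np.float32)[:, None, None] * np.ones((1, 8, 8), dtype=np.float32)
x2d = make_input(volume, 10, "2d")
x25d = make_input(volume, 10, "2.5d")
x3d = make_input(volume, 10, "3d")
print(x2d.shape, x25d.shape, x3d.shape)
```

In this sketch the 2.5D input costs the same spatial compute as a 2D input in a standard 2D CNN apart from the first layer, while the 3D input requires volumetric convolutions throughout, which is the kind of resource gap the paper's frontier is meant to weigh against performance.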
