CV · Apr 30

Echo-α: Large Agentic Multimodal Reasoning Model for Ultrasound Interpretation

arXiv:2604.28011 · 98.6 · Has Code
Predicted impact top 3% in CV · last 90 days · Originality: Highly original
AI Analysis

Provides a practical, interpretable, and transferable AI system for ultrasound interpretation that combines specialized detectors with multimodal reasoning.

Echo-α unifies lesion localization and clinical reasoning for ultrasound interpretation using an invoke-and-reason framework, achieving 56.73%/43.78% F1@0.5 for grounding and 74.90%/49.20% overall accuracy for diagnosis on cross-center renal/breast ultrasound benchmarks.

Ultrasound interpretation requires both precise lesion localization and holistic clinical reasoning, yet existing methods typically excel at only one of these capabilities: specialized detectors offer strong localization but limited reasoning, whereas multimodal large language models (MLLMs) provide flexible reasoning but weak grounding in specialized medical domains. We present Echo-α, an agentic multimodal reasoning model for ultrasound interpretation that unifies these strengths within an invoke-and-reason framework. Echo-α is trained to coordinate organ-specific detector outputs, integrate them with global visual context, and convert the resulting evidence into grounded diagnostic decisions beyond detector-only inference. This behavior is established through a nine-task supervised curriculum and then refined by sequential reinforcement learning under different reward trade-offs, yielding Echo-α-Grounding for lesion anchoring and Echo-α-Diagnosis for final diagnosis. On multi-center renal and breast ultrasound benchmarks, Echo-α outperforms competitive baselines on both grounding and diagnosis. In particular, on cross-center test sets, Echo-α-Grounding attains 56.73%/43.78% F1@0.5 and Echo-α-Diagnosis reaches 74.90%/49.20% overall accuracy on renal/breast ultrasound. These results suggest that agentic multimodal reasoning can turn specialized detectors into verifiable clinical evidence, offering a practical route toward ultrasound AI systems that are more accurate, interpretable, and transferable. The repository is at https://github.com/MiliLab/Echo-Alpha.
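
The grounding numbers above are reported as F1@0.5. The abstract does not spell out the matching protocol, so the following is only a minimal sketch of one common reading: a predicted box counts as a true positive when its IoU with a not-yet-matched ground-truth box is at least 0.5, and F1 is computed from the resulting counts. The function names `iou` and `f1_at_iou`, the `(x1, y1, x2, y2)` box format, and the greedy one-to-one matching are all assumptions for illustration, not the authors' evaluation code.

```python
from typing import List, Tuple

# Assumed box format: (x1, y1, x2, y2) in pixel coordinates.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def f1_at_iou(preds: List[Box], gts: List[Box], thr: float = 0.5) -> float:
    """F1 with greedy one-to-one matching at an IoU threshold (assumed protocol)."""
    matched = set()  # indices of ground-truth boxes already claimed
    tp = 0
    for p in preds:
        best_j, best_iou = -1, 0.0
        for j, g in enumerate(gts):
            if j in matched:
                continue
            v = iou(p, g)
            if v > best_iou:
                best_j, best_iou = j, v
        if best_j >= 0 and best_iou >= thr:
            matched.add(best_j)
            tp += 1
    fp = len(preds) - tp  # predictions with no matching lesion
    fn = len(gts) - tp    # lesions that no prediction matched
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


if __name__ == "__main__":
    preds = [(0, 0, 10, 10), (20, 20, 30, 30)]
    gts = [(0, 0, 10, 10), (100, 100, 110, 110)]
    print(f1_at_iou(preds, gts))
```

In this toy case one prediction matches a lesion, one is a false positive, and one lesion is missed. A benchmark would typically aggregate the counts over all images before computing F1; whether Echo-α does so is not stated in the abstract.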
