15.3CVMar 10
FetalAgents: A Multi-Agent System for Fetal Ultrasound Image and Video AnalysisXiaotian Hu, Junwei Huang, Mingxuan Liu et al.
Fetal ultrasound (US) is the primary imaging modality for prenatal screening, yet its interpretation relies heavily on the expertise of the clinician. Despite advances in deep learning and foundation models, existing automated tools for fetal US analysis struggle to balance task-specific accuracy with the whole-process versatility required to support end-to-end clinical workflows. To address these limitations, we propose FetalAgents, the first multi-agent system for comprehensive fetal US analysis. Through a lightweight, agentic coordination framework, FetalAgents dynamically orchestrates specialized vision experts to maximize performance across diagnosis, measurement, and segmentation. Furthermore, FetalAgents advances beyond static image analysis by supporting end-to-end video stream summarization, where keyframes are automatically identified across multiple anatomical planes, analyzed by coordinated experts, and synthesized with patient metadata into a structured clinical report. Extensive multi-center external evaluations across eight clinical tasks demonstrate that FetalAgents consistently delivers the most robust and accurate performance when compared against specialized models and multimodal large language models (MLLMs), ultimately providing an auditable, workflow-aligned solution for fetal ultrasound analysis and reporting.
LGAug 13, 2025Code
GraphTreeGen: Subtree-Centric Approach to Efficient and Supervised Graph GenerationYitong Luo, Islem Rekik
Brain connectomes, representing neural connectivity as graphs, are crucial for understanding brain organization but costly and time-consuming to acquire, motivating generative approaches. Recent advances in graph generative modeling offer a data-driven alternative, enabling synthetic connectome generation and reducing dependence on large neuroimaging datasets. However, current models face key limitations: (i) compressing the whole graph into a single latent code (e.g., VGAEs) blurs fine-grained local motifs; (ii) relying on rich node attributes rarely available in connectomes reduces reconstruction quality; (iii) edge-centric models emphasize topology but overlook accurate edge-weight prediction, harming quantitative fidelity; and (iv) computationally expensive designs (e.g., edge-conditioned convolutions) impose high memory demands, limiting scalability. We propose GraphTreeGen (GTG), a subtree-centric generative framework for efficient, accurate connectome synthesis. GTG decomposes each connectome into entropy-guided k-hop trees capturing informative local structure, encoded by a shared GCN. A bipartite message-passing layer fuses subtree embeddings with global node features, while a dual-branch decoder jointly predicts edge existence and weights to reconstruct the adjacency matrix. GTG outperforms state-of-the-art baselines in self-supervised tasks and remains competitive in supervised settings, delivering higher structural fidelity and more precise weights with far less memory. Its modular design enables extensions to connectome super-resolution and cross-modality synthesis. Code: https://github.com/basiralab/GTG/
AIMar 6, 2025
ValuePilot: A Two-Phase Framework for Value-Driven Decision-MakingYitong Luo, Hou Hei Lam, Ziang Chen et al.
Despite recent advances in artificial intelligence (AI), it poses challenges to ensure personalized decision-making in tasks that are not considered in training datasets. To address this issue, we propose ValuePilot, a two-phase value-driven decision-making framework comprising a dataset generation toolkit DGT and a decision-making module DMM trained on the generated data. DGT is capable of generating scenarios based on value dimensions and closely mirroring real-world tasks, with automated filtering techniques and human curation to ensure the validity of the dataset. In the generated dataset, DMM learns to recognize the inherent values of scenarios, computes action feasibility and navigates the trade-offs between multiple value dimensions to make personalized decisions. Extensive experiments demonstrate that, given human value preferences, our DMM most closely aligns with human decisions, outperforming Claude-3.5-Sonnet, Gemini-2-flash, Llama-3.1-405b and GPT-4o. This research is a preliminary exploration of value-driven decision-making. We hope it will stimulate interest in value-driven decision-making and personalized decision-making within the community.