CV · AI · Aug 14, 2025

Medico 2025: Visual Question Answering for Gastrointestinal Imaging

arXiv:2508.10869v1 · 1 citation · h-index: 18 · Has Code
Originality: Synthesis-oriented
AI Analysis

The challenge addresses clinicians' need for trustworthy AI in medical image analysis, though its contribution is incremental: it applies existing VQA and XAI methods to a specific domain.

The Medico 2025 challenge tackles Visual Question Answering for Gastrointestinal imaging by developing Explainable AI models to answer clinical questions from endoscopy images with interpretable justifications, using a dataset of 6,500 images and 159,549 QA pairs as a benchmark.

The Medico 2025 challenge addresses Visual Question Answering (VQA) for Gastrointestinal (GI) imaging, organized as part of the MediaEval task series. The challenge focuses on developing Explainable Artificial Intelligence (XAI) models that answer clinically relevant questions based on GI endoscopy images while providing interpretable justifications aligned with medical reasoning. It introduces two subtasks: (1) answering diverse types of visual questions using the Kvasir-VQA-x1 dataset, and (2) generating multimodal explanations to support clinical decision-making. The Kvasir-VQA-x1 dataset, created from 6,500 images and 159,549 complex question-answer (QA) pairs, serves as the benchmark for the challenge. By combining quantitative performance metrics and expert-reviewed explainability assessments, this task aims to advance trustworthy Artificial Intelligence (AI) in medical image analysis. Instructions, data access, and an updated guide for participation are available in the official competition repository: https://github.com/simula/MediaEval-Medico-2025
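To make the two-subtask structure concrete, the following is a minimal sketch of how a QA record and a model output covering both subtasks might be represented, together with a simple normalized exact-match score. The class names, field names, sample values, and the choice of exact match are assumptions made purely for illustration; the abstract does not specify the dataset schema or the official metrics, which are documented in the competition repository.

```python
from dataclasses import dataclass


@dataclass
class VQAExample:
    """One hypothetical record: an endoscopy image reference plus a QA pair."""
    image_id: str
    question: str
    reference_answer: str


@dataclass
class Prediction:
    """Hypothetical model output covering both subtasks."""
    answer: str        # subtask 1: the answer to the visual question
    explanation: str   # subtask 2: a textual justification (illustrative only)


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial differences are ignored."""
    return " ".join(text.lower().split())


def exact_match_accuracy(examples, predictions) -> float:
    """Fraction of predictions whose normalized answer equals the reference.

    This is a stand-in metric for illustration, not the challenge's scoring.
    """
    if len(examples) != len(predictions):
        raise ValueError("examples and predictions must have the same length")
    if not examples:
        return 0.0
    hits = sum(
        normalize(ex.reference_answer) == normalize(pred.answer)
        for ex, pred in zip(examples, predictions)
    )
    return hits / len(examples)


# Toy, made-up records; they are not taken from Kvasir-VQA-x1.
examples = [
    VQAExample("img_001", "Is a polyp visible?", "Yes"),
    VQAExample("img_002", "How many instruments are present?", "One"),
]
predictions = [
    Prediction("yes", "A raised lesion is visible on the mucosa."),
    Prediction("two", "Two tools appear near the image border."),
]
accuracy = exact_match_accuracy(examples, predictions)
print(accuracy)
```

Exact match only touches the answer field. The explanation field is carried along unscored here, since the abstract says explainability is assessed through expert review rather than an automatic metric.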
