CV, AI, IV | Nov 15, 2025

Prompt-Conditioned FiLM and Multi-Scale Fusion on MedSigLIP for Low-Dose CT Quality Assessment

arXiv:2511.12256v1
h-index: 10
Originality: Incremental advance
AI Analysis

This work addresses medical image quality assessment for low-dose CT scans, offering a data-efficient and adaptable method, though it appears incremental as it builds on existing MedSigLIP and FiLM techniques.

The paper tackles low-dose CT quality assessment by proposing a prompt-conditioned framework that injects textual priors via FiLM and multi-scale pooling, achieving PLCC = 0.9575, SROCC = 0.9561, and KROCC = 0.8301 on the LDCTIQA2023 dataset, surpassing prior challenge submissions.
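The three reported figures are standard correlation measures between predicted and reference quality scores: PLCC (Pearson linear correlation), SROCC (Spearman rank-order correlation), and KROCC (Kendall rank-order correlation). A minimal standard-library sketch of how such metrics can be computed follows. The function names are illustrative, and the Kendall variant shown is tau-a with no tie correction, which is an assumption: the summary does not state which variant the challenge uses.

```python
import math


def pearson(x, y):
    """PLCC: linear correlation between two equal-length score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def average_ranks(values):
    """Ranks starting at 1; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """SROCC: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def kendall_tau_a(x, y):
    """KROCC (tau-a, assumed): (concordant - discordant) / number of pairs."""
    n = len(x)
    score = 0
    for i in range(n):
        for j in range(i + 1, n):
            product = (x[i] - x[j]) * (y[i] - y[j])
            if product > 0:
                score += 1
            elif product < 0:
                score -= 1
    return score / (n * (n - 1) / 2)


# Toy scores (not from the paper): one adjacent pair is swapped.
predicted = [1.0, 3.0, 2.0, 4.0]
reference = [1.0, 2.0, 3.0, 4.0]
srocc = spearman(predicted, reference)
krocc = kendall_tau_a(predicted, reference)
```

On this toy input a single swapped pair gives an SROCC of about 0.8 and a KROCC of about 0.67, which illustrates why KROCC values typically sit below SROCC values for the same predictions, as in the numbers reported above.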

We propose a prompt-conditioned framework built on MedSigLIP that injects textual priors via Feature-wise Linear Modulation (FiLM) and multi-scale pooling. Text prompts condition patch-token features on clinical intent, enabling data-efficient learning and rapid adaptation. The architecture combines global, local, and texture-aware pooling through separate regression heads fused by a lightweight MLP, and is trained with a pairwise ranking loss. Evaluated on LDCTIQA2023 (a public LDCT quality assessment challenge) with 1,000 training images, we achieve PLCC = 0.9575, SROCC = 0.9561, and KROCC = 0.8301, surpassing the top-ranked published challenge submissions and demonstrating the effectiveness of our prompt-guided approach.
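To make two of the ingredients above concrete, here is a minimal standard-library sketch of FiLM conditioning and a pairwise ranking loss. It is not the authors' implementation: the linear projections that map a text embedding to the per-channel scale and shift, the hinge form of the loss, the margin value, and all names and toy numbers are assumptions made for illustration. FiLM itself is the per-channel affine transform `gamma * x + beta`, with `gamma` and `beta` predicted from the conditioning input.

```python
def linear(vec, weight, bias):
    """Dense layer: weight is a list of rows with shape (out_dim, in_dim)."""
    return [sum(w * v for w, v in zip(row, vec)) + b
            for row, b in zip(weight, bias)]


def film(tokens, text_emb, w_gamma, b_gamma, w_beta, b_beta):
    """Apply FiLM to every patch token.

    gamma and beta are predicted from the text embedding (assumed here to be
    plain linear projections) and shared across all tokens, so each feature
    channel is scaled and shifted according to the prompt.
    """
    gamma = linear(text_emb, w_gamma, b_gamma)
    beta = linear(text_emb, w_beta, b_beta)
    return [[g * x + b for x, g, b in zip(token, gamma, beta)]
            for token in tokens]


def pairwise_ranking_loss(pred, target, margin=0.1):
    """Mean hinge loss over all pairs whose target scores differ.

    A pair contributes zero once the predictions are ordered like the
    targets by at least `margin` (hinge form and margin are assumptions).
    """
    total, count = 0.0, 0
    for i in range(len(pred)):
        for j in range(i + 1, len(pred)):
            if target[i] == target[j]:
                continue  # tied targets carry no ordering information
            sign = 1.0 if target[i] > target[j] else -1.0
            total += max(0.0, margin - sign * (pred[i] - pred[j]))
            count += 1
    return total / count if count else 0.0


# Toy example: two patch tokens with two channels, 2-d text embedding.
tokens = [[1.0, 2.0], [3.0, 4.0]]
text_emb = [1.0, 0.0]
modulated = film(
    tokens, text_emb,
    w_gamma=[[2.0, 0.0], [0.0, 1.0]], b_gamma=[0.0, 1.0],   # gamma = [2.0, 1.0]
    w_beta=[[0.5, 0.0], [0.0, 0.0]], b_beta=[0.0, -1.0],    # beta = [0.5, -1.0]
)

ordered = pairwise_ranking_loss([0.2, 0.9, 0.5], [1, 3, 2])  # correct order
swapped = pairwise_ranking_loss([0.9, 0.2], [1, 2])          # wrong order
```

In this sketch the correctly ordered predictions incur zero loss while the swapped pair is penalised, which is the property that makes a ranking objective a natural fit for rank-based metrics such as SROCC and KROCC.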

