CV | Feb 11

Med-SegLens: Latent-Level Model Diffing for Interpretable Medical Image Segmentation

arXiv:2602.10508v1
h-index: 13
Originality: Incremental advance
AI Analysis

This paper addresses interpretability and dataset shift in medical image segmentation and offers a practical tool for researchers and clinicians. It is an incremental advance because it builds on existing segmentation models.

The paper tackles the opacity of medical image segmentation models by introducing Med-SegLens, a framework that uses latent features to diagnose failures and mitigate dataset shift. Targeted latent-level interventions recover performance in 70% of failure cases and improve the Dice score from 39.4% to 74.2%.
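For readers unfamiliar with the metric quoted above, the Dice score measures overlap between a predicted mask and a reference mask as twice the intersection divided by the sum of the two mask sizes. The snippet below is a minimal sketch for binary masks; the function name `dice_score` and the convention of returning 1.0 when both masks are empty are assumptions made for illustration, not details taken from the paper.

```python
import numpy as np

def dice_score(pred, target):
    """Dice coefficient for two binary masks of the same shape."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    intersection = np.logical_and(pred, target).sum()
    denom = pred.sum() + target.sum()
    if denom == 0:
        # Assumed convention: two empty masks count as a perfect match.
        return 1.0
    return 2.0 * intersection / denom

# Toy example: two predicted foreground pixels, one of which is correct.
pred = np.array([[1, 1], [0, 0]])
target = np.array([[1, 0], [0, 0]])
score = dice_score(pred, target)  # 2 * 1 / (2 + 1)
```

A score of 74.2% therefore means the overlap term is roughly three quarters of the average mask size, compared with well under half at 39.4%.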

Modern segmentation models achieve strong predictive performance but remain largely opaque, limiting our ability to diagnose failures, understand dataset shift, or intervene in a principled manner. We introduce Med-SegLens, a model-diffing framework that decomposes segmentation model activations into interpretable latent features using sparse autoencoders trained on SegFormer and U-Net. Through cross-architecture and cross-dataset latent alignment across healthy, adult, pediatric, and sub-Saharan African glioma cohorts, we identify a stable backbone of shared representations, while dataset shift is driven by differential reliance on population-specific latents. We show that these latents act as causal bottlenecks for segmentation failures, and that targeted latent-level interventions can correct errors and improve cross-dataset adaptation without retraining, recovering performance in 70% of failure cases and improving Dice score from 39.4% to 74.2%. Our results demonstrate that latent-level model diffing provides a practical and mechanistic tool for diagnosing failures and mitigating dataset shift in segmentation models.
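The abstract does not give implementation details, so the following is only a rough sketch of what a latent-level intervention with a sparse autoencoder could look like. It assumes a generic single-layer ReLU autoencoder over activation vectors; the function names, weight shapes, and the choice to add the reconstruction residual back after editing are all hypothetical and are not taken from Med-SegLens.

```python
import numpy as np

def sae_encode(x, W_enc, b_enc):
    """Map activations (n, d) to non-negative latent features (n, k)."""
    return np.maximum(x @ W_enc + b_enc, 0.0)

def sae_decode(z, W_dec, b_dec):
    """Map latent features (n, k) back to activation space (n, d)."""
    return z @ W_dec + b_dec

def intervene(x, W_enc, b_enc, W_dec, b_dec, latent_idx, scale=0.0):
    """Rescale one latent and return the edited activations.

    scale=0.0 ablates the latent, scale=1.0 leaves activations unchanged.
    The part of x the autoencoder cannot reconstruct is added back so that
    only the chosen latent's contribution changes (an assumed design choice).
    """
    z = sae_encode(x, W_enc, b_enc)
    residual = x - sae_decode(z, W_dec, b_dec)
    z_edit = z.copy()
    z_edit[..., latent_idx] *= scale
    return sae_decode(z_edit, W_dec, b_dec) + residual

# Toy usage with random weights standing in for a trained autoencoder.
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))            # 4 activation vectors of width 8
W_enc = rng.normal(size=(8, 16))
b_enc = np.zeros(16)
W_dec = rng.normal(size=(16, 8))
b_dec = np.zeros(8)
edited = intervene(x, W_enc, b_enc, W_dec, b_dec, latent_idx=3, scale=0.0)
```

In a real pipeline the edited activations would be written back into the segmentation model's forward pass, for example through a hook on the layer the autoencoder was trained on, and the effect measured on the output mask. No retraining of the segmentation model is involved in this step, which matches the abstract's framing.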

