CVJul 4, 2025

CPKD: Clinical Prior Knowledge-Constrained Diffusion Models for Surgical Phase Recognition in Endoscopic Submucosal Dissection

arXiv:2507.03295v2h-index: 5
Originality Highly original
AI Analysis

This addresses the bottleneck of reliable phase recognition in endoscopic workflows for improving computer-assisted surgical systems, representing a novel method for a known bottleneck.

The paper tackles surgical phase recognition in endoscopic submucosal dissection by proposing CPKD, a diffusion-based generative framework that incorporates clinical prior knowledge, achieving superior or comparable performance to state-of-the-art methods on multiple datasets.

Gastrointestinal malignancies constitute a leading cause of cancer-related mortality worldwide, with advanced-stage prognosis remaining particularly dismal. Originating as a groundbreaking technique for early gastric cancer treatment, Endoscopic Submucosal Dissection has evolved into a versatile intervention for diverse gastrointestinal lesions. While computer-assisted systems significantly enhance procedural precision and safety in ESD, their clinical adoption faces a critical bottleneck: reliable surgical phase recognition within complex endoscopic workflows. Current state-of-the-art approaches predominantly rely on multi-stage refinement architectures that iteratively optimize temporal predictions. In this paper, we present Clinical Prior Knowledge-Constrained Diffusion (CPKD), a novel generative framework that reimagines phase recognition through denoising diffusion principles while preserving the core iterative refinement philosophy. This architecture progressively reconstructs phase sequences starting from random noise and conditioned on visual-temporal features. To better capture three domain-specific characteristics, including positional priors, boundary ambiguity, and relation dependency, we design a conditional masking strategy. Furthermore, we incorporate clinical prior knowledge into the model training to improve its ability to correct phase logical errors. Comprehensive evaluations on ESD820, Cholec80, and external multi-center demonstrate that our proposed CPKD achieves superior or comparable performance to state-of-the-art approaches, validating the effectiveness of diffusion-based generative paradigms for surgical phase recognition.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes