CV · Oct 26, 2022

Rapid and robust endoscopic content area estimation: A lean GPU-based pipeline and curated benchmark dataset

Charlie Budd, Luis C. Garcia-Peraza-Herrera, Martin Huber, Sebastien Ourselin, Tom Vercauteren

arXiv:2210.14771v1 · 6.58 citations · h-index: 114 · Has Code

Originality: Incremental advance

AI Analysis

This work addresses a common but challenging task in endoscopic image processing for medical applications, providing a curated dataset and efficient algorithms to enable further research.

The paper tackles the problem of estimating the informative content area in endoscopic footage by proposing a lean GPU-based pipeline in two variants and by introducing a first-of-its-kind benchmark dataset. The results show significant improvements over a state-of-the-art U-Net-based approach: the Hausdorff distance, used as the accuracy measure, drops from 118.1 px to 6.3 px, and the computation time per frame drops from 11.2 ms to 0.13 ms.
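For readers unfamiliar with the accuracy measure, the following is a minimal sketch of a symmetric Hausdorff distance between two circular content area boundaries. It is an illustration only, not the paper's evaluation code: the helper names `circle_boundary` and `hausdorff`, the sampling density, and the example circles are all assumptions.

```python
import numpy as np


def circle_boundary(cx, cy, r, n=360):
    """Sample n points on a circle boundary (hypothetical helper)."""
    t = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
    return np.column_stack([cx + r * np.cos(t), cy + r * np.sin(t)])


def hausdorff(a, b):
    """Symmetric Hausdorff distance between point sets a (N, 2) and b (M, 2)."""
    # Pairwise Euclidean distances, shape (N, M).
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    # Largest nearest-neighbour distance, taken in both directions.
    return max(d.min(axis=1).max(), d.min(axis=0).max())


# Assumed example: the estimate has the right radius but its centre is
# shifted by (3, 4) px, i.e. 5 px in Euclidean terms.
truth = circle_boundary(160.0, 120.0, 100.0)
estimate = circle_boundary(163.0, 124.0, 100.0)
dist = hausdorff(truth, estimate)
print(f"Hausdorff distance: {dist:.2f} px")
```

Because the measure takes the worst-case boundary mismatch, a single badly placed part of the boundary dominates the score, which is why lower values indicate a tighter fit.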

Endoscopic content area refers to the informative area enclosed by the dark, non-informative border regions present in most endoscopic footage. Estimating the content area is a common task in endoscopic image processing and computer vision pipelines. Despite the apparent simplicity of the problem, several factors make reliable real-time estimation surprisingly challenging. The lack of rigorous investigation into the topic, combined with the lack of a common benchmark dataset for this task, has been a long-standing issue in the field. In this paper, we propose two variants of a lean GPU-based computational pipeline combining edge detection and circle fitting. The two variants differ in how they extract content area edge point candidates: one relies on handcrafted features and the other on learned features. We also present a first-of-its-kind dataset of manually annotated and pseudo-labelled content areas across a range of surgical indications. To encourage further developments, the curated dataset and an implementation of both algorithms have been made public (https://doi.org/10.7303/syn32148000, https://github.com/charliebudd/torch-content-area). We compare our proposed algorithm with a state-of-the-art U-Net-based approach and demonstrate significant improvement in terms of both accuracy (Hausdorff distance: 6.3 px versus 118.1 px) and computational time (average runtime per frame: 0.13 ms versus 11.2 ms).
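The general idea of combining edge point candidates with circle fitting can be sketched as follows. This is a CPU-only NumPy illustration under stated assumptions, not the authors' GPU implementation and not the API of the linked repository: the row-scan intensity threshold stands in for the handcrafted features, the algebraic least-squares (Kasa) fit stands in for the fitting step, and the names `edge_candidates` and `fit_circle` are hypothetical.

```python
import numpy as np


def edge_candidates(gray, threshold=30.0):
    """Per row, keep the first and last pixel brighter than the threshold.

    Assumption: a plain intensity threshold is used here as a stand-in
    for the handcrafted edge features described in the abstract.
    """
    bright = gray > threshold
    points = []
    for y in range(gray.shape[0]):
        xs = np.flatnonzero(bright[y])
        if xs.size:
            points.append((xs[0], y))   # left border crossing
            points.append((xs[-1], y))  # right border crossing
    return np.asarray(points, dtype=np.float64)


def fit_circle(points):
    """Algebraic least-squares circle fit.

    Solves x^2 + y^2 = 2*a*x + 2*b*y + c for (a, b, c), where (a, b)
    is the centre and c = r^2 - a^2 - b^2.
    """
    x, y = points[:, 0], points[:, 1]
    A = np.column_stack([2.0 * x, 2.0 * y, np.ones_like(x)])
    rhs = x**2 + y**2
    (a, b, c), *_ = np.linalg.lstsq(A, rhs, rcond=None)
    return a, b, np.sqrt(c + a**2 + b**2)


# Synthetic frame: a bright disc (content area) on a dark border.
h, w = 240, 320
yy, xx = np.mgrid[0:h, 0:w]
inside = (xx - 160.0) ** 2 + (yy - 120.0) ** 2 <= 100.0**2
frame = np.where(inside, 200.0, 5.0)

cx, cy, radius = fit_circle(edge_candidates(frame))
print(f"centre=({cx:.1f}, {cy:.1f}), radius={radius:.1f}")
```

A plain least-squares fit like this one is sensitive to outlier edge points, for example from dark tissue or specular highlights near the border, so a practical pipeline would likely need some form of outlier rejection on top of it.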
