CV, Apr 8, 2025

Enhanced Anomaly Detection for Capsule Endoscopy Using Ensemble Learning Strategies

Julia Werner, Christoph Gerum, Jorg Nick, Maxime Le Floch, Franz Brinkmann, Jochen Hampe, Oliver Bringmann

arXiv:2504.06039v2, 8.45 citations, h-index: 6, EMBC

Originality: Incremental advance

AI Analysis

This work addresses two constraints on anomaly detection in capsule endoscopy: limited training data and the small model size that an in-capsule deployment allows. It is an incremental advance toward more efficient medical diagnostics.

The paper tackles anomaly detection in capsule endoscopy by proposing an ensemble learning strategy in which each network is trained with a different loss function, achieving AUC scores of 76.86% on the Kvasir-Capsule dataset and 76.98% on the Galar dataset while using fewer parameters than the baselines.

Capsule endoscopy is a method to capture images of the gastrointestinal tract and screen for diseases that might remain hidden if investigated with standard endoscopes. Due to the limited size of a video capsule, embedding AI models directly into the capsule demands careful consideration of the model size and thus complicates anomaly detection in this field. Furthermore, the scarcity of available data in this domain poses an ongoing challenge to achieving effective anomaly detection. Thus, this work introduces an ensemble strategy to address this challenge in anomaly detection tasks in video capsule endoscopies, requiring only a small number of individual neural networks during both the training and inference phases. Ensemble learning combines the predictions of multiple independently trained neural networks. This has been shown to be highly effective in enhancing both the accuracy and robustness of machine learning models. However, it comes at the cost of higher memory usage and increased computational effort, which quickly becomes prohibitive in many real-world applications. Instead of applying the same training algorithm to each individual network, we propose using various loss functions, drawn from the anomaly detection field, to train each network. The methods are validated on the two largest publicly available datasets for video capsule endoscopy images, the Galar and Kvasir-Capsule datasets. We achieve an AUC score of 76.86% on the Kvasir-Capsule dataset and an AUC score of 76.98% on the Galar dataset. Our approach outperforms current baselines with significantly fewer parameters across all models, which is a crucial step towards incorporating artificial intelligence into capsule endoscopies.
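To make the ensemble idea concrete, here is a minimal sketch of how anomaly scores from a few independently trained networks might be combined and evaluated with AUC. The abstract does not state how the predictions are combined, so averaging min-max-normalized scores is an assumption made here for illustration. The function names (`min_max_normalize`, `ensemble_scores`, `auc`) are hypothetical, and the toy scores are invented and are not results from the paper.

```python
import numpy as np


def min_max_normalize(scores):
    """Rescale one model's anomaly scores to [0, 1] so that models are comparable."""
    scores = np.asarray(scores, dtype=float)
    span = scores.max() - scores.min()
    if span == 0:
        # A constant scorer carries no ranking information.
        return np.zeros_like(scores)
    return (scores - scores.min()) / span


def ensemble_scores(per_model_scores):
    """Average the normalized scores of all members (assumed combination rule)."""
    normalized = [min_max_normalize(s) for s in per_model_scores]
    return np.mean(normalized, axis=0)


def auc(labels, scores):
    """AUC as the probability that a random anomaly outscores a random normal
    image, with ties counted as one half."""
    labels = np.asarray(labels, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    pos = scores[labels]
    neg = scores[~labels]
    # Compare every anomalous score against every normal score.
    diff = pos[:, None] - neg[None, :]
    wins = (diff > 0).sum() + 0.5 * (diff == 0).sum()
    return float(wins / (len(pos) * len(neg)))


# Toy data: 1 marks an anomalous frame, 0 a normal one.
labels = [0, 0, 1, 1]
# Hypothetical outputs of two small networks trained with different losses;
# their score scales differ, which is why normalization comes first.
model_a = [0.1, 0.4, 0.35, 0.8]
model_b = [2.0, 1.0, 5.0, 3.0]

combined = ensemble_scores([model_a, model_b])
print("AUC model A: ", auc(labels, model_a))
print("AUC ensemble:", auc(labels, combined))
```

In this toy case the second model ranks the frame that the first one misorders correctly, so the averaged score separates the two classes better than model A alone. Whether a gain of this kind appears in practice depends on how complementary the ensemble members are.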
