IV | AR | CV | Jul 31, 2025

Smart Video Capsule Endoscopy: Raw Image-Based Localization for Enhanced GI Tract Investigation

arXiv:2507.23398v1 | 1 citation | h-index: 4 | ICONIP
AI Analysis

Classifying organs directly on raw sensor images enables longer battery life for medical capsule endoscopes and so improves small intestine investigation, though the work is an incremental hardware-software co-design approach.

The paper tackles the problem of running deep neural networks on resource-constrained edge devices such as video capsule endoscopes. It develops a CNN with only 63,000 parameters that classifies organs directly from raw Bayer images with 93.06% accuracy, and it reports an average energy saving of 89.9% before the capsule enters the small intestine compared to classic video capsules.

For many real-world applications involving low-power sensor edge devices, deep neural networks used for image classification might not be suitable. This is due to their typically large model size and a number of required operations that often exceeds the capabilities of such resource-limited devices. Furthermore, camera sensors usually capture images with a Bayer color filter applied, which are subsequently converted to the RGB images commonly used for neural network training. However, on resource-constrained devices, such conversions demand their share of energy and should ideally be skipped if possible. This work addresses the need for hardware-suitable AI targeting sensor edge devices by means of Video Capsule Endoscopy, an important medical procedure for the investigation of the small intestine, which is strongly limited by its battery lifetime. Accurate organ classification is performed with a final accuracy of 93.06%, evaluated directly on Bayer images, involving a CNN with only 63,000 parameters and time-series analysis in the form of Viterbi decoding. Finally, the process of capturing images with a camera and raw image processing is demonstrated with a customized PULPissimo System-on-Chip with a RISC-V core and an ultra-low-power hardware accelerator, providing an energy-efficient AI-based image classification approach requiring just 5.31 μJ per image. As a result, it is possible to save an average of 89.9% of energy before entering the small intestine compared to classic video capsules.
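The abstract mentions time-series analysis via Viterbi decoding on top of the per-image CNN output. The following is only a minimal sketch of that general idea, not the authors' implementation: the three organ states, the left-to-right transition matrix, the prior, and the per-frame probabilities are all hypothetical values chosen for illustration. It shows how a single misclassified frame can be corrected when the capsule is assumed never to move backwards through the GI tract.

```python
import math

def viterbi(frame_probs, trans, prior):
    """Most likely state sequence given per-frame class probabilities.

    frame_probs[t][s]: classifier probability of state s at frame t
    trans[r][s]:       probability of moving from state r to state s
    prior[s]:          probability of starting in state s
    """
    n_states = len(prior)
    log = lambda p: math.log(max(p, 1e-12))  # clamp to avoid log(0)

    # Log-score of the best path ending in each state at the first frame.
    score = [log(prior[s]) + log(frame_probs[0][s]) for s in range(n_states)]
    back = []  # back[t][s]: best predecessor of state s at frame t + 1

    for obs in frame_probs[1:]:
        new_score, ptr = [], []
        for s in range(n_states):
            best_prev = max(range(n_states),
                            key=lambda r: score[r] + log(trans[r][s]))
            new_score.append(score[best_prev] + log(trans[best_prev][s]) + log(obs[s]))
            ptr.append(best_prev)
        score = new_score
        back.append(ptr)

    # Trace back from the best final state.
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path

# Hypothetical states: 0 = stomach, 1 = small intestine, 2 = colon.
# Assumed left-to-right transitions: the capsule never moves backwards.
trans = [[0.95, 0.05, 0.00],
         [0.00, 0.95, 0.05],
         [0.00, 0.00, 1.00]]
prior = [1.0, 0.0, 0.0]

# Made-up CNN outputs; the third frame is a spurious "small intestine" vote.
frame_probs = [[0.90, 0.05, 0.05],
               [0.80, 0.10, 0.10],
               [0.30, 0.60, 0.10],
               [0.90, 0.05, 0.05],
               [0.20, 0.70, 0.10],
               [0.05, 0.90, 0.05]]

per_frame = [max(range(3), key=lambda s: p[s]) for p in frame_probs]
smoothed = viterbi(frame_probs, trans, prior)
print(per_frame)  # [0, 0, 1, 0, 1, 1]
print(smoothed)   # [0, 0, 0, 0, 1, 1]
```

In this toy run the per-frame argmax flips to the small intestine at the third frame and back again, while the decoded sequence delays the transition until the evidence is consistent.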

