CV, AI · Nov 3, 2024

Capsule Vision Challenge 2024: Multi-Class Abnormality Classification for Video Capsule Endoscopy

Aakarsh Bansal, Bhuvanesh Singla, Raajan Rajesh Wankhade, Nagamma Patil

arXiv:2411.01479v1 · 3.71 citations · h-index: 1

Originality: Synthesis-oriented

AI Analysis

This work addresses automated abnormality detection in gastrointestinal video capsule endoscopy for medical diagnosis. It represents an incremental improvement, built around a flexible training pipeline.

The study tackled multi-class abnormality classification in video capsule endoscopy frames by implementing a tiered augmentation strategy to address data imbalance and progressively structured training tasks to handle learning complexities. The approach, tested with ResNet50 and a custom ViT-CNN hybrid model, demonstrated a scalable pipeline for this medical imaging task.
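One way to picture the progressively structured training tasks is as a label schedule: the first stage separates normal from abnormal frames, and each later stage gives one more abnormal class its own label, starting with the classes that have the most data. The sketch below is only an illustration of that idea, not the authors' code. The function name `stage_label_map`, the class names, and the counts are all hypothetical.

```python
def stage_label_map(class_counts, num_specific, normal="normal", other="abnormal"):
    """Map each original class to the label used at one training stage.

    num_specific = 0 gives the binary normal-vs-abnormal task; each
    increment promotes the next most frequent abnormal class to its own label.
    """
    # Abnormal classes ordered from most to least frequent (hypothetical rule).
    abnormal = sorted(
        (name for name in class_counts if name != normal),
        key=lambda name: -class_counts[name],
    )
    specific = set(abnormal[:num_specific])
    mapping = {normal: normal}
    for name in abnormal:
        # Classes not yet promoted stay in the shared "abnormal" bucket.
        mapping[name] = name if name in specific else other
    return mapping


# Made-up counts, for illustration only.
counts = {"normal": 1000, "erosion": 200, "polyp": 80, "worms": 10}

stage0 = stage_label_map(counts, num_specific=0)  # binary task
stage1 = stage_label_map(counts, num_specific=1)  # erosion gets its own label
stage3 = stage_label_map(counts, num_specific=3)  # every class is specific

num_outputs = [len(set(m.values())) for m in (stage0, stage1, stage3)]
print(stage1)
print(num_outputs)
```

In a PyTorch pipeline, the number of distinct labels per stage could set the size of the classifier's final layer, which is one plausible reading of the adjustable classification complexity described above.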

This study presents an approach to developing a model for classifying abnormalities in video capsule endoscopy (VCE) frames. Given the challenges of data imbalance, we implemented a tiered augmentation strategy using the albumentations library to enhance minority class representation. Additionally, we addressed learning complexities by progressively structuring training tasks, allowing the model to differentiate between normal and abnormal cases and then gradually adding more specific classes based on data availability. Our pipeline, developed in PyTorch, employs a flexible architecture enabling seamless adjustments to classification complexity. We tested our approach using ResNet50 and a custom ViT-CNN hybrid model, with training conducted on the Kaggle platform. This work demonstrates a scalable approach to abnormality classification in VCE.
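The abstract does not spell out how the augmentation tiers are defined. A minimal sketch of one possible tiering rule, assuming tiers are chosen by each class's size relative to the largest class, is given below. The thresholds, the copy counts, the function name `copies_per_class`, and the sample labels are all assumptions made for illustration. The actual image transforms, which the authors apply with albumentations, are left out so the example runs with the standard library alone.

```python
from collections import Counter

# Hypothetical tiers: (minimum share of the largest class, augmented copies per image).
# Listed from best represented to rarest; the first matching tier wins.
TIERS = [
    (0.50, 0),  # well represented: no extra copies
    (0.10, 2),  # moderately rare: light augmentation
    (0.00, 5),  # very rare: heavy augmentation
]


def copies_per_class(labels):
    """Return how many augmented copies to generate per image of each class."""
    counts = Counter(labels)
    largest = max(counts.values())
    plan = {}
    for name, n in counts.items():
        share = n / largest
        for min_share, copies in TIERS:
            if share >= min_share:
                plan[name] = copies
                break
    return plan


# Made-up label list, for illustration only.
labels = ["normal"] * 100 + ["erosion"] * 20 + ["worms"] * 4
plan = copies_per_class(labels)
print(plan)
```

With a plan like this, each minority-class image would be passed through an augmentation pipeline the stated number of times, so rarer classes receive proportionally more synthetic variation.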
