CV · AI · Nov 3, 2024

Capsule Vision Challenge 2024: Multi-Class Abnormality Classification for Video Capsule Endoscopy

arXiv:2411.01479v1 · 1 citation · h-index: 1
Originality: Synthesis-oriented
AI Analysis

This work addresses automated abnormality detection in gastrointestinal video capsule endoscopy for medical diagnosis. It is an incremental improvement whose main contribution is a flexible training pipeline.

The study tackled multi-class abnormality classification in video capsule endoscopy frames with two measures: a tiered augmentation strategy to counter data imbalance, and progressively structured training tasks that start from normal versus abnormal and add more specific classes as data availability permits. The approach, tested with ResNet50 and a custom ViT-CNN hybrid model, demonstrated a scalable pipeline for this medical imaging task.
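The tiered augmentation idea can be illustrated with a minimal sketch. The paper's actual tiers, thresholds, and transforms are not given here, so the `TIERS` table, the `assign_tiers` helper, and the class names below are all hypothetical. The sketch only decides how aggressively each class would be augmented; the transforms themselves (for example, pipelines built with the albumentations library) are left out.

```python
from collections import Counter

# Hypothetical tiers (not from the paper): each entry is
# (minimum size relative to the largest class, tier name, extra augmented copies per image).
# Entries are checked top to bottom; the first match wins.
TIERS = [
    (0.50, "light", 0),   # well-represented classes: no extra copies
    (0.10, "medium", 2),  # moderately rare classes
    (0.00, "heavy", 5),   # rarest classes get the most augmentation
]


def assign_tiers(labels):
    """Map each class label to an augmentation tier based on its relative frequency."""
    counts = Counter(labels)
    largest = max(counts.values())
    plan = {}
    for cls, n in counts.items():
        ratio = n / largest
        for min_ratio, tier, copies in TIERS:
            if ratio >= min_ratio:
                plan[cls] = {"tier": tier, "extra_copies": copies, "count": n}
                break
    return plan


# Illustrative label distribution; class names and counts are made up.
example_labels = ["normal"] * 100 + ["polyp"] * 20 + ["worms"] * 3
example_plan = assign_tiers(example_labels)
```

In a full pipeline, each tier name would select a progressively stronger transform pipeline, and `extra_copies` would control how many augmented variants of each minority-class frame are generated.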

This study presents an approach to developing a model for classifying abnormalities in video capsule endoscopy (VCE) frames. Given the challenges of data imbalance, we implemented a tiered augmentation strategy using the albumentations library to enhance minority class representation. Additionally, we addressed learning complexities by progressively structuring training tasks, allowing the model to differentiate between normal and abnormal cases and then gradually adding more specific classes based on data availability. Our pipeline, developed in PyTorch, employs a flexible architecture enabling seamless adjustments to classification complexity. We tested our approach using ResNet50 and a custom ViT-CNN hybrid model, with training conducted on the Kaggle platform. This work demonstrates a scalable approach to abnormality classification in VCE.
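The progressive task structuring described in the abstract can be sketched as a label-remapping step. This is one possible reading, not the authors' implementation: the `build_stage_mapping` helper, the `other_abnormal` bucket, and the class names are assumptions introduced for illustration. Stage 0 is binary (normal versus abnormal), and each later stage promotes the next most frequent abnormal class to its own label.

```python
from collections import Counter


def build_stage_mapping(labels, num_specific, normal="normal", other="other_abnormal"):
    """Return a dict mapping each fine-grained label to its label at a given training stage.

    num_specific = 0 yields the binary normal-vs-abnormal task; larger values
    keep that many of the most frequent abnormal classes as separate labels
    and fold the remaining abnormal classes into a shared bucket.
    """
    abnormal_counts = Counter(label for label in labels if label != normal)
    # Classes with the most data are separated out first (an assumed ordering rule).
    keep = {cls for cls, _ in abnormal_counts.most_common(num_specific)}
    mapping = {normal: normal}
    for cls in abnormal_counts:
        mapping[cls] = cls if cls in keep else other
    return mapping


def stage_classes(mapping):
    """Sorted list of distinct labels at this stage; its length sets the classifier head size."""
    return sorted(set(mapping.values()))


# Illustrative distribution; class names and counts are made up.
demo_labels = ["normal"] * 100 + ["polyp"] * 20 + ["erosion"] * 10 + ["worms"] * 3
stage0 = build_stage_mapping(demo_labels, num_specific=0)
stage1 = build_stage_mapping(demo_labels, num_specific=1)
stage3 = build_stage_mapping(demo_labels, num_specific=3)
```

With a mapping like this, a PyTorch pipeline could rebuild only the final classification layer between stages, sizing it to `len(stage_classes(mapping))`, while reusing the backbone weights learned on the coarser task.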
