LG CV IV QM ML, Jul 26, 2023

Topological Inductive Bias fosters Multiple Instance Learning in Data-Scarce Scenarios

arXiv:2307.14025v3, h-index: 6
Originality: Incremental advance
AI Analysis

This paper addresses the loss of MIL effectiveness in data-scarce settings, particularly rare disease classification, and delivers incremental improvements over existing methods.

The paper tackles the sharp drop in multiple instance learning (MIL) performance in data-scarce scenarios, such as rare disease classification, by incorporating topological inductive biases into the MIL framework. Compared to current state-of-the-art MIL models, it reports average performance improvements of 15.3% on synthetic MIL datasets, 2.8% on MIL benchmarks, and 5.5% on rare anemia classification.

Multiple instance learning (MIL) is a framework for weakly supervised classification, where labels are assigned to sets of instances, i.e., bags, rather than to individual data points. This paradigm has proven effective in tasks where fine-grained annotations are unavailable or costly to obtain. However, the effectiveness of MIL drops sharply when training data are scarce, such as for rare disease classification. To address this challenge, we propose incorporating topological inductive biases into the data representation space within the MIL framework. This bias introduces a topology-preserving constraint that encourages the instance encoder to maintain the topological structure of the instance distribution within each bag when mapping them to MIL latent space. As a result, our Topology Guided MIL (TG-MIL) method enhances the performance and generalizability of MIL classifiers across different aggregation functions, especially under scarce-data regimes. Our evaluations show average performance improvements of 15.3% for synthetic MIL datasets, 2.8% for MIL benchmarks, and 5.5% for rare anemia classification compared to current state-of-the-art MIL models, where only 17-120 samples per class are available. We make our code publicly available.
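To make the idea of a topology-preserving constraint concrete, here is a minimal sketch of one way such a per-bag penalty could be computed. The abstract does not specify the loss, so this is an assumption: it uses a simplified penalty based on 0-dimensional topology (minimum spanning tree edges), in the spirit of topological autoencoders. The names `mst_edges` and `topo_penalty` are hypothetical and are not taken from the authors' code.

```python
import math
from typing import List, Sequence, Tuple

Point = Sequence[float]


def pairwise_distances(points: Sequence[Point]) -> List[List[float]]:
    """Euclidean distance matrix for the instances of one bag."""
    return [[math.dist(p, q) for q in points] for p in points]


def mst_edges(dist: List[List[float]]) -> List[Tuple[int, int]]:
    """Edges (i, j) of a minimum spanning tree, found with Prim's algorithm.

    MST edges correspond to the merge events of connected components,
    i.e. the 0-dimensional topological features of the point cloud.
    """
    n = len(dist)
    if n < 2:
        return []
    in_tree = [False] * n
    in_tree[0] = True
    # best[k] = (cheapest distance from k to the tree, tree vertex it attaches to)
    best = [(dist[0][k], 0) for k in range(n)]
    edges = []
    for _ in range(n - 1):
        j = min((k for k in range(n) if not in_tree[k]), key=lambda k: best[k][0])
        i = best[j][1]
        edges.append((min(i, j), max(i, j)))
        in_tree[j] = True
        for k in range(n):
            if not in_tree[k] and dist[j][k] < best[k][0]:
                best[k] = (dist[j][k], j)
    return edges


def topo_penalty(instances: Sequence[Point], latents: Sequence[Point]) -> float:
    """Assumed penalty: squared mismatch of distances on the MST edges
    selected in input space and on those selected in latent space."""
    dx = pairwise_distances(instances)
    dz = pairwise_distances(latents)
    loss_x = sum((dx[i][j] - dz[i][j]) ** 2 for i, j in mst_edges(dx))
    loss_z = sum((dx[i][j] - dz[i][j]) ** 2 for i, j in mst_edges(dz))
    return 0.5 * (loss_x + loss_z)


# One bag with two clusters of instances (toy data, not from the paper).
bag = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0), (6.0, 0.0)]
preserved = [(0.0,), (1.0,), (5.0,), (6.0,)]   # encoder keeps the structure
collapsed = [(0.0,), (0.0,), (0.0,), (0.0,)]   # encoder merges everything

penalty_preserved = topo_penalty(bag, preserved)
penalty_collapsed = topo_penalty(bag, collapsed)
print(penalty_preserved, penalty_collapsed)
```

In training, a term like this would presumably be added to the bag-level classification loss with a weighting factor and computed in an autodiff framework so that gradients reach the instance encoder; the pure-Python version above only illustrates the quantity being penalized.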

Code Implementations: 1 repo