LG | Feb 25, 2022

Multi-Instance Causal Representation Learning for Instance Label Prediction and Out-of-Distribution Generalization

Weijia Zhang, Xuanhui Zhang, Han-Wen Deng, Min-Ling Zhang

arXiv:2202.12570v311.128 citations | Has Code

Originality: Highly original

AI Analysis

This work addresses the challenge of instance label prediction in multi-instance learning for applications like medical imaging or text classification, offering a novel causal approach that enhances robustness to distribution changes.

The paper tackles the problem of predicting instance labels from bag-level supervision in multi-instance learning. It treats bags as auxiliary information for identifying instance-level causal representations, and reports significant improvements on instance label prediction and out-of-distribution generalization tasks.

Multi-instance learning (MIL) deals with objects represented as bags of instances and can predict instance labels from bag-level supervision. However, significant performance gaps exist between instance-level MIL algorithms and supervised learners, since instance labels are unavailable in MIL. Most existing MIL algorithms tackle the problem by treating multi-instance bags as harmful ambiguities and predicting instance labels by reducing the inexactness of the supervision. This work studies MIL from a new perspective: it considers bags as auxiliary information and uses that information to identify instance-level causal representations from bag-level weak supervision. We propose the CausalMIL algorithm, which not only excels at instance label prediction but also provides robustness to distribution change by synergistically integrating MIL with the identifiable variational autoencoder. Our approach is based on a practical and general assumption: the prior distribution over the instance latent representations belongs to the non-factorized exponential family conditioned on the multi-instance bags. Experiments on synthetic and real-world datasets demonstrate that our approach significantly outperforms various baselines on instance label prediction and out-of-distribution generalization tasks.
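
To make the setting concrete, the following is a minimal sketch of why bag-level supervision is inexact for instance label prediction. It does not reproduce CausalMIL. It assumes the standard MIL assumption (a bag is positive if and only if at least one of its instances is positive), which the abstract does not state explicitly, and the toy data and the function names `bag_label` and `propagate_bag_label` are hypothetical.

```python
from typing import List, Sequence


def bag_label(instance_labels: Sequence[int]) -> int:
    """Standard MIL assumption (assumed here): positive iff any instance is positive."""
    return int(any(instance_labels))


def propagate_bag_label(bag_size: int, label: int) -> List[int]:
    """Naive baseline: assign the bag's label to every instance in the bag."""
    return [label] * bag_size


# Hypothetical toy data: instance labels that a MIL learner never observes.
hidden = [[0, 0, 1], [0, 0, 0], [1, 1, 0, 0]]

# Only these bag-level labels are available as supervision.
bag_labels = [bag_label(bag) for bag in hidden]

# Instance predictions from the naive baseline, then its instance-level errors.
naive = [propagate_bag_label(len(bag), y) for bag, y in zip(hidden, bag_labels)]
errors = sum(
    p != t
    for pred, true in zip(naive, hidden)
    for p, t in zip(pred, true)
)

print(bag_labels)  # bag-level supervision
print(errors)      # number of instances the naive baseline mislabels
```

In this toy example, every error of the naive baseline falls on a negative instance inside a positive bag. That is the kind of gap between bag-level and instance-level supervision that the abstract describes.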
