LG, Dec 1, 2025

milearn: A Python Package for Multi-Instance Machine Learning

arXiv:2512.01287v1
Originality: Synthesis-oriented
AI Analysis

milearn gives researchers and practitioners in fields such as bioinformatics and computer vision a practical tool for working with small MIL datasets. The contribution is incremental, since it builds on existing MIL methods.

The authors address the lack of a unified Python package for multi-instance learning (MIL) by introducing milearn, which integrates classical and neural-network-based algorithms with built-in hyperparameter optimization. They demonstrate its versatility on synthetic benchmarks such as digit classification and protein-protein interaction prediction.

We introduce milearn, a Python package for multi-instance learning (MIL) that follows the familiar scikit-learn fit/predict interface while providing a unified framework for both classical and neural-network-based MIL algorithms for regression and classification. The package also includes built-in hyperparameter optimization designed specifically for small MIL datasets, enabling robust model selection in data-scarce scenarios. We demonstrate the versatility of milearn across a broad range of synthetic MIL benchmark datasets, including digit classification and regression, molecular property prediction, and protein-protein interaction (PPI) prediction. Special emphasis is placed on the key instance detection (KID) problem, for which the package provides dedicated support.
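
To make the fit/predict idea concrete for bag-structured data, here is a minimal sketch of a toy MIL classifier written in plain Python. It is not milearn's API: the class name `MeanPoolingBagClassifier`, the `key_instance` method, and the toy data are all hypothetical, and mean pooling with a nearest-centroid rule is only one simple stand-in for the classical and neural algorithms the abstract mentions. Each bag is a list of instance feature vectors, and only the bag carries a label.

```python
from statistics import fmean


class MeanPoolingBagClassifier:
    """Hypothetical toy MIL classifier (illustration only, not part of milearn)."""

    def fit(self, bags, labels):
        # Mean-pool each bag into one embedding and group embeddings by bag label.
        grouped = {}
        for bag, label in zip(bags, labels):
            grouped.setdefault(label, []).append(self._pool(bag))
        # Store one centroid per class, scikit-learn style (trailing underscore).
        self.centroids_ = {lab: self._pool(embs) for lab, embs in grouped.items()}
        return self

    def predict(self, bags):
        # Assign each bag to the class whose centroid is nearest to its embedding.
        return [self._nearest(self._pool(bag)) for bag in bags]

    def key_instance(self, bag, label):
        # Crude key-instance guess: index of the instance closest to the class centroid.
        centroid = self.centroids_[label]
        return min(range(len(bag)), key=lambda i: self._dist(bag[i], centroid))

    @staticmethod
    def _pool(vectors):
        # Column-wise mean over a list of equal-length vectors.
        return [fmean(column) for column in zip(*vectors)]

    @staticmethod
    def _dist(a, b):
        # Squared Euclidean distance.
        return sum((x - y) ** 2 for x, y in zip(a, b))

    def _nearest(self, embedding):
        return min(self.centroids_, key=lambda lab: self._dist(embedding, self.centroids_[lab]))


# Toy data (made up): bags have different sizes; labels apply to whole bags.
train_bags = [
    [[0.0, 0.1], [0.2, 0.0]],
    [[0.1, 0.2], [0.0, 0.0], [0.1, 0.1]],
    [[0.9, 1.0], [1.0, 0.8]],
    [[1.0, 1.0], [0.1, 0.0], [0.9, 0.9]],
]
train_labels = [0, 0, 1, 1]

model = MeanPoolingBagClassifier().fit(train_bags, train_labels)
predictions = model.predict([[[0.05, 0.05]], [[0.95, 0.9], [0.85, 1.0]]])
key_index = model.key_instance(train_bags[3], label=1)
print(predictions, key_index)
```

The point of the sketch is the interface shape: `fit` and `predict` take lists of variable-sized bags rather than a single feature matrix, and key instance detection is exposed as a separate per-bag query. How milearn itself represents bags or exposes KID is not stated in this summary.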
