LG · AI · ML · May 29

InfoAtlas: A Foundation Model for Zero-Shot Statistical Dependence Estimate

arXiv:2606.00241 · 83.1 · h-index: 1
Predicted impact: top 13% in LG · last 90 days · Originality: Highly original
AI Analysis

This work addresses the computational bottleneck of neural MI estimation for real-time applications, enabling fast dependency analysis across diverse datasets.

InfoAtlas is a foundation model that estimates mutual information in a single forward pass, achieving a 100x speedup over neural MI estimators while matching their accuracy and generalizing to real-world scenarios.

Measuring statistical dependency between high-dimensional random variables is a fundamental task in data science and machine learning. Neural mutual information (MI) estimators offer a promising avenue, but they typically require costly iterative optimization for each new dataset, making them impractical for real-time applications. We present InfoAtlas, a foundation-model-like architecture that eliminates this bottleneck by directly inferring MI in a single forward pass. Pretrained on large-scale synthetic data with rich dependence patterns, InfoAtlas learns to identify diverse dependence structures and predict MI directly from the dataset. Comprehensive experiments demonstrate that InfoAtlas matches state-of-the-art neural estimators in accuracy while achieving a $100\times$ speedup, can flexibly handle varying dimensions and sample sizes through a single unified model, and generalizes effectively to complex, real-world scenarios. By reformulating MI estimation as an inference task, InfoAtlas establishes a foundation for real-time dependency analysis.
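
To make the idea of synthetic data with a known MI concrete, here is a minimal Python sketch. It is not the InfoAtlas pretraining pipeline, whose details the abstract does not give. It assumes the simplest case: `dim` independent pairs of jointly Gaussian variables with correlation `rho`, for which MI has the closed form $-\tfrac{d}{2}\log(1-\rho^2)$ nats. All function names are hypothetical, and `plugin_gaussian_mi` is a classical covariance-based estimate used only as a sanity check, not a neural estimator.

```python
import numpy as np


def gaussian_mi(rho: float, dim: int = 1) -> float:
    """Closed-form MI in nats for `dim` independent bivariate Gaussian
    pairs, each with correlation `rho`."""
    return -0.5 * dim * np.log(1.0 - rho ** 2)


def sample_correlated_gaussians(n: int, dim: int, rho: float, seed: int = 0):
    """Draw n samples of (X, Y) where each coordinate pair (X_i, Y_i)
    is standard bivariate Gaussian with correlation rho."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal((n, dim))
    noise = rng.standard_normal((n, dim))
    y = rho * x + np.sqrt(1.0 - rho ** 2) * noise
    return x, y


def plugin_gaussian_mi(x: np.ndarray, y: np.ndarray) -> float:
    """Plug-in MI estimate that assumes joint Gaussianity:
    0.5 * (logdet(Sx) + logdet(Sy) - logdet(Sxy))."""
    d = x.shape[1]
    cov = np.cov(np.hstack([x, y]), rowvar=False)
    _, logdet_x = np.linalg.slogdet(cov[:d, :d])
    _, logdet_y = np.linalg.slogdet(cov[d:, d:])
    _, logdet_xy = np.linalg.slogdet(cov)
    return 0.5 * (logdet_x + logdet_y - logdet_xy)


if __name__ == "__main__":
    x, y = sample_correlated_gaussians(n=20_000, dim=4, rho=0.8)
    print("ground truth:", gaussian_mi(0.8, dim=4))
    print("plug-in estimate:", plugin_gaussian_mi(x, y))
```

A dataset like `(x, y)` paired with its ground-truth value is the kind of labeled example a model could be trained on to predict MI from samples. The plug-in check works only because the data really are Gaussian, which is exactly the limitation that motivates learned estimators for richer dependence patterns.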