AI | Nov 12, 2025

What We Don't C: Representations for scientific discovery beyond VAEs

arXiv:2511.09433v1 | h-index: 4
Originality: Incremental advance
AI Analysis

This work addresses the challenge of analyzing and controlling latent representations in high-dimensional domains, a capability that matters for scientific exploration. The advance appears incremental because it builds on existing generative-model techniques.

The paper tackles the problem of accessing information in learned representations for scientific discovery. It introduces a method based on latent flow matching with classifier-free guidance that disentangles latent subspaces, and it reports that the method gives access to meaningful features on both synthetic and real datasets.

Accessing information in learned representations is critical for scientific discovery in high-dimensional domains. We introduce a novel method based on latent flow matching with classifier-free guidance that disentangles latent subspaces by explicitly separating information included in conditioning from information that remains in the residual representation. Across three experiments -- a synthetic 2D Gaussian toy problem, colored MNIST, and the Galaxy10 astronomy dataset -- we show that our method enables access to meaningful features of high-dimensional data. Our results highlight a simple yet powerful mechanism for analyzing, controlling, and repurposing latent representations, providing a pathway toward using generative models for scientific exploration of what we don't capture, consider, or catalog.
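
To make the two ingredients named in the abstract concrete, the following is a minimal, generic sketch of flow matching with classifier-free guidance in plain Python. It is not the authors' implementation: the straight-line probability path, the condition-dropout rate `p_drop`, the guidance weight `w`, and the stand-in `toy_model` are all assumptions chosen for illustration, and the abstract does not specify how the paper's model or residual representation is actually built.

```python
import random


def interpolate(x0, x1, t):
    """Point on the straight-line path from noise x0 (t=0) to latent x1 (t=1)."""
    return [(1.0 - t) * a + t * b for a, b in zip(x0, x1)]


def target_velocity(x0, x1):
    """Velocity of the straight-line path, d/dt x_t = x1 - x0 (constant in t)."""
    return [b - a for a, b in zip(x0, x1)]


def maybe_drop_condition(cond, p_drop, rng):
    """With probability p_drop, swap the condition for None (the 'null' condition)."""
    return None if rng.random() < p_drop else cond


def flow_matching_loss(model, x1, cond, p_drop, rng):
    """Mean squared error between predicted and target velocity for one sample."""
    x0 = [rng.gauss(0.0, 1.0) for _ in x1]       # noise endpoint
    t = rng.random()                              # time in [0, 1)
    x_t = interpolate(x0, x1, t)
    c = maybe_drop_condition(cond, p_drop, rng)   # trains cond. and uncond. jointly
    v_pred = model(x_t, t, c)
    v_true = target_velocity(x0, x1)
    return sum((p - q) ** 2 for p, q in zip(v_pred, v_true)) / len(x1)


def guided_velocity(model, x_t, t, cond, w):
    """Classifier-free guidance: v_uncond + w * (v_cond - v_uncond)."""
    v_cond = model(x_t, t, cond)
    v_uncond = model(x_t, t, None)
    return [u + w * (c - u) for c, u in zip(v_cond, v_uncond)]


def toy_model(x_t, t, cond):
    """Hypothetical stand-in for a trained network (assumption, not from the paper).

    Pulls every coordinate toward `cond`, or toward 0 when the condition is None.
    """
    shift = 0.0 if cond is None else float(cond)
    return [shift - x for x in x_t]
```

In this sketch, `w = 0` recovers the unconditional velocity, `w = 1` recovers the conditional one, and `w > 1` extrapolates further in the direction the condition indicates. Dropping the condition during training is what lets a single model supply both terms at sampling time.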

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
