Ilya Nemenman

AI
11papers
65citations
Novelty48%
AI Score46

11 Papers

AIDec 29, 2022
Intrinsic Motivation in Dynamical Control Systems

Stas Tiomkin, Ilya Nemenman, Daniel Polani et al.

Biological systems often choose actions without an explicit reward signal, a phenomenon known as intrinsic motivation. The computational principles underlying this behavior remain poorly understood. In this study, we investigate an information-theoretic approach to intrinsic motivation, based on maximizing an agent's empowerment (the mutual information between its past actions and future states). We show that this approach generalizes previous attempts to formalize intrinsic motivation, and we provide a computationally efficient algorithm for computing the necessary quantities. We test our approach on several benchmark control problems, and we explain its success in guiding intrinsically motivated behaviors by relating our information-theoretic control function to fundamental properties of the dynamical system representing the combined agent-environment system. This opens the door for designing practical artificial, intrinsically motivated controllers and for linking animal behaviors to their dynamical properties.

LGOct 5, 2023
Deep Variational Multivariate Information Bottleneck -- A Framework for Variational Losses

Eslam Abdelaleem, Ilya Nemenman, K. Michael Martini

Variational dimensionality reduction methods are widely used for their accuracy, generative capabilities, and robustness. We introduce a unifying framework that generalizes both such as traditional and state-of-the-art methods. The framework is based on an interpretation of the multivariate information bottleneck, trading off the information preserved in an encoder graph (defining what to compress) against that in a decoder graph (defining a generative model for data). Using this approach, we rederive existing methods, including the deep variational information bottleneck, variational autoencoders, and deep multiview information bottleneck. We naturally extend the deep variational CCA (DVCCA) family to beta-DVCCA and introduce a new method, the deep variational symmetric information bottleneck (DVSIB). DSIB, the deterministic limit of DVSIB, connects to modern contrastive learning approaches such as Barlow Twins, among others. We evaluate these methods on Noisy MNIST and Noisy CIFAR-100, showing that algorithms better matched to the structure of the problem like DVSIB and beta-DVCCA produce better latent spaces as measured by classification accuracy, dimensionality of the latent variables, sample efficiency, and consistently outperform other approaches under comparable conditions. Additionally, we benchmark against state-of-the-art models, achieving superior or competitive accuracy. Our results demonstrate that this framework can seamlessly incorporate diverse multi-view representation learning algorithms, providing a foundation for designing novel, problem-specific loss functions.

AIApr 22
Multi-Agent Empowerment and Emergence of Complex Behavior in Groups

Tristan Shah, Ilya Nemenman, Daniel Polani et al.

Intrinsic motivations are receiving increasing attention, i.e. behavioral incentives that are not engineered, but emerge from the interaction of an agent with its surroundings. In this work we study the emergence of behaviors driven by one such incentive, empowerment, specifically in the context of more than one agent. We formulate a principled extension of empowerment to the multi-agent setting, and demonstrate its efficient calculation. We observe that this intrinsic motivation gives rise to characteristic modes of group-organization in two qualitatively distinct environments: a pair of agents coupled by a tendon, and a controllable Vicsek flock. This demonstrates the potential of intrinsic motivations such as empowerment to not just drive behavior for only individual agents but also higher levels of behavioral organization at scale.

PLASM-PHOct 8, 2023
Physics-tailored machine learning reveals unexpected physics in dusty plasmas

Wentao Yu, Eslam Abdelaleem, Ilya Nemenman et al.

Dusty plasma is a mixture of ions, electrons, and macroscopic charged particles that is commonly found in space and planetary environments. The particles interact through Coulomb forces mediated by the surrounding plasma, and as a result, the effective forces between particles can be non-conservative and non-reciprocal. Machine learning (ML) models are a promising route to learn these complex forces, yet their structure should match the underlying physical constraints to provide useful insight. Here we demonstrate and experimentally validate an ML approach that incorporates physical intuition to infer force laws in a laboratory dusty plasma. Trained on 3D particle trajectories, the model accounts for inherent symmetries, non-identical particles, and learns the effective non-reciprocal forces between particles with exquisite accuracy (R^2>0.99). We validate the model by inferring particle masses in two independent yet consistent ways. The model's accuracy enables precise measurements of particle charge and screening length, discovering large deviations from common theoretical assumptions. Our ability to identify new physics from experimental data demonstrates how ML-powered approaches can guide new routes of scientific discovery in many-body systems. Furthermore, we anticipate our ML approach to be a starting point for inferring laws from dynamics in a wide range of many-body systems, from colloids to living organisms.

ITSep 11, 2023
Data efficiency, dimensionality reduction, and the generalized symmetric information bottleneck

K. Michael Martini, Ilya Nemenman

The Symmetric Information Bottleneck (SIB), an extension of the more familiar Information Bottleneck, is a dimensionality reduction technique that simultaneously compresses two random variables to preserve information between their compressed versions. We introduce the Generalized Symmetric Information Bottleneck (GSIB), which explores different functional forms of the cost of such simultaneous reduction. We then explore the dataset size requirements of such simultaneous compression. We do this by deriving bounds and root-mean-squared estimates of statistical fluctuations of the involved loss functions. We show that, in typical situations, the simultaneous GSIB compression requires qualitatively less data to achieve the same errors compared to compressing variables one at a time. We suggest that this is an example of a more general principle that simultaneous compression is more data efficient than independent compression of each of the input variables.

DATA-ANApr 27
Information bottleneck for learning the phase space of dynamics from high-dimensional experimental data

K. Michael Martini, Eslam Abdelaleem, Paarth Gulati et al.

Identifying the dynamical state variables of a system from high-dimensional observations is a central problem across physical sciences. The challenge is that the state variables are not directly observable and must be inferred from raw high-dimensional data without supervision. Here we introduce DySIB (Dynamical Symmetric Information Bottleneck) as a method to learn low-dimensional representations of time-series data by maximizing predictive mutual information between past and future observation windows while penalizing representation complexity. This objective operates entirely in latent space and avoids reconstruction of the observations. We apply DySIB to an experimental video dataset of a physical pendulum, where the underlying state space is known. The method, with hyperparameters of the learning architecture set self-consistently by the data, recovers a two-dimensional representation that matches the dimensionality, topology, and geometry of the pendulum phase space, with the learned coordinates aligning smoothly with the canonical angle and angular velocity. These results demonstrate, on a well-characterized experimental system, that predictive information in latent space can be used to recover interpretable dynamical coordinates directly from high-dimensional data.

DIS-NNJul 29, 2025
Better Together: Cross and Joint Covariances Enhance Signal Detectability in Undersampled Data

Arabind Swain, Sean Alexander Ridout, Ilya Nemenman

Many data-science applications involve detecting a shared signal between two high-dimensional variables. Using random matrix theory methods, we determine when such signal can be detected and reconstructed from sample correlations, despite the background of sampling noise induced correlations. We consider three different covariance matrices constructed from two high-dimensional variables: their individual self covariance, their cross covariance, and the self covariance of the concatenated (joint) variable, which incorporates the self and the cross correlation blocks. We observe the expected Baik, Ben Arous, and Péché detectability phase transition in all these covariance matrices, and we show that joint and cross covariance matrices always reconstruct the shared signal earlier than the self covariances. Whether the joint or the cross approach is better depends on the mismatch of dimensionalities between the variables. We discuss what these observations mean for choosing the right method for detecting linear correlations in data and how these findings may generalize to nonlinear statistical dependencies.

DATA-ANMay 7, 2023
Inferring Local Structure from Pairwise Correlations

Mahajabin Rahman, Ilya Nemenman

To construct models of large, multivariate complex systems, such as those in biology, one needs to constrain which variables are allowed to interact. This can be viewed as detecting "local" structures among the variables. In the context of a simple toy model of 2D natural and synthetic images, we show that pairwise correlations between the variables -- even when severely undersampled -- provide enough information to recover local relations, including the dimensionality of the data, and to reconstruct arrangement of pixels in fully scrambled images. This proves to be successful even though higher order interaction structures are present in our data. We build intuition behind the success, which we hope might contribute to modeling complex, multivariate systems and to explaining the success of modern attention-based machine learning approaches.

NCSep 25, 2018
Automated, predictive, and interpretable inference of C. elegans escape dynamics

Bryan C. Daniels, William S. Ryu, Ilya Nemenman

The roundworm C. elegans exhibits robust escape behavior in response to rapidly rising temperature. The behavior lasts for a few seconds, shows history dependence, involves both sensory and motor systems, and is too complicated to model mechanistically using currently available knowledge. Instead we model the process phenomenologically, and we use the Sir Isaac dynamical inference platform to infer the model in a fully automated fashion directly from experimental data. The inferred model requires incorporation of an unobserved dynamical variable, and is biologically interpretable. The model makes accurate predictions about the dynamics of the worm behavior, and it can be used to characterize the functional logic of the dynamical system underlying the escape response. This work illustrates the power of modern artificial intelligence to aid in discovery of accurate and interpretable models of complex natural systems.

QMApr 24, 2014
Automated adaptive inference of coarse-grained dynamical models in systems biology

Bryan C. Daniels, Ilya Nemenman

Cellular regulatory dynamics is driven by large and intricate networks of interactions at the molecular scale, whose sheer size obfuscates understanding. In light of limited experimental data, many parameters of such dynamics are unknown, and thus models built on the detailed, mechanistic viewpoint overfit and are not predictive. At the other extreme, simple ad hoc models of complex processes often miss defining features of the underlying systems. Here we propose an approach that instead constructs phenomenological, coarse-grained models of network dynamics that automatically adapt their complexity to the amount of available data. Such adaptive models lead to accurate predictions even when microscopic details of the studied systems are unknown due to insufficient data. The approach is computationally tractable, even for a relatively large number of dynamical variables, allowing its software realization, named Sir Isaac, to make successful predictions even when important dynamic variables are unobserved. For example, it matches the known phase space structure for simulated planetary motion data, avoids overfitting in a complex biological signaling system, and produces accurate predictions for a yeast glycolysis model with only tens of data points and over half of the interacting species unobserved.

NCOct 4, 2013
Director Field Model of the Primary Visual Cortex for Contour Detection

Vijay Singh, Martin Tchernookov, Rebecca Butterfield et al.

We aim to build the simplest possible model capable of detecting long, noisy contours in a cluttered visual scene. For this, we model the neural dynamics in the primate primary visual cortex in terms of a continuous director field that describes the average rate and the average orientational preference of active neurons at a particular point in the cortex. We then use a linear-nonlinear dynamical model with long range connectivity patterns to enforce long-range statistical context present in the analyzed images. The resulting model has substantially fewer degrees of freedom than traditional models, and yet it can distinguish large contiguous objects from the background clutter by suppressing the clutter and by filling-in occluded elements of object contours. This results in high-precision, high-recall detection of large objects in cluttered scenes. Parenthetically, our model has a direct correspondence with the Landau - de Gennes theory of nematic liquid crystal in two dimensions.