Eslam Abdelaleem

LG
h-index: 3
4 papers
18 citations
Novelty: 51%
AI Score: 39

4 Papers

LG Oct 5, 2023
Deep Variational Multivariate Information Bottleneck -- A Framework for Variational Losses

Eslam Abdelaleem, Ilya Nemenman, K. Michael Martini

Variational dimensionality reduction methods are widely used for their accuracy, generative capabilities, and robustness. We introduce a unifying framework that generalizes both traditional and state-of-the-art methods of this kind. The framework is based on an interpretation of the multivariate information bottleneck, trading off the information preserved in an encoder graph (defining what to compress) against that in a decoder graph (defining a generative model for data). Using this approach, we rederive existing methods, including the deep variational information bottleneck, variational autoencoders, and deep multiview information bottleneck. We naturally extend the deep variational CCA (DVCCA) family to beta-DVCCA and introduce a new method, the deep variational symmetric information bottleneck (DVSIB). DSIB, the deterministic limit of DVSIB, connects to modern contrastive learning approaches such as Barlow Twins, among others. We evaluate these methods on Noisy MNIST and Noisy CIFAR-100, showing that algorithms better matched to the structure of the problem, such as DVSIB and beta-DVCCA, produce better latent spaces as measured by classification accuracy, dimensionality of the latent variables, and sample efficiency, and that they consistently outperform other approaches under comparable conditions. Additionally, we benchmark against state-of-the-art models, achieving superior or competitive accuracy. Our results demonstrate that this framework can seamlessly incorporate diverse multi-view representation learning algorithms, providing a foundation for designing novel, problem-specific loss functions.
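The encoder/decoder trade-off described in this abstract can be illustrated with a minimal sketch. It assumes a diagonal Gaussian encoder and a standard normal prior, for which the KL divergence is a standard variational upper bound on the compression term; the function names and the single `beta` weighting are hypothetical and are not taken from the paper, whose exact estimators the abstract does not specify.

```python
import math


def gaussian_kl_to_standard_normal(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ) in nats for one sample.

    Averaged over data, this is a standard variational upper bound on the
    information an encoder keeps about its input (the compression term).
    """
    return 0.5 * sum(
        m * m + math.exp(lv) - lv - 1.0 for m, lv in zip(mu, log_var)
    )


def bottleneck_loss(compression, preserved, beta):
    """Generic bottleneck trade-off: penalize compression-term information,
    reward decoder-term information, with beta setting the balance.

    Both inputs are assumed to be information estimates in the same units.
    """
    return compression - beta * preserved


# Hypothetical encoder output for a single sample with a 2D latent space.
kl = gaussian_kl_to_standard_normal(mu=[1.0, 0.0], log_var=[0.0, 0.0])
loss = bottleneck_loss(compression=kl, preserved=2.0, beta=0.5)
```

Larger `beta` favors keeping information that the decoder side can use; smaller `beta` favors stronger compression.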

PLASM-PH Oct 8, 2023
Physics-tailored machine learning reveals unexpected physics in dusty plasmas

Wentao Yu, Eslam Abdelaleem, Ilya Nemenman et al.

Dusty plasma is a mixture of ions, electrons, and macroscopic charged particles that is commonly found in space and planetary environments. The particles interact through Coulomb forces mediated by the surrounding plasma, and as a result, the effective forces between particles can be non-conservative and non-reciprocal. Machine learning (ML) models are a promising route to learn these complex forces, yet their structure should match the underlying physical constraints to provide useful insight. Here we demonstrate and experimentally validate an ML approach that incorporates physical intuition to infer force laws in a laboratory dusty plasma. Trained on 3D particle trajectories, the model accounts for inherent symmetries and non-identical particles, and learns the effective non-reciprocal forces between particles with exquisite accuracy (R^2 > 0.99). We validate the model by inferring particle masses in two independent yet consistent ways. The model's accuracy enables precise measurements of particle charge and screening length, revealing large deviations from common theoretical assumptions. Our ability to identify new physics from experimental data demonstrates how ML-powered approaches can guide new routes of scientific discovery in many-body systems. Furthermore, we anticipate our ML approach to be a starting point for inferring laws from dynamics in a wide range of many-body systems, from colloids to living organisms.
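To make "particle charge and screening length" concrete, the sketch below evaluates a textbook screened Coulomb (Yukawa) pair force. This is an assumption for illustration only: the abstract does not name its theoretical baseline, and this form is reciprocal and conservative by construction, unlike the learned forces it describes. The function names and the example values are hypothetical.

```python
import math

EPS0 = 8.8541878128e-12  # vacuum permittivity in F/m


def coulomb_force(q1, q2, r):
    """Unscreened Coulomb force magnitude (N) between point charges (C) at r (m)."""
    return q1 * q2 / (4.0 * math.pi * EPS0 * r * r)


def yukawa_force(q1, q2, r, screening_length):
    """Force from the screened potential U(r) = q1*q2 / (4*pi*eps0*r) * exp(-r/lambda).

    Taking F = -dU/dr gives the Coulomb force times (1 + r/lambda) * exp(-r/lambda).
    """
    x = r / screening_length
    return coulomb_force(q1, q2, r) * (1.0 + x) * math.exp(-x)


# Illustrative values only: equal charges, separation equal to the screening length.
q = 1.0e-15  # C
r = 1.0e-3   # m
ratio = yukawa_force(q, q, r, screening_length=1.0e-3) / coulomb_force(q, q, r)
```

At a separation of one screening length the screened force is the Coulomb force scaled by 2/e, so fitting both the prefactor and the decay gives separate handles on charge and screening length.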

60.4 DATA-AN Apr 27
Information bottleneck for learning the phase space of dynamics from high-dimensional experimental data

K. Michael Martini, Eslam Abdelaleem, Paarth Gulati et al.

Identifying the dynamical state variables of a system from high-dimensional observations is a central problem across the physical sciences. The challenge is that the state variables are not directly observable and must be inferred from raw high-dimensional data without supervision. Here we introduce DySIB (Dynamical Symmetric Information Bottleneck), a method that learns low-dimensional representations of time-series data by maximizing predictive mutual information between past and future observation windows while penalizing representation complexity. This objective operates entirely in latent space and avoids reconstruction of the observations. We apply DySIB to an experimental video dataset of a physical pendulum, where the underlying state space is known. The method, with hyperparameters of the learning architecture set self-consistently by the data, recovers a two-dimensional representation that matches the dimensionality, topology, and geometry of the pendulum phase space, with the learned coordinates aligning smoothly with the canonical angle and angular velocity. These results demonstrate, on a well-characterized experimental system, that predictive information in latent space can be used to recover interpretable dynamical coordinates directly from high-dimensional data.
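The pairing of past and future observation windows mentioned in this abstract might look like the sketch below. The helper name, the equal window lengths, and the stride of one frame are assumptions for illustration; the abstract does not specify how windows are constructed, and the encoders and mutual information estimator are not shown.

```python
def past_future_pairs(frames, window):
    """Split a time-ordered sequence into adjacent (past, future) windows.

    Each pair shares a boundary time t: the past window covers
    frames[t - window:t] and the future window covers frames[t:t + window].
    """
    if window <= 0:
        raise ValueError("window must be a positive integer")
    pairs = []
    for t in range(window, len(frames) - window + 1):
        pairs.append((frames[t - window:t], frames[t:t + window]))
    return pairs


# Integers stand in for video frames in this toy example.
pairs = past_future_pairs(list(range(6)), window=2)
```

In a symmetric bottleneck setup, each element of a pair would be passed through its own encoder, and the objective would reward mutual information between the two latent codes while penalizing their complexity.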

LG Oct 23, 2024
Simultaneous Dimensionality Reduction for Extracting Useful Representations of Large Empirical Multimodal Datasets

Eslam Abdelaleem

The quest for simplification in physics drives the exploration of concise mathematical representations for complex systems. This dissertation focuses on dimensionality reduction as a means to obtain low-dimensional descriptions from high-dimensional data, facilitating comprehension and analysis. We address the challenges posed by real-world data that defy conventional assumptions, such as complex interactions within neural systems or high-dimensional dynamical systems. Leveraging insights from both theoretical physics and machine learning, this work unifies diverse reduction methods under a comprehensive framework, the Deep Variational Multivariate Information Bottleneck. This framework enables the design of tailored reduction algorithms based on specific research questions. We show that simultaneous reduction approaches capture covariation between multiple modalities better than their independent reduction counterparts while requiring less data. We also introduce novel techniques, such as the Deep Variational Symmetric Information Bottleneck, for general nonlinear simultaneous reduction. We show that the same principle of simultaneous reduction is the key to efficient estimation of mutual information. We show that our new method can discover the coordinates of high-dimensional observations of dynamical systems. Through analytical investigations and empirical validations, we shed light on the intricacies of dimensionality reduction methods, paving the way for enhanced data analysis across various domains. We underscore the potential of these methodologies to extract meaningful insights from complex datasets, driving advancements in fundamental research and applied sciences. As these methods evolve, they promise to deepen our understanding of complex systems and inform more effective data analysis strategies.