Jaidev Gill

SY
h-index 18
6 papers
17 citations
Novelty 53%
AI Score 45

6 Papers

LG Jun 13, 2023
Symmetric Neural-Collapse Representations with Supervised Contrastive Loss: The Impact of ReLU and Batching

Ganesh Ramachandra Kini, Vala Vakilian, Tina Behnia et al.

Supervised contrastive loss (SCL) is a competitive and often superior alternative to the cross-entropy loss for classification. While prior studies have demonstrated that both losses yield symmetric training representations under balanced data, this symmetry breaks under class imbalances. This paper presents an intriguing discovery: the introduction of a ReLU activation at the final layer effectively restores the symmetry in SCL-learned representations. We arrive at this finding analytically, by establishing that the global minimizers of an unconstrained features model with SCL loss and entry-wise non-negativity constraints form an orthogonal frame. Extensive experiments conducted across various datasets, architectures, and imbalance scenarios corroborate our finding. Importantly, our experiments reveal that the inclusion of the ReLU activation restores symmetry without compromising test accuracy. This constitutes the first geometric characterization of SCL under imbalances. Additionally, our analysis and experiments underscore the pivotal role of batch selection strategies in representation geometry. By proving necessary and sufficient conditions for mini-batch choices that ensure invariant symmetric representations, we introduce batch-binding as an efficient strategy that guarantees these conditions hold.
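As a rough illustration of the quantities this abstract refers to, the sketch below evaluates a supervised contrastive loss on a tiny imbalanced batch after a final ReLU and L2 normalization. It uses the commonly cited form of SCL (average over positives of the log-softmax of temperature-scaled similarities); the function names, the temperature `tau=0.1`, and the toy embeddings are assumptions made for illustration, not the paper's code or experimental setup.

```python
import math

def relu_normalize(v):
    """Apply an entry-wise ReLU, then L2-normalize (zero vectors are left as-is)."""
    r = [max(0.0, x) for x in v]
    norm = math.sqrt(sum(x * x for x in r))
    return [x / norm for x in r] if norm > 0 else r

def scl_loss(embeddings, labels, tau=0.1):
    """Supervised contrastive loss averaged over anchors that have a positive."""
    n = len(embeddings)
    total, anchors = 0.0, 0
    for i in range(n):
        positives = [p for p in range(n) if p != i and labels[p] == labels[i]]
        if not positives:
            continue  # an anchor with no same-class partner contributes nothing
        sims = [sum(a * b for a, b in zip(embeddings[i], embeddings[j])) / tau
                for j in range(n)]
        log_denom = math.log(sum(math.exp(sims[j]) for j in range(n) if j != i))
        total += -sum(sims[p] - log_denom for p in positives) / len(positives)
        anchors += 1
    return total / anchors

# Imbalanced toy batch: four samples of class 0, two of class 1 (hypothetical).
labels = [0, 0, 0, 0, 1, 1]
raw_frame = [[2.0, -1.0]] * 4 + [[-0.5, 3.0]] * 2   # ReLU maps these to orthogonal axes
raw_collapsed = [[1.0, 1.0]] * 6                    # every sample shares one direction

frame = [relu_normalize(v) for v in raw_frame]
collapsed = [relu_normalize(v) for v in raw_collapsed]

loss_frame = scl_loss(frame, labels)
loss_collapsed = scl_loss(collapsed, labels)
print(loss_frame, loss_collapsed)
```

In this toy batch the class embeddings that land on orthogonal axes incur a lower loss than the fully collapsed configuration, which is consistent with, though far weaker than, the global-minimizer statement in the abstract.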

LG Oct 2, 2023
Engineering the Neural Collapse Geometry of Supervised-Contrastive Loss

Jaidev Gill, Vala Vakilian, Christos Thrampoulidis

Supervised-contrastive loss (SCL) is an alternative to cross-entropy (CE) for classification tasks that makes use of similarities in the embedding space to allow for richer representations. In this work, we propose methods to engineer the geometry of these learnt feature embeddings by modifying the contrastive loss. In pursuit of adjusting the geometry, we explore the impact of prototypes: fixed embeddings included during training to alter the final feature geometry. Specifically, through empirical findings, we demonstrate that the inclusion of prototypes in every batch induces the geometry of the learnt embeddings to align with that of the prototypes. We gain further insights by considering a limiting scenario where the number of prototypes far exceeds the original batch size. Through this, we establish a connection to cross-entropy (CE) loss with a fixed classifier and normalized embeddings. We validate our findings by conducting a series of experiments with deep neural networks on benchmark vision datasets.

SY Mar 12
Identifying Network Structure of Nonlinear Dynamical Systems: Contraction and Kuramoto Oscillators

Jaidev Gill, Jing Shuang Li

In this work, we study the identifiability of network structures (i.e., topologies) for networked nonlinear systems when partial measurements of the nodal dynamics are taken. We explore scenarios where different candidate structures can yield similar measurements, thus limiting identifiability. To do so, we apply the contraction theory framework to facilitate comparisons between different networks. We show that semicontraction in the observable space is a sufficient condition for two systems to become indistinguishable from one another based on partial measurements. We apply this framework to study networks of Kuramoto oscillators, and discuss scenarios in which different network structures (both connected and disconnected) become indistinguishable.
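To make the indistinguishability idea concrete, here is a minimal sketch that simulates the standard networked Kuramoto model, dθ_i/dt = ω_i + Σ_j a_ij sin(θ_j − θ_i), with forward Euler on two different topologies and compares the phase measured at a single node. The graphs, frequencies, initial phases, and step size are illustrative assumptions; the sketch does not implement the paper's semicontraction condition.

```python
import math

def kuramoto_step(theta, omega, adj, dt):
    """One forward-Euler step of the networked Kuramoto model."""
    n = len(theta)
    return [
        theta[i] + dt * (omega[i] + sum(adj[i][j] * math.sin(theta[j] - theta[i])
                                        for j in range(n)))
        for i in range(n)
    ]

def observe(theta0, omega, adj, observed, dt=0.01, steps=2000):
    """Simulate and return the final phases of the observed nodes only."""
    theta = list(theta0)
    for _ in range(steps):
        theta = kuramoto_step(theta, omega, adj, dt)
    return [theta[i] for i in observed]

# Two candidate structures on three oscillators (hypothetical example).
path = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
complete = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]

theta0 = [0.0, 0.5, 1.0]      # initial phases within a half circle
omega = [1.0, 1.0, 1.0]       # identical natural frequencies

y_path = observe(theta0, omega, path, observed=[0])
y_complete = observe(theta0, omega, complete, observed=[0])
gap = abs(y_path[0] - y_complete[0])
print(y_path, y_complete, gap)
```

Because the coupling is symmetric, the mean phase evolves identically in both networks, so once each network synchronizes the measured node follows the same trajectory and the two structures can no longer be told apart from that measurement alone.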

SY Mar 12
Identifying Network Structure of Linear Dynamical Systems: Observability and Edge Misclassification

Jaidev Gill, Jing Shuang Li

This work studies the limitations of uniquely identifying the structure (i.e., topology) of a networked linear system from partial measurements of its nodal dynamics. In general, many networks can be consistent with these measurements; this is a consideration often neglected by standard network inference methods. We show that the networks in this space are related through the nullspace of the observability matrix of the true network. We establish relevant metrics to investigate this space, including an analytic characterization of the most structurally dissimilar network that can be inferred, as well as the possibility of mis-inferring the presence or absence of edges. In simulations, we find that when observing over 6% of nodes in random network models (e.g., Erdős-Rényi and Watts-Strogatz), approximately 99% of edges are correctly classified. Extending this discussion, we construct a family of networks that keep measurements ε-close to each other, and connect the identifiability of these networks to the spectral properties of an augmented observability Gramian.
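The claim that many networks can be consistent with the same partial measurements can be illustrated with a small hand-built example. The sketch below assumes a discrete-time linear model x[k+1] = A x[k] with only node 0 observed; the matrices, initial state, and helper names are hypothetical and chosen so that the two hidden nodes enter the observed node only through their sum.

```python
def step(A, x):
    """Advance the linear network one step: x[k+1] = A x[k]."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in A]

def measure(A, x0, observed, steps):
    """Record only the observed nodes along the trajectory."""
    x, out = list(x0), []
    for _ in range(steps):
        out.append([x[i] for i in observed])
        x = step(A, x)
    return out

# "True" network: hidden nodes 1 and 2 share no edge.
A_true = [[0.5, 0.1, 0.1],
          [0.2, 0.5, 0.0],
          [0.2, 0.0, 0.5]]

# Alternative network: an edge between nodes 1 and 2, with self-weights
# lowered so that each row of the hidden block still sums to 0.5.
A_alt = [[0.5, 0.1, 0.1],
         [0.2, 0.3, 0.2],
         [0.2, 0.2, 0.3]]

x0 = [1.0, -2.0, 3.0]
y_true = measure(A_true, x0, observed=[0], steps=20)
y_alt = measure(A_alt, x0, observed=[0], steps=20)

max_gap = max(abs(u[0] - v[0]) for u, v in zip(y_true, y_alt))
print(max_gap)
```

Node 0 depends on the hidden nodes only through x1 + x2, and that sum evolves identically in both networks, so the measurements agree up to rounding even though the edge between nodes 1 and 2 is absent in one network and present in the other.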

SY Mar 26
Firing Rate Neural Network Implementations of Model Predictive Control

Jaidev Gill, Jing Shuang Li

Human and animal brains perform planning to enable complex movements and behaviors. This process can be effectively described using model predictive control (MPC); that is, brains can be thought of as implementing some version of MPC. How is this done? In this work, we translate model predictive controllers into firing rate neural networks, offering insights into the nonlinear neural dynamics that underpin planning. This is done by first applying the projected gradient method to the dual problem, then generating alternative networks through factorization and contraction analysis. This allows us to explore many biologically plausible implementations of MPC. We present a series of numerical simulations to study different neural networks performing MPC to balance an inverted pendulum on a cart (i.e., balancing a stick on a hand). We illustrate that sparse neural networks can effectively implement MPC; this observation aligns with the sparse nature of the brain.
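A minimal sketch of the first step named in this abstract, the projected gradient method applied to a dual problem, is given below for a toy quadratic program: minimize ½‖u − r‖² subject to G u ≤ h. The problem data, step size, and function names are assumptions for illustration, and this is not the paper's MPC formulation. The dual variable is updated by gradient ascent and projected onto the non-negative orthant, and that projection is an entry-wise ReLU, which hints at why such iterations can resemble firing-rate dynamics.

```python
def relu(v):
    """Projection onto the non-negative orthant."""
    return [max(0.0, x) for x in v]

def primal_from_dual(r, G, lam):
    """Minimizer of the Lagrangian for fixed multipliers: u = r - G^T lam."""
    return [r_j - sum(G[i][j] * lam[i] for i in range(len(G)))
            for j, r_j in enumerate(r)]

def dual_projected_gradient(r, G, h, step=0.5, iters=100):
    """Projected gradient ascent on the dual of min 0.5*||u - r||^2 s.t. G u <= h."""
    lam = [0.0] * len(G)
    for _ in range(iters):
        u = primal_from_dual(r, G, lam)
        # The dual gradient is the constraint residual G u - h.
        grad = [sum(g * u_j for g, u_j in zip(row, u)) - h_i
                for row, h_i in zip(G, h)]
        lam = relu([l + step * g for l, g in zip(lam, grad)])
    return primal_from_dual(r, G, lam), lam

# Toy problem (hypothetical): track r while keeping each input at or below 1.
r = [2.0, -0.5]
G = [[1.0, 0.0], [0.0, 1.0]]
h = [1.0, 1.0]

u_opt, lam_opt = dual_projected_gradient(r, G, h)
print(u_opt, lam_opt)
```

Here the first constraint is active, so its multiplier converges to a positive value and the first input is clipped to the bound, while the second constraint is inactive and its multiplier stays at zero.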

IV Feb 6, 2024
Deep Learning-Based Correction and Unmixing of Hyperspectral Images for Brain Tumor Surgery

David Black, Jaidev Gill, Andrew Xie et al.

Hyperspectral Imaging (HSI) for fluorescence-guided brain tumor resection enables visualization of differences between tissues that are not distinguishable to humans. This augmentation can maximize brain tumor resection, improving patient outcomes. However, much of the processing in HSI uses simplified linear methods that are unable to capture the non-linear, wavelength-dependent phenomena that must be modeled for accurate recovery of fluorophore abundances. We therefore propose two deep learning models for correction and unmixing, which can account for the nonlinear effects and produce more accurate estimates of abundances. Both models use an autoencoder-like architecture to process the captured spectra. One is trained with protoporphyrin IX (PpIX) concentration labels. The other undergoes semi-supervised training, first learning hyperspectral unmixing self-supervised and then learning to correct fluorescence emission spectra for heterogeneous optical and geometric properties using a reference white-light reflectance spectrum in a few-shot manner. The models were evaluated against phantom and pig brain data with known PpIX concentration; the supervised model achieved Pearson correlation coefficients (R values) between the known and computed PpIX concentrations of 0.997 and 0.990, respectively, whereas the classical approach achieved only 0.93 and 0.82. The semi-supervised approach's R values were 0.98 and 0.91, respectively. On human data, the semi-supervised model gives qualitatively more realistic results than the classical method, better removing bright spots of specular reflectance and reducing the variance in PpIX abundance over biopsies that should be relatively homogeneous. These results show promise for using deep learning to improve HSI in fluorescence-guided neurosurgery.