Shafayeth Jamil

LG
4 papers
2 citations
Novelty 68%
AI Score 51

4 Papers

21.5 LG Mar 28
Interpretable Physics Extraction from Data for Linear Dynamical Systems using Lie Generator Networks

Shafayeth Jamil, Rehan Kapadia

When the system is linear, why should learning be nonlinear? Linear dynamical systems, the analytical backbone of control theory, signal processing, and circuit analysis, have exact closed-form solutions via the state transition matrix. Yet when system parameters must be inferred from data, recent neural approaches offer flexibility at the cost of physical guarantees: Neural ODEs provide flexible trajectory approximation but may violate physical invariants, while energy-preserving architectures do not natively represent the dissipation essential to real-world systems. We introduce Lie Generator Networks (LGN), which learn a structured generator A and compute trajectories directly via matrix exponentiation. This shift from integration to exponentiation preserves structure by construction. By parameterizing A = S - D (skew-symmetric minus positive diagonal), stability and dissipation emerge from the architecture itself rather than being imposed during training via the loss function. LGN provides a unified framework for linear conservative, dissipative, and time-varying systems. On a 100-dimensional stable RLC ladder, standard derivative-based least-squares system identification can yield unstable eigenvalues. The unconstrained LGN yields stable but physically incorrect spectra, whereas LGN-SD recovers all 100 eigenvalues with over two orders of magnitude lower mean eigenvalue error than unconstrained alternatives. Critically, these eigenvalues reveal poles, natural frequencies, and damping ratios: interpretable physics that black-box networks do not provide.
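The A = S - D parameterization described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names `make_generator` and `propagate`, the use of softplus to keep the diagonal positive, and the eigendecomposition-based matrix exponential (which assumes A is diagonalizable) are all assumptions made for illustration.

```python
import numpy as np

def make_generator(raw_coupling, raw_diag):
    """Build A = S - D from unconstrained parameters (hypothetical helper).

    raw_coupling: (n, n) array; only its antisymmetric part is used.
    raw_diag:     (n,) array; mapped through softplus so D is strictly positive.
    """
    S = raw_coupling - raw_coupling.T      # skew-symmetric: S.T == -S
    d = np.log1p(np.exp(raw_diag))         # softplus (an assumed choice), always > 0
    return S - np.diag(d)

def propagate(A, x0, t):
    """Evaluate x(t) = exp(A t) x0 via eigendecomposition.

    Assumes A is diagonalizable; a production version would use a
    dedicated matrix-exponential routine instead.
    """
    lam, V = np.linalg.eig(A)
    coeffs = np.linalg.solve(V, x0.astype(complex))
    return (V @ (np.exp(lam * t) * coeffs)).real

rng = np.random.default_rng(0)
n = 4
A = make_generator(rng.normal(size=(n, n)), rng.normal(size=n))
eigvals = np.linalg.eigvals(A)

x0 = rng.normal(size=n)
x1 = propagate(A, x0, 1.0)

# Every eigenvalue lies in the open left half-plane, and the state norm decays.
print(eigvals.real.max() < 0, np.linalg.norm(x1) < np.linalg.norm(x0))
```

Because the constraint lives in how A is constructed, any values of the raw parameters give a stable generator, so no stability penalty is needed in a training loss.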

54.2 SY May 14
Lie Generator Networks Extract EIS-Grade Battery Diagnostics from Pulse Relaxation Data

Shafayeth Jamil, Rehan Kapadia

Electrochemical impedance spectroscopy (EIS) is the most informative diagnostic for lithium-ion batteries: its frequency-resolved spectra decompose cell behavior into distinct electrochemical processes, revealing mechanism-specific degradation invisible to voltage and resistance measurements. Yet EIS requires dedicated hardware and minutes-long acquisitions incompatible with field deployment. Here we show that Lie Generator Networks (LGN), a structure-preserving identification framework, extract electrochemical time constants from 60 seconds of post-pulse voltage relaxation, data that battery management systems already collect, and that these time constants encode the same diagnostic and prognostic information as impedance spectra. LGN learns the generator matrix of the relaxation dynamics with stability guaranteed by architecture, yielding time constants precise enough to resolve electrochemical variation that conventional curve fitting cannot detect from identical data. Across five datasets totaling over 850 cells, four institutions, and multiple chemistries, LGN tracks degradation with near-perfect rank correlation ($|\rho_s| = 0.999$), enables cross-validated reconstruction of full Nyquist spectra at 2% median error across 227 cells, predicts which capacity-matched cells fail first from three early diagnostics, and recovers Arrhenius activation energies with zero physics priors and without retraining or cell-specific tuning. LGN requires no training data, no impedance hardware, and no chemistry-specific calibration, converting any existing relaxation pulse into an impedance-grade diagnostic. This enables real-time health monitoring, rapid second-life grading, production-line quality control, and physics-informed prognosis from minutes of measurement.
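For a stable linear generator, each eigenvalue $\lambda_i$ corresponds to a relaxation time constant $\tau_i = -1/\mathrm{Re}(\lambda_i)$. The sketch below shows only that generic conversion; the function name `time_constants` and the 2x2 example matrix are hypothetical, not values from the paper.

```python
import numpy as np

def time_constants(A):
    """Return relaxation time constants tau_i = -1 / Re(lambda_i), sorted ascending.

    Assumes every eigenvalue of A has a strictly negative real part.
    """
    lam = np.linalg.eigvals(A)
    return np.sort(-1.0 / lam.real)

# Hypothetical generator with decay rates 0.5 1/s and 0.05 1/s.
A = np.array([[-0.5, 0.0],
              [0.0, -0.05]])
taus = time_constants(A)
print(taus)  # time constants of 2 s and 20 s
```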

20.3 LG May 12
The Routing and Filtering Structure of Attention

Shafayeth Jamil, Rehan Kapadia

The attention interaction matrix $QK^{\top}$ contains two entangled computations: a skew-symmetric component that redistributes information between positions (routing) and a symmetric component that scales mutual relevance (filtering). We decompose 1776 heads across five pretrained transformers and find routing operating at low rank, well below the routing capacity allocated by the weight kernel. We introduce $S$-$D$ attention as a diagnostic parameterization that disentangles routing from filtering by construction, has guaranteed stability ($\mathrm{Re}(\lambda) \le 0$), and trains stably without layer normalization. When disentangled and unnormalized, routing self-organizes into a spectral cascade: effective rank $2$ at the first layer, expanding with depth, across six model scales from 7M to 355M parameters. The cascade predicts where attention can be simplified: linearizing the first seven layers of 125M $S$-$D$ attention costs ${<}5\%$ perplexity, whereas standard attention collapses under the same intervention. The linearizable region widens with depth. Replacing the first four layers with ELU+1 linear attention reaches within $1.4\%$ of baseline at full head dimension. Cascade-allocated architectures trade attention parameters for perplexity ($47\%$ to $65\%$ fewer attention parameters at $+3.9\%$ to $+8.4\%$ PPL). The routing-filtering decomposition makes the spectral budget legible; the cascade makes it actionable.
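The symmetric/skew-symmetric split mentioned above is the standard decomposition $M = \tfrac{1}{2}(M + M^{\top}) + \tfrac{1}{2}(M - M^{\top})$ of a square matrix. The sketch below applies it to a per-head weight kernel $W_Q W_K^{\top}$ with random weights. Treating the weight kernel as the object being decomposed, the function names, and the entropy-based definition of effective rank are assumptions for illustration; the abstract does not specify them.

```python
import numpy as np

def split_interaction(W_q, W_k):
    """Split M = W_q @ W_k.T into symmetric (filtering) and skew (routing) parts."""
    M = W_q @ W_k.T
    sym = 0.5 * (M + M.T)
    skew = 0.5 * (M - M.T)
    return sym, skew

def effective_rank(X, eps=1e-12):
    """Entropy-based effective rank: exp of the Shannon entropy of the
    normalized singular values (one common definition, assumed here)."""
    s = np.linalg.svd(X, compute_uv=False)
    p = s / (s.sum() + eps)
    p = p[p > eps]
    return float(np.exp(-(p * np.log(p)).sum()))

rng = np.random.default_rng(0)
d_model, d_head = 16, 4                      # hypothetical sizes
W_q = rng.normal(size=(d_model, d_head))
W_k = rng.normal(size=(d_model, d_head))

sym, skew = split_interaction(W_q, W_k)
print(effective_rank(sym), effective_rank(skew))
```

Since $W_Q W_K^{\top}$ has rank at most `d_head`, each part has rank at most `2 * d_head`, which is one way to read the "routing capacity allocated by the weight kernel".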

40.8 LG Apr 1
Lie Generator Networks for Nonlinear Partial Differential Equations

Shafayeth Jamil, Rehan Kapadia

Linear dynamical systems are fully characterized by their eigenspectra, accessible directly from the generator of the dynamics. For nonlinear systems governed by partial differential equations, no equivalent theory exists. We introduce Lie Generator Network-Koopman (LGN-KM), a neural operator that lifts nonlinear dynamics into a linear latent space and learns the continuous-time Koopman generator ($L_k$) through a decomposition $L_k = S - D_k$, where $S$ is skew-symmetric, representing conservative inter-modal coupling, and $D_k$ is a positive-definite diagonal matrix encoding modal dissipation. This architectural decomposition enforces stability and enables interpretability through direct spectral access to the learned dynamics. On two-dimensional Navier--Stokes turbulence, the generator recovers the known dissipation scaling and a complete multi-branch dispersion relation from trajectory data alone, with no physics supervision. Independently trained models at different flow regimes recover matched gauge-invariant spectral structure, exposing a gauge freedom in the Koopman lifting. Because the generator is provably stable, it enables guaranteed long-horizon stability, continuous-time evaluation at arbitrary times, and physics-informed cross-viscosity model transfer.
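The stability claim for $L_k = S - D_k$ follows from a short standard argument, sketched here under the assumption that $S$ is real skew-symmetric and $D_k$ is real diagonal with strictly positive entries $d_i$ (the symbols $v$, $\lambda$, and $d_i$ are introduced for this sketch only):

```latex
% Take any eigenpair of the generator.
L_k v = \lambda v, \qquad v \neq 0
\;\;\Longrightarrow\;\;
\lambda \, v^{*} v = v^{*} S v - v^{*} D_k v .

% S real skew-symmetric: (v^{*} S v)^{*} = v^{*} S^{\top} v = -\, v^{*} S v,
% so v^{*} S v is purely imaginary and contributes nothing to Re(lambda).
% D_k positive diagonal: v^{*} D_k v = \sum_i d_i \lvert v_i \rvert^{2} > 0 .

\operatorname{Re}(\lambda) \;=\; -\,\frac{v^{*} D_k v}{v^{*} v} \;<\; 0 .
```

Every eigenvalue therefore lies in the open left half-plane regardless of the learned parameter values, and $\tfrac{d}{dt}\lVert z \rVert^{2} = -2\, z^{\top} D_k z \le 0$ for the latent state $z$, so trajectories of the latent linear system cannot grow.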