Uday Kiran Reddy Tadipatri

LG
h-index: 19
Papers: 4
Citations: 5
Novelty: 50%
AI Score: 39

4 Papers

64.1 SY Apr 1
BLISS: Global Blind Identification of Linear Systems with Sparse Inputs

Kyle Poe, Uday Kiran Reddy Tadipatri, Benjamin D. Haeffele et al.

Linear system identification and sparse dictionary learning can both be seen as structured matrix factorization problems. However, these two problems have historically been studied in isolation by the systems theory and machine learning communities. Although linear system identification enjoys a mature theory when inputs are known, blind linear system identification remains poorly understood beyond restrictive settings. In contrast, complete sparse dictionary learning has recently benefited from strong global identifiability results and scalable nonconvex algorithms. In this work, we bridge these two areas by showing that under a sparse input assumption, fully observed blind system identification becomes a generalization of complete dictionary learning. This connection allows us to develop global identifiability guarantees for blind system identification, by leveraging techniques from the complete dictionary learning literature. We further show empirically that a principled application of the alternating direction method of multipliers can globally recover the ground-truth system from a single trajectory, provided sufficient samples and input sparsity.
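The connection to dictionary learning described in this abstract can be illustrated with a short derivation. This is only a sketch under assumed notation: the state-space model, the symbols $A$, $B$, $x_t$, $u_t$, and the stacked matrices below are not taken from the paper, whose exact setting may differ.

```latex
% Assumed model: fully observed linear system with unknown sparse inputs u_t
x_{t+1} = A x_t + B u_t, \qquad t = 0, \dots, T-1 .

% Stack the single trajectory column-wise (assumed notation):
X_+ = [\, x_1 \;\cdots\; x_T \,], \quad
X   = [\, x_0 \;\cdots\; x_{T-1} \,], \quad
U   = [\, u_0 \;\cdots\; u_{T-1} \,],

% so that the data satisfy a structured matrix factorization
X_+ = A X + B U, \qquad \text{with } A,\ B,\ U \text{ unknown and } U \text{ sparse}.

% Special case A = 0 with B square and invertible:
X_+ = B U ,
% which is complete dictionary learning with dictionary B and sparse codes U.
```

In this reading, the blind identification problem reduces to complete dictionary learning when the dynamics vanish, which is one way to see why it can be called a generalization.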

20.8 LG Apr 19
Recovery Guarantees for Continual Learning of Dependent Tasks: Memory, Data-Dependent Regularization, and Data-Dependent Weights

Liangzu Peng, Uday Kiran Reddy Tadipatri, Ziqing Xu et al.

Continual learning (CL) is concerned with learning multiple tasks sequentially without forgetting previously learned tasks. Despite substantial empirical advances over recent years, the theoretical development of CL remains in its infancy. At the heart of developing CL theory lies the challenge that the data distribution varies across tasks, and we argue that properly addressing this challenge requires understanding this variation, namely the dependency among tasks. To explicitly model task dependency, we consider nonlinear regression tasks and propose the assumption that these tasks are dependent in such a way that the data of the current task is a nonlinear transformation of previous data. With this model and under natural assumptions, we prove statistical recovery guarantees (more specifically, bounds on estimation errors) for several CL paradigms in practical use, including experience replay with data-independent regularization and data-independent weights that balance the losses of tasks, replay with data-dependent weights, and continual learning with data-dependent regularization (e.g., knowledge distillation). To the best of our knowledge, our bounds are informative in cases where prior work gives vacuous bounds.

LG Nov 5, 2024
A Convex Relaxation Approach to Generalization Analysis for Parallel Positively Homogeneous Networks

Uday Kiran Reddy Tadipatri, Benjamin D. Haeffele, Joshua Agterberg et al.

We propose a general framework for deriving generalization bounds for parallel positively homogeneous neural networks, a class of neural networks whose input-output map decomposes as the sum of positively homogeneous maps. Examples of such networks include matrix factorization and sensing, single-layer multi-head attention mechanisms, tensor factorization, deep linear and ReLU networks, and more. Our general framework is based on linking the non-convex empirical risk minimization (ERM) problem to a closely related convex optimization problem over prediction functions, which provides a global, achievable lower bound to the ERM problem. We exploit this convex lower bound to perform generalization analysis in the convex space while controlling the discrepancy between the convex model and its non-convex counterpart. We apply our general framework to a wide variety of models ranging from low-rank matrix sensing to structured matrix sensing, two-layer linear networks, two-layer ReLU networks, and single-layer multi-head attention mechanisms, achieving generalization bounds with a sample complexity that scales almost linearly with the network width.
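To make the notion of a parallel positively homogeneous network concrete, here is a minimal sketch for a two-layer ReLU network, written as a sum of single-neuron branches. The function names (`branch`, `network`, `scale`) and the chosen sizes are hypothetical and only for illustration; the check is that scaling all parameters by a nonnegative factor `alpha` scales the output by `alpha**2`.

```python
import random

def relu(z):
    return max(0.0, z)

def branch(x, w_in, w_out):
    """One parallel branch: a single hidden ReLU unit times its output weight."""
    pre_activation = sum(wi * xi for wi, xi in zip(w_in, x))
    return w_out * relu(pre_activation)

def network(x, branches):
    """Network output as the sum of its parallel branches.

    `branches` is a list of (w_in, w_out) pairs, one per hidden unit.
    """
    return sum(branch(x, w_in, w_out) for w_in, w_out in branches)

def scale(branches, alpha):
    """Multiply every parameter of every branch by alpha (alpha >= 0)."""
    return [([alpha * w for w in w_in], alpha * w_out) for w_in, w_out in branches]

random.seed(0)
dim, width, alpha = 3, 4, 2.5  # illustrative sizes, not from the paper
x = [random.gauss(0, 1) for _ in range(dim)]
branches = [
    ([random.gauss(0, 1) for _ in range(dim)], random.gauss(0, 1))
    for _ in range(width)
]

base = network(x, branches)
scaled = network(x, scale(branches, alpha))

# Each branch is positively homogeneous of degree 2 in its parameters,
# so the two quantities below should agree up to floating-point error.
homogeneity_gap = abs(scaled - alpha ** 2 * base)
print(homogeneity_gap)
```

The degree of homogeneity is 2 here because each branch multiplies one input-layer weight vector by one output weight; other architectures in the class may have a different degree.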

SY Apr 26, 2025
Nonconvex Linear System Identification with Minimal State Representation

Uday Kiran Reddy Tadipatri, Benjamin D. Haeffele, Joshua Agterberg et al.

Low-order linear System IDentification (SysID) addresses the challenge of estimating the parameters of a linear dynamical system from finite samples of observations and control inputs with minimal state representation. Traditional approaches often utilize Hankel-rank minimization, which relies on convex relaxations that can require numerous, costly singular value decompositions (SVDs) to optimize. In this work, we propose two nonconvex reformulations to tackle low-order SysID: (i) Burer-Monteiro (BM) factorization of the Hankel matrix for efficient nuclear norm minimization, and (ii) optimizing directly over system parameters for real, diagonalizable systems with an atomic-norm-style decomposition. These reformulations circumvent the need for repeated heavy SVD computations, significantly improving computational efficiency. Moreover, we prove that optimizing directly over the system parameters yields lower statistical error rates and lower sample complexities that do not scale linearly with trajectory length, unlike in Hankel nuclear norm minimization. Additionally, while our proposed formulations are nonconvex, we provide theoretical guarantees of achieving global optimality in polynomial time. Finally, we demonstrate algorithms that solve these nonconvex programs and validate our theoretical claims on synthetic data.
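The role of the Burer-Monteiro factorization mentioned in this abstract can be sketched with the standard variational form of the nuclear norm. The notation below ($H$, $G_k$, $U$, $V$, $r$, and the state-space matrices $A$, $B$, $C$) is assumed for illustration and is not taken from the paper.

```latex
% Hankel matrix built from the Markov parameters of the system (assumed notation)
H_{ij} = G_{i+j-1}, \qquad G_k = C A^{k-1} B .

% Variational characterization of the nuclear norm, valid for any
% inner factor dimension r \ge \operatorname{rank}(H):
\|H\|_* \;=\; \min_{U,\,V \,:\, H = U V^\top}
\tfrac{1}{2}\left( \|U\|_F^2 + \|V\|_F^2 \right),
\qquad U \in \mathbb{R}^{m \times r},\; V \in \mathbb{R}^{n \times r}.
```

Optimizing over the factors $U$ and $V$ instead of $H$ lets a first-order method avoid computing an SVD at every iteration, at the price of a nonconvex objective.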