Shivam Gupta

h-index65

15papers

148citations

Novelty45%

AI Score38

Ranked #85,849 of 194,257 authors (top 44%)#19,041 in LG (top 47%)

15 Papers

5.1STJun 6, 2022

Finite-Sample Maximum Likelihood Estimation of Location

Shivam Gupta, Jasper C. H. Lee, Eric Price et al.

We consider 1-dimensional location estimation, where we estimate a parameter $λ$ from $n$ samples $λ+ η_i$, with each $η_i$ drawn i.i.d. from a known distribution $f$. For fixed $f$ the maximum-likelihood estimate (MLE) is well-known to be optimal in the limit as $n \to \infty$: it is asymptotically normal with variance matching the Cramér-Rao lower bound of $\frac{1}{n\mathcal{I}}$, where $\mathcal{I}$ is the Fisher information of $f$. However, this bound does not hold for finite $n$, or when $f$ varies with $n$. We show for arbitrary $f$ and $n$ that one can recover a similar theory based on the Fisher information of a smoothed version of $f$, where the smoothing radius decays with $n$.

16.5LGNov 23, 2023

Improved Sample Complexity Bounds for Diffusion Model Training

Shivam Gupta, Aditya Parulekar, Eric Price et al.

Diffusion models have become the most popular approach to deep generative modeling of images, largely due to their empirical performance and reliability. From a theoretical standpoint, a number of recent works have studied the iteration complexity of sampling, assuming access to an accurate diffusion model. In this work, we focus on understanding the sample complexity of training such a model; how many samples are needed to learn an accurate diffusion model using a sufficiently expressive neural network? Prior work showed bounds polynomial in the dimension, desired Total Variation error, and Wasserstein error. We show an exponential improvement in the dependence on Wasserstein error and depth, along with improved dependencies on other relevant parameters.

3.3STJun 28, 2023

Finite-Sample Symmetric Mean Estimation with Fisher Information Rate

Shivam Gupta, Jasper C. H. Lee, Eric Price

The mean of an unknown variance-$σ^2$ distribution $f$ can be estimated from $n$ samples with variance $\frac{σ^2}{n}$ and nearly corresponding subgaussian rate. When $f$ is known up to translation, this can be improved asymptotically to $\frac{1}{n\mathcal I}$, where $\mathcal I$ is the Fisher information of the distribution. Such an improvement is not possible for general unknown $f$, but [Stone, 1975] showed that this asymptotic convergence $\textit{is}$ possible if $f$ is $\textit{symmetric}$ about its mean. Stone's bound is asymptotic, however: the $n$ required for convergence depends in an unspecified way on the distribution $f$ and failure probability $δ$. In this paper we give finite-sample guarantees for symmetric mean estimation in terms of Fisher information. For every $f, n, δ$ with $n > \log \frac{1}δ$, we get convergence close to a subgaussian with variance $\frac{1}{n \mathcal I_r}$, where $\mathcal I_r$ is the $r$-$\textit{smoothed}$ Fisher information with smoothing radius $r$ that decays polynomially in $n$. Such a bound essentially matches the finite-sample guarantees in the known-$f$ setting.

5.1STFeb 5, 2023

High-dimensional Location Estimation via Norm Concentration for Subgamma Vectors

Shivam Gupta, Jasper C. H. Lee, Eric Price

In location estimation, we are given $n$ samples from a known distribution $f$ shifted by an unknown translation $λ$, and want to estimate $λ$ as precisely as possible. Asymptotically, the maximum likelihood estimate achieves the Cramér-Rao bound of error $\mathcal N(0, \frac{1}{n\mathcal I})$, where $\mathcal I$ is the Fisher information of $f$. However, the $n$ required for convergence depends on $f$, and may be arbitrarily large. We build on the theory using \emph{smoothed} estimators to bound the error for finite $n$ in terms of $\mathcal I_r$, the Fisher information of the $r$-smoothed distribution. As $n \to \infty$, $r \to 0$ at an explicit rate and this converges to the Cramér-Rao bound. We (1) improve the prior work for 1-dimensional $f$ to converge for constant failure probability in addition to high probability, and (2) extend the theory to high-dimensional distributions. In the process, we prove a new bound on the norm of a high-dimensional random variable whose 1-dimensional projections are subgamma, which may be of independent interest.

2.1MLJun 21, 2022

Sharp Constants in Uniformity Testing via the Huber Statistic

Shivam Gupta, Eric Price

Uniformity testing is one of the most well-studied problems in property testing, with many known test statistics, including ones based on counting collisions, singletons, and the empirical TV distance. It is known that the optimal sample complexity to distinguish the uniform distribution on $m$ elements from any $ε$-far distribution with $1-δ$ probability is $n = Θ\left(\frac{\sqrt{m \log (1/δ)}}{ε^2} + \frac{\log (1/δ)}{ε^2}\right)$, which is achieved by the empirical TV tester. Yet in simulation, these theoretical analyses are misleading: in many cases, they do not correctly rank order the performance of existing testers, even in an asymptotic regime of all parameters tending to $0$ or $\infty$. We explain this discrepancy by studying the \emph{constant factors} required by the algorithms. We show that the collisions tester achieves a sharp maximal constant in the number of standard deviations of separation between uniform and non-uniform inputs. We then introduce a new tester based on the Huber loss, and show that it not only matches this separation, but also has tails corresponding to a Gaussian with this separation. This leads to a sample complexity of $(1 + o(1))\frac{\sqrt{m \log (1/δ)}}{ε^2}$ in the regime where this term is dominant, unlike all other existing testers.

4.6LGJul 5, 2024Code

Fair Federated Data Clustering through Personalization: Bridging the Gap between Diverse Data Distributions

Shivam Gupta, Tarushi, Tsering Wangzes et al.

The rapid growth of data from edge devices has catalyzed the performance of machine learning algorithms. However, the data generated resides at client devices thus there are majorly two challenge faced by traditional machine learning paradigms - centralization of data for training and secondly for most the generated data the class labels are missing and there is very poor incentives to clients to manually label their data owing to high cost and lack of expertise. To overcome these issues, there have been initial attempts to handle unlabelled data in a privacy preserving distributed manner using unsupervised federated data clustering. The goal is partition the data available on clients into $k$ partitions (called clusters) without actual exchange of data. Most of the existing algorithms are highly dependent on data distribution patterns across clients or are computationally expensive. Furthermore, due to presence of skewed nature of data across clients in most of practical scenarios existing models might result in clients suffering high clustering cost making them reluctant to participate in federated process. To this, we are first to introduce the idea of personalization in federated clustering. The goal is achieve balance between achieving lower clustering cost and at same time achieving uniform cost across clients. We propose p-FClus that addresses these goal in a single round of communication between server and clients. We validate the efficacy of p-FClus against variety of federated datasets showcasing it's data independence nature, applicability to any finite $\ell$-norm, while simultaneously achieving lower cost and variance.

14.4LGOct 30, 2025

Posterior Sampling by Combining Diffusion Models with Annealed Langevin Dynamics

Zhiyang Xun, Shivam Gupta, Eric Price

Given a noisy linear measurement $y = Ax + ξ$ of a distribution $p(x)$, and a good approximation to the prior $p(x)$, when can we sample from the posterior $p(x \mid y)$? Posterior sampling provides an accurate and fair framework for tasks such as inpainting, deblurring, and MRI reconstruction, and several heuristics attempt to approximate it. Unfortunately, approximate posterior sampling is computationally intractable in general. To sidestep this hardness, we focus on (local or global) log-concave distributions $p(x)$. In this regime, Langevin dynamics yields posterior samples when the exact scores of $p(x)$ are available, but it is brittle to score--estimation error, requiring an MGF bound (sub-exponential error). By contrast, in the unconditional setting, diffusion models succeed with only an $L^2$ bound on the score error. We prove that combining diffusion models with an annealed variant of Langevin dynamics achieves conditional sampling in polynomial time using merely an $L^4$ bound on the score error.

20.3LGFeb 20, 2024

Diffusion Posterior Sampling is Computationally Intractable

Shivam Gupta, Ajil Jalal, Aditya Parulekar et al.

Diffusion models are a remarkably effective way of learning and sampling from a distribution $p(x)$. In posterior sampling, one is also given a measurement model $p(y \mid x)$ and a measurement $y$, and would like to sample from $p(x \mid y)$. Posterior sampling is useful for tasks such as inpainting, super-resolution, and MRI reconstruction, so a number of recent works have given algorithms to heuristically approximate it; but none are known to converge to the correct distribution in polynomial time. In this paper we show that posterior sampling is computationally intractable: under the most basic assumption in cryptography -- that one-way functions exist -- there are instances for which every algorithm takes superpolynomial time, even though unconditional sampling is provably fast. We also show that the exponential-time rejection sampling algorithm is essentially optimal under the stronger plausible assumption that there are one-way functions that take exponential time to invert.

22.4LGJun 3, 2024

Faster Diffusion Sampling with Randomized Midpoints: Sequential and Parallel

Shivam Gupta, Linda Cai, Sitan Chen

Sampling algorithms play an important role in controlling the quality and runtime of diffusion model inference. In recent years, a number of works~\cite{chen2023sampling,chen2023ode,benton2023error,lee2022convergence} have proposed schemes for diffusion sampling with provable guarantees; these works show that for essentially any data distribution, one can approximately sample in polynomial time given a sufficiently accurate estimate of its score functions at different noise levels. In this work, we propose a new scheme inspired by Shen and Lee's randomized midpoint method for log-concave sampling~\cite{ShenL19}. We prove that this approach achieves the best known dimension dependence for sampling from arbitrary smooth distributions in total variation distance ($\widetilde O(d^{5/12})$ compared to $\widetilde O(\sqrt{d})$ from prior work). We also show that our algorithm can be parallelized to run in only $\widetilde O(\log^2 d)$ parallel rounds, constituting the first provable guarantees for parallel sampling with diffusion models. As a byproduct of our methods, for the well-studied problem of log-concave sampling in total variation distance, we give an algorithm and simple analysis achieving dimension dependence $\widetilde O(d^{5/12})$ compared to $\widetilde O(\sqrt{d})$ from prior work.

4.5AIFeb 6, 2022

An Empirical Analysis of AI Contributions to Sustainable Cities (SDG11)

Shivam Gupta, Auriol Degbelo

Artificial Intelligence (AI) presents opportunities to develop tools and techniques for addressing some of the major global challenges and deliver solutions with significant social and economic impacts. The application of AI has far-reaching implications for the 17 Sustainable Development Goals (SDGs) in general, and sustainable urban development in particular. However, existing attempts to understand and use the opportunities offered by AI for SDG 11 have been explored sparsely, and the shortage of empirical evidence about the practical application of AI remains. In this chapter, we analyze the contribution of AI to support the progress of SDG 11 (Sustainable Cities and Communities). We address the knowledge gap by empirically analyzing the AI systems (N = 29) from the AIxSDG database and the Community Research and Development Information Service (CORDIS) database. Our analysis revealed that AI systems have indeed contributed to advancing sustainable cities in several ways (e.g., waste management, air quality monitoring, disaster response management, transportation management), but many projects are still working for citizens and not with them. This snapshot of AI's impact on SDG11 is inherently partial, yet useful to advance our understanding as we move towards more mature systems and research on the impact of AI systems for social good.

2.9CRJan 16, 2022

Improving Privacy and Security in Unmanned Aerial Vehicles Network using Blockchain

Hardik Sachdeva, Shivam Gupta, Anushka Misra et al.

Unmanned Aerial Vehicles (UAVs), also known as drones, have exploded in every segment present in todays business industry. They have scope in reinventing old businesses, and they are even developing new opportunities for various brands and franchisors. UAVs are used in the supply chain, maintaining surveillance and serving as mobile hotspots. Although UAVs have potential applications, they bring several societal concerns and challenges that need addressing in public safety, privacy, and cyber security. UAVs are prone to various cyber-attacks and vulnerabilities; they can also be hacked and misused by malicious entities resulting in cyber-crime. The adversaries can exploit these vulnerabilities, leading to data loss, property, and destruction of life. One can partially detect the attacks like false information dissemination, jamming, gray hole, blackhole, and GPS spoofing by monitoring the UAV behavior, but it may not resolve privacy issues. This paper presents secure communication between UAVs using blockchain technology. Our approach involves building smart contracts and making a secure and reliable UAV adhoc network. This network will be resilient to various network attacks and is secure against malicious intrusions.

1.6LGDec 26, 2021

Recent Trends in Artificial Intelligence-inspired Electronic Thermal Management

Aviral Chharia, Nishi Mehta, Shivam Gupta et al.

The rise of computation-based methods in thermal management has gained immense attention in recent years due to the ability of deep learning to solve complex 'physics' problems, which are otherwise difficult to be approached using conventional techniques. Thermal management is required in electronic systems to keep them from overheating and burning, enhancing their efficiency and lifespan. For a long time, numerical techniques have been employed to aid in the thermal management of electronics. However, they come with some limitations. To increase the effectiveness of traditional numerical approaches and address the drawbacks faced in conventional approaches, researchers have looked at using artificial intelligence at various stages of the thermal management process. The present study discusses in detail, the current uses of deep learning in the domain of 'electronic' thermal management.

3.1LGNov 5, 2021

EpilNet: A Novel Approach to IoT based Epileptic Seizure Prediction and Diagnosis System using Artificial Intelligence

Shivam Gupta, Virender Ranga, Priyansh Agrawal

Epilepsy is one of the most occurring neurological diseases. The main characteristic of this disease is a frequent seizure, which is an electrical imbalance in the brain. It is generally accompanied by shaking of body parts and even leads (fainting). In the past few years, many treatments have come up. These mainly involve the use of anti-seizure drugs for controlling seizures. But in 70% of cases, these drugs are not effective, and surgery is the only solution when the condition worsens. So patients need to take care of themselves while having a seizure and be safe. Wearable electroencephalogram (EEG) devices have come up with the development in medical science and technology. These devices help in the analysis of brain electrical activities. EEG helps in locating the affected cortical region. The most important is that it can predict any seizure in advance on-site. This has resulted in a sudden increase in demand for effective and efficient seizure prediction and diagnosis systems. A novel approach to epileptic seizure prediction and diagnosis system EpilNet is proposed in the present paper. It is a one-dimensional (1D) convolution neural network. EpilNet gives the testing accuracy of 79.13% for five classes, leading to a significant increase of about 6-7% compared to related works. The developed Web API helps in bringing EpilNet into practical use. Thus, it is an integrated system for both patients and doctors. The system will help patients prevent injury or accidents and increase the efficiency of the treatment process by doctors in the hospitals.

11.9LGSep 23, 2021Code

Outlier-Robust Sparse Estimation via Non-Convex Optimization

Yu Cheng, Ilias Diakonikolas, Rong Ge et al.

We explore the connection between outlier-robust high-dimensional statistics and non-convex optimization in the presence of sparsity constraints, with a focus on the fundamental tasks of robust sparse mean estimation and robust sparse PCA. We develop novel and simple optimization formulations for these problems such that any approximate stationary point of the associated optimization problem yields a near-optimal solution for the underlying robust estimation task. As a corollary, we obtain that any first-order method that efficiently converges to stationarity yields an efficient algorithm for these tasks. The obtained algorithms are simple, practical, and succeed under broader distributional assumptions compared to prior work.

1.2CYJul 30, 2020

AI-based Monitoring and Response System for Hospital Preparedness towards COVID-19 in Southeast Asia

Tushar Goswamy, Naishadh Parmar, Ayush Gupta et al.

This research paper proposes a COVID-19 monitoring and response system to identify the surge in the volume of patients at hospitals and shortage of critical equipment like ventilators in South-east Asian countries, to understand the burden on health facilities. This can help authorities in these regions with resource planning measures to redirect resources to the regions identified by the model. Due to the lack of publicly available data on the influx of patients in hospitals, or the shortage of equipment, ICU units or hospital beds that regions in these countries might be facing, we leverage Twitter data for gleaning this information. The approach has yielded accurate results for states in India, and we are working on validating the model for the remaining countries so that it can serve as a reliable tool for authorities to monitor the burden on hospitals.