Pramod Gupta

LG
h-index117
6papers
3,333citations
Novelty52%
AI Score44

6 Papers

CLJul 7, 2025
Gemini 2.5: Pushing the Frontier with Advanced Reasoning, Multimodality, Long Context, and Next Generation Agentic Capabilities

Gheorghe Comanici, Eric Bieber, Mike Schaekermann et al. · amazon-science, baidu

In this report, we introduce the Gemini 2.X model family: Gemini 2.5 Pro and Gemini 2.5 Flash, as well as our earlier Gemini 2.0 Flash and Flash-Lite models. Gemini 2.5 Pro is our most capable model yet, achieving SoTA performance on frontier coding and reasoning benchmarks. In addition to its incredible coding and reasoning skills, Gemini 2.5 Pro is a thinking model that excels at multimodal understanding and it is now able to process up to 3 hours of video content. Its unique combination of long context, multimodal and reasoning capabilities can be combined to unlock new agentic workflows. Gemini 2.5 Flash provides excellent reasoning abilities at a fraction of the compute and latency requirements and Gemini 2.0 Flash and Flash-Lite provide high performance at low latency and cost. Taken together, the Gemini 2.X model generation spans the full Pareto frontier of model capability vs cost, allowing users to explore the boundaries of what is possible with complex agentic problem solving.

LGOct 15, 2025
An Operational Deep Learning System for Satellite-Based High-Resolution Global Nowcasting

Shreya Agrawal, Mohammed Alewi Hassen, Emmanuel Asiedu Brempong et al.

Precipitation nowcasting, which predicts rainfall up to a few hours ahead, is a critical tool for vulnerable communities in the Global South frequently exposed to intense, rapidly developing storms. Timely forecasts provide a crucial window to protect lives and livelihoods. Traditional numerical weather prediction (NWP) methods suffer from high latency, low spatial and temporal resolution, and significant gaps in accuracy across the world. Recent machine learning-based nowcasting methods, common in the Global North, cannot be extended to the Global South due to extremely sparse radar coverage. We present Global MetNet, an operational global machine learning nowcasting model. It leverages the Global Precipitation Mission's CORRA dataset, geostationary satellite data, and global NWP data to predict precipitation for the next 12 hours. The model operates at a high resolution of approximately 0.05° (~5km) spatially and 15 minutes temporally. Global MetNet significantly outperforms industry-standard hourly forecasts and achieves significantly higher skill, making forecasts useful over a much larger area of the world than previously available. Our model demonstrates better skill in data-sparse regions than even the best high-resolution NWP models achieve in the US. Validated using ground radar and satellite data, it shows significant improvements across key metrics like the critical success index and fractions skill score for all precipitation rates and lead times. Crucially, our model generates forecasts in under a minute, making it readily deployable for real-time applications. It is already deployed for millions of users on Google Search. This work represents a key step in reducing global disparities in forecast quality and integrating sparse, high-resolution satellite observations into weather forecasting.

LGNov 14, 2020
Using Convolutional Variational Autoencoders to Predict Post-Trauma Health Outcomes from Actigraphy Data

Ayse S. Cakmak, Nina Thigpen, Garrett Honke et al.

Depression and post-traumatic stress disorder (PTSD) are psychiatric conditions commonly associated with experiencing a traumatic event. Estimating mental health status through non-invasive techniques such as activity-based algorithms can help to identify successful early interventions. In this work, we used locomotor activity captured from 1113 individuals who wore a research grade smartwatch post-trauma. A convolutional variational autoencoder (VAE) architecture was used for unsupervised feature extraction from four weeks of actigraphy data. By using VAE latent variables and the participant's pre-trauma physical health status as features, a logistic regression classifier achieved an area under the receiver operating characteristic curve (AUC) of 0.64 to estimate mental health outcomes. The results indicate that the VAE model is a promising approach for actigraphy data analysis for mental health outcomes in long-term studies.

LGOct 28, 2020
Representation learning for improved interpretability and classification accuracy of clinical factors from EEG

Garrett Honke, Irina Higgins, Nina Thigpen et al.

Despite extensive standardization, diagnostic interviews for mental health disorders encompass substantial subjective judgment. Previous studies have demonstrated that EEG-based neural measures can function as reliable objective correlates of depression, or even predictors of depression and its course. However, their clinical utility has not been fully realized because of 1) the lack of automated ways to deal with the inherent noise associated with EEG data at scale, and 2) the lack of knowledge of which aspects of the EEG signal may be markers of a clinical disorder. Here we adapt an unsupervised pipeline from the recent deep representation learning literature to address these problems by 1) learning a disentangled representation using $β$-VAE to denoise the signal, and 2) extracting interpretable features associated with a sparse set of clinical labels using a Symbol-Concept Association Network (SCAN). We demonstrate that our method is able to outperform the canonical hand-engineered baseline classification method on a number of factors, including participant age and depression diagnosis. Furthermore, our method recovers a representation that can be used to automatically extract denoised Event Related Potentials (ERPs) from novel, single EEG trajectories, and supports fast supervised re-mapping to various clinical labels, allowing clinicians to re-use a single EEG representation regardless of updates to the standardized diagnostic system. Finally, single factors of the learned disentangled representations often correspond to meaningful markers of clinical factors, as automatically detected by SCAN, allowing for human interpretability and post-hoc expert analysis of the recommendations made by the model.

AIFeb 5, 2019
A Generalized Framework for Population Based Training

Ang Li, Aleksandra Spyra, Sagi Perel et al.

Population Based Training (PBT) is a recent approach that jointly optimizes neural network weights and hyperparameters which periodically copies weights of the best performers and mutates hyperparameters during training. Previous PBT implementations have been synchronized glass-box systems. We propose a general, black-box PBT framework that distributes many asynchronous "trials" (a small number of training steps with warm-starting) across a cluster, coordinated by the PBT controller. The black-box design does not make assumptions on model architectures, loss functions or training procedures. Our system supports dynamic hyperparameter schedules to optimize both differentiable and non-differentiable metrics. We apply our system to train a state-of-the-art WaveNet generative model for human voice synthesis. We show that our PBT system achieves better accuracy, less sensitivity and faster convergence compared to existing methods, given the same computational resource.

LGJan 1, 2014
Robust Hierarchical Clustering

Maria-Florina Balcan, Yingyu Liang, Pramod Gupta

One of the most widely used techniques for data clustering is agglomerative clustering. Such algorithms have been long used across many different fields ranging from computational biology to social sciences to computer vision in part because their output is easy to interpret. Unfortunately, it is well known, however, that many of the classic agglomerative clustering algorithms are not robust to noise. In this paper we propose and analyze a new robust algorithm for bottom-up agglomerative clustering. We show that our algorithm can be used to cluster accurately in cases where the data satisfies a number of natural properties and where the traditional agglomerative algorithms fail. We also show how to adapt our algorithm to the inductive setting where our given data is only a small random sample of the entire data set. Experimental evaluations on synthetic and real world data sets show that our algorithm achieves better performance than other hierarchical algorithms in the presence of noise.