Sauptik Dhar

LG
h-index: 9
13 papers
238 citations
Novelty: 37%
AI Score: 37

13 Papers

CV · Oct 23, 2025
Breakdance Video classification in the age of Generative AI

Sauptik Dhar, Naveen Ramakrishnan, Michelle Munson

Large Vision Language models have recently seen wide application in several sports use-cases. Most of this work has targeted a limited subset of popular sports such as soccer, cricket, and basketball, focusing on generative tasks like visual question answering and highlight generation. This work analyzes the applicability of modern video foundation models (both encoder and decoder) to a niche but hugely popular dance sport: breakdance. Our results show that Video Encoder models continue to outperform state-of-the-art Video Language Models for prediction tasks. We provide insights on how to choose the encoder model and a thorough analysis of the workings of a finetuned decoder model for breakdance video classification.

CV · Aug 25, 2025
Large VLM-based Stylized Sports Captioning

Sauptik Dhar, Nicholas Buoncristiani, Joe Anakata et al.

The advent of large (visual) language models (LLM / LVLM) has led to a deluge of automated human-like systems in several domains, including social media content generation, search and recommendation, healthcare prognosis, and AI assistants for cognitive tasks. Although these systems have been successfully integrated into production, very little focus has been placed on sports, particularly the accurate identification and natural language description of game play. Most existing LLM/LVLMs can explain generic sports activities, but lack sufficient domain-centric sports jargon to create natural (human-like) descriptions. This work highlights the limitations of existing SoTA LLM/LVLMs for generating production-grade sports captions from images in a desired stylized format, and proposes a two-level fine-tuned LVLM pipeline to address them. The proposed pipeline yields an improvement of more than 8-10% in F1 and more than 2-10% in BERT score compared to alternative approaches. In addition, it has a small runtime memory footprint and fast execution time. During Super Bowl LIX the pipeline proved its practical application for live professional sports journalism, generating highly accurate and stylized captions at a rate of 6 images per 3-5 seconds for over 1000 images during game play.

LG · Oct 11, 2021
A Survey on Proactive Customer Care: Enabling Science and Steps to Realize it

Viswanath Ganapathy, Sauptik Dhar, Olimpiya Saha et al.

In recent times, advances in artificial intelligence (AI) and IoT have enabled seamless and viable maintenance of appliances in home and building environments. Several studies have shown that AI has the potential to provide personalized customer support that could predict and avoid errors more reliably than ever before. In this paper, we analyze the various building blocks needed to enable a successful AI-driven predictive maintenance use-case. Unlike existing surveys, which mostly provide a deep dive into recent AI algorithms for Predictive Maintenance (PdM), our survey provides the complete view, from business impact to recent technology advancements in algorithms as well as systems research and model deployment. Furthermore, we provide exemplar use-cases on predictive maintenance of appliances using publicly available data sets. Our survey can serve as a template for designing a successful predictive maintenance use-case. Finally, we touch upon existing public data sources and provide a step-wise breakdown of an AI-driven proactive customer care (PCC) use-case, starting from generic anomaly detection to fault prediction and finally root-cause analysis. We highlight how such a step-wise approach can be advantageous for accurate model building and helpful for gaining insights into predictive maintenance of electromechanical appliances.

LG · Jun 18, 2021
Universum GANs: Improving GANs through contradictions

Sauptik Dhar, Javad Heydari, Samarth Tripathi et al.

Limited availability of labeled data makes any supervised learning problem challenging. Alternative learning settings like semi-supervised and universum learning alleviate the dependency on labeled data, but still require a large amount of unlabeled data, which may be unavailable or expensive to acquire. GAN-based data generation methods have recently shown promise by generating synthetic samples to improve learning. However, most existing GAN-based approaches either provide poor discriminator performance under limited labeled data settings or result in low-quality generated data. In this paper, we propose a Universum GAN game which provides improved discriminator accuracy under limited data settings, while generating high-quality realistic data. We further propose an evolving discriminator loss which improves its convergence and generalization performance. We derive the theoretical guarantees and provide empirical results in support of our approach.

LG · May 17, 2021
DOC3-Deep One Class Classification using Contradictions

Sauptik Dhar, Bernardo Gonzalez Torres

This paper introduces the notion of learning from contradictions (a.k.a. Universum learning) for deep one class classification problems. We formalize this notion for the widely adopted one class large-margin loss, and propose the Deep One Class Classification using Contradictions (DOC3) algorithm. We show that learning from contradictions incurs lower generalization error by comparing the Empirical Rademacher Complexity (ERC) of DOC3 against its traditional inductive learning counterpart. Our empirical results demonstrate the efficacy of DOC3 compared to popular baseline algorithms on several real-life data sets.

LG · Jul 27, 2020
Stabilizing Bi-Level Hyperparameter Optimization using Moreau-Yosida Regularization

Sauptik Dhar, Unmesh Kurup, Mohak Shah

This research proposes using the Moreau-Yosida envelope to stabilize the convergence behavior of bi-level hyperparameter optimization solvers, and introduces a new algorithm, Moreau-Yosida regularized Hyperparameter Optimization (MY-HPO). A theoretical analysis of the correctness of the MY-HPO solution and an initial convergence analysis are also provided. Our empirical results show significant improvement in loss values for a fixed computation budget, compared to state-of-the-art bi-level HPO solvers.
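For readers unfamiliar with the Moreau-Yosida envelope named in this abstract, the following is a minimal sketch of the general concept only, not the MY-HPO algorithm. It assumes the textbook definition M(x) = min_y f(y) + (y - x)^2 / (2 * lam) and evaluates it for the illustrative choice f(y) = |y|; the function names and the parameter `lam` are hypothetical and do not come from the paper.

```python
def soft_threshold(x: float, lam: float) -> float:
    """Proximal operator of lam * |.| at x (the minimizer of the envelope problem)."""
    if x > lam:
        return x - lam
    if x < -lam:
        return x + lam
    return 0.0


def moreau_envelope_abs(x: float, lam: float) -> float:
    """Moreau-Yosida envelope of f(y) = |y| with smoothing parameter lam > 0.

    Illustrative assumption: f is the absolute value, so the inner
    minimization has the closed-form solution given by soft thresholding.
    """
    y = soft_threshold(x, lam)  # argmin_y |y| + (y - x)^2 / (2 * lam)
    return abs(y) + (y - x) ** 2 / (2.0 * lam)


if __name__ == "__main__":
    # Near zero the envelope is quadratic, far from zero it is linear,
    # which is why it is a smooth surrogate for the non-smooth |x|.
    for x in (-3.0, -0.5, 0.0, 0.5, 3.0):
        print(x, moreau_envelope_abs(x, lam=1.0))
```

The envelope shares its minimizer with the original function while being differentiable, which is the general property that makes it attractive as a regularizer.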

LG · Nov 2, 2019
On-Device Machine Learning: An Algorithms and Learning Theory Perspective

Sauptik Dhar, Junyao Guo, Jiayi Liu et al.

The predominant paradigm for using machine learning models on a device is to train a model in the cloud and perform inference using the trained model on the device. However, with the increasing number of smart devices and improved hardware, there is interest in performing model training on the device. Given this surge in interest, a comprehensive survey of the field from a device-agnostic perspective sets the stage for both understanding the state-of-the-art and for identifying open challenges and future avenues of research. However, on-device learning is an expansive field with connections to a large number of related topics in AI and machine learning (including online learning, model adaptation, one/few-shot learning, etc.). Hence, covering such a large number of topics in a single survey is impractical. This survey finds a middle ground by reformulating the problem of on-device learning as resource-constrained learning where the resources are compute and memory. This reformulation allows tools, techniques, and algorithms from a wide variety of research areas to be compared equitably. In addition to summarizing the state-of-the-art, the survey also identifies a number of challenges and next steps for both the algorithmic and theoretical aspects of on-device learning.

OC · Oct 15, 2019
Variable Metric Proximal Gradient Method with Diagonal Barzilai-Borwein Stepsize

Youngsuk Park, Sauptik Dhar, Stephen Boyd et al.

Variable metric proximal gradient (VM-PG) is a widely used class of convex optimization methods. Lately, there has been a lot of research on the theoretical guarantees of VM-PG with different metric selections. However, most such metric selections depend on an (expensive) Hessian, or are limited to scalar stepsizes like the Barzilai-Borwein (BB) stepsize with extensive safeguarding. Instead, in this paper we propose an adaptive metric selection strategy called the diagonal Barzilai-Borwein (BB) stepsize. The proposed diagonal selection better captures the local geometry of the problem while keeping the per-step computation cost similar to that of the scalar BB stepsize, i.e., $O(n)$. Under this metric selection for VM-PG, the theoretical convergence is analyzed. Our empirical studies illustrate the improved convergence results under the proposed diagonal BB stepsize, specifically for ill-conditioned machine learning problems on both synthetic and real-world datasets.
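As background for the scalar baseline mentioned in this abstract, here is a minimal sketch of plain gradient descent with the classic scalar BB stepsize alpha = (s·s) / (s·y), where s is the change in iterate and y the change in gradient. It is not the paper's diagonal rule, has no proximal step, and omits the safeguarding the abstract refers to; the function names, the test quadratic, and `alpha0` are illustrative assumptions.

```python
from typing import Callable, List

Vector = List[float]


def dot(a: Vector, b: Vector) -> float:
    return sum(ai * bi for ai, bi in zip(a, b))


def bb_stepsize(s: Vector, y: Vector) -> float:
    """Scalar Barzilai-Borwein stepsize (the 'long' BB1 variant): s.s / s.y."""
    return dot(s, s) / dot(s, y)


def bb_gradient_descent(grad: Callable[[Vector], Vector], x0: Vector,
                        alpha0: float, iters: int = 100) -> Vector:
    """Unsafeguarded gradient descent with scalar BB steps (sketch only)."""
    x_prev = list(x0)
    g_prev = grad(x_prev)
    # The first step needs a user-supplied stepsize, since no history exists yet.
    x = [xi - alpha0 * gi for xi, gi in zip(x_prev, g_prev)]
    for _ in range(iters):
        g = grad(x)
        s = [a - b for a, b in zip(x, x_prev)]
        y = [a - b for a, b in zip(g, g_prev)]
        if dot(s, y) <= 1e-300:  # converged or non-positive curvature: stop
            break
        alpha = bb_stepsize(s, y)
        x_prev, g_prev = x, g
        x = [xi - alpha * gi for xi, gi in zip(x, g)]
    return x


if __name__ == "__main__":
    # Ill-conditioned quadratic f(x) = 0.5 * (x1^2 + 100 * x2^2); hypothetical example.
    curvatures = [1.0, 100.0]
    grad = lambda x: [c * xi for c, xi in zip(curvatures, x)]
    print(bb_gradient_descent(grad, [1.0, 1.0], alpha0=0.01))
```

Each BB update costs O(n), which is the per-step budget the abstract says the diagonal variant preserves.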

LG · Sep 21, 2019
Single Class Universum-SVM

Sauptik Dhar, Vladimir Cherkassky

This paper extends the idea of Universum learning [1, 2] to single-class learning problems. We propose a Single Class Universum-SVM setting that incorporates a priori knowledge (in the form of additional data samples) into the single class estimation problem. These additional data samples, or Universum, belong to the same application domain as the (positive) data samples from the single class of interest, but they follow a different distribution. The proposed methodology for single class U-SVM is based on the known connection between binary classification and single class learning formulations [3]. Several empirical comparisons are presented to illustrate the utility of the proposed approach.

LG · May 14, 2019
Improving Model Training by Periodic Sampling over Weight Distributions

Samarth Tripathi, Jiayi Liu, Unmesh Kurup et al.

In this paper, we explore techniques centered around periodic sampling of model weights that provide convergence improvements on gradient update methods (vanilla SGD, Momentum, Adam) for a variety of vision problems (classification, detection, segmentation). Importantly, our algorithms provide better, faster and more robust convergence and training performance with only a slight increase in computation time. Our techniques are independent of the neural network model, gradient optimization methods or existing optimal training policies and converge in a less volatile fashion with performance improvements that are approximately monotonic. We conduct a variety of experiments to quantify these improvements and identify scenarios where these techniques could be more useful.

LG · Aug 23, 2018
Multiclass Universum SVM

Sauptik Dhar, Vladimir Cherkassky, Mohak Shah

We introduce Universum learning for multiclass problems and propose a novel formulation for multiclass universum SVM (MU-SVM). We also propose an analytic span bound for model selection with almost 2-4x faster computation times than standard resampling techniques. We empirically demonstrate the efficacy of the proposed MU-SVM formulation on several real-world datasets, achieving > 20% improvement in test accuracies compared to multiclass SVM.

LG · Sep 29, 2016
Universum Learning for Multiclass SVM

Sauptik Dhar, Naveen Ramakrishnan, Vladimir Cherkassky et al.

We introduce Universum learning for multiclass problems and propose a novel formulation for multiclass universum SVM (MU-SVM). We also propose a span bound for MU-SVM that can be used for model selection thereby avoiding resampling. Empirical results demonstrate the effectiveness of MU-SVM and the proposed bound.

LG · May 27, 2016
Universum Learning for SVM Regression

Sauptik Dhar, Vladimir Cherkassky

This paper extends the idea of Universum learning [18, 19] to regression problems. We propose a new Universum-SVM formulation for regression problems that incorporates a priori knowledge in the form of additional data samples. These additional data samples, or Universum, belong to the same application domain as the training samples, but they follow a different distribution. Several empirical comparisons are presented to illustrate the utility of the proposed approach.