Madhav Kanda

LG
h-index: 28
Papers: 4
Citations: 11
Novelty: 35%
AI Score: 39

4 Papers

CV · Oct 10, 2025
Constructive Distortion: Improving MLLMs with Attention-Guided Image Warping

Dwip Dalal, Gautam Vashishtha, Utkarsh Mishra et al.

Multimodal large language models (MLLMs) often miss small details and spatial relations in cluttered scenes, leading to errors in fine-grained perceptual grounding. We introduce AttWarp, a lightweight method that allocates more resolution to query-relevant content while compressing less informative areas, all while preserving global context. At test time, the approach uses an MLLM's cross-modal attention to perform rectilinear warping of the input image, reallocating spatial resolution toward regions the model deems important, without changing model weights or architecture. This attention-guided warping preserves all original image information but redistributes it non-uniformly, so small objects and subtle relationships become easier for the same model to read while the global layout remains intact. Across five benchmarks (TextVQA, GQA, DocVQA, POPE, MMMU) and four MLLMs (LLaVA, Qwen-VL, InternVL, and InstructBLIP), AttWarp consistently improves accuracy, strengthens compositional reasoning, and reduces hallucinations, outperforming four competitive baselines that manipulate raw images at test time. Together, these results show that attention-guided warping prioritizes information relevant to the query while preserving context, and that the same MLLMs perform better when given such warped inputs.
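The rectilinear warping idea can be illustrated with a small sketch. This is not the authors' implementation: it assumes an attention map already resized to the image resolution, and the function names `axis_map` and `rectilinear_warp` and the `floor` parameter are hypothetical. Each axis is warped independently by inverting the cumulative distribution of the attention marginal, so rows and columns with more attention receive more output pixels. Nearest-neighbour sampling is used for brevity and can skip low-attention rows or columns; a smooth interpolated warp would be needed to retain all image content as the abstract describes.

```python
import numpy as np

def axis_map(marginal, out_size, floor=0.5):
    """For each output index along one axis, return the source index to sample."""
    marginal = np.asarray(marginal, dtype=float)
    n = len(marginal)
    total = marginal.sum()
    # Fall back to a uniform density if the attention marginal is all zeros.
    attn = marginal / total if total > 0 else np.full(n, 1.0 / n)
    # Blend with a uniform density (assumed knob `floor`) so that
    # low-attention regions are compressed rather than removed entirely.
    density = floor / n + (1.0 - floor) * attn
    cdf = np.cumsum(density)
    cdf = cdf / cdf[-1]
    # Invert the CDF: evenly spaced output positions fall more densely
    # on source indices where the density is high.
    targets = (np.arange(out_size) + 0.5) / out_size
    src = np.searchsorted(cdf, targets)
    return np.clip(src, 0, n - 1)

def rectilinear_warp(image, attention, floor=0.5):
    """Warp rows and columns independently, keeping the output size fixed."""
    h, w = image.shape[:2]
    rows = axis_map(attention.sum(axis=1), h, floor)
    cols = axis_map(attention.sum(axis=0), w, floor)
    return image[rows][:, cols]

if __name__ == "__main__":
    img = np.arange(64).reshape(8, 8)
    att = np.zeros((8, 8))
    att[2:4, 2:4] = 1.0  # pretend the model attends to a 2x2 patch
    print(rectilinear_warp(img, att))
```

Because the row and column maps are monotone, the relative ordering of image content is preserved, which is the sense in which a rectilinear warp keeps the global layout intact.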

LG · Oct 30, 2024
SpiroActive: Active Learning for Efficient Data Acquisition for Spirometry

Ankita Kumari Jain, Nitish Sharma, Madhav Kanda et al.

Respiratory illnesses are a significant global health burden. Chief among them, chronic obstructive pulmonary disease (COPD) is the seventh leading cause of poor health worldwide and the third leading cause of death, causing 3.23 million deaths in 2019, which makes early identification and diagnosis essential for effective mitigation. Among the diagnostic tools employed, spirometry plays a crucial role in detecting respiratory abnormalities. However, conventional clinical spirometry often entails considerable costs and practical limitations, such as the need for specialized equipment, trained personnel, and a dedicated clinical setting, making it less accessible. To address these challenges, wearable spirometry technologies have emerged as promising alternatives, offering accurate, cost-effective, and convenient solutions. Developing machine learning models for wearable spirometry relies heavily on high-quality ground-truth spirometry data, which is laborious and expensive to collect. In this research, we propose using active learning, a sub-field of machine learning, to mitigate the challenges associated with data collection and labeling. By strategically selecting which samples to label with the ground-truth spirometer, we can reduce the need for resource-intensive data collection. We present evidence that models trained on small subsets obtained through active learning achieve comparable or better results than models trained on the complete dataset.
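A minimal sketch of pool-based active learning follows, under stated assumptions. The abstract does not name the acquisition strategy, so this example uses query-by-committee (bootstrap ensemble disagreement) with a linear regressor as a stand-in; `active_learning`, `oracle`, and all parameters are hypothetical names, and `oracle(i)` plays the role of taking a ground-truth spirometer measurement for pool sample `i`.

```python
import numpy as np

def fit_linear(X, y):
    """Least-squares linear fit with a bias term."""
    Xb = np.c_[X, np.ones(len(X))]
    w, *_ = np.linalg.lstsq(Xb, y, rcond=None)
    return w

def predict(w, X):
    return np.c_[X, np.ones(len(X))] @ w

def active_learning(X_pool, oracle, n_init=5, n_rounds=10, n_models=8, seed=0):
    """Label a small subset of the pool, choosing each new sample by committee disagreement."""
    rng = np.random.default_rng(seed)
    labeled = [int(i) for i in rng.choice(len(X_pool), size=n_init, replace=False)]
    y = {i: oracle(i) for i in labeled}
    for _ in range(n_rounds):
        Xl = X_pool[labeled]
        yl = np.array([y[i] for i in labeled])
        # Train a committee on bootstrap resamples of the labeled set.
        preds = []
        for _ in range(n_models):
            idx = rng.integers(0, len(labeled), size=len(labeled))
            preds.append(predict(fit_linear(Xl[idx], yl[idx]), X_pool))
        disagreement = np.var(preds, axis=0)
        disagreement[labeled] = -np.inf  # never re-query a labeled sample
        pick = int(np.argmax(disagreement))
        labeled.append(pick)
        y[pick] = oracle(pick)  # the expensive ground-truth measurement
    final = fit_linear(X_pool[labeled], np.array([y[i] for i in labeled]))
    return labeled, final

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = rng.normal(size=(50, 2))
    y_true = X @ np.array([2.0, -1.0]) + 0.5
    chosen, w = active_learning(X, lambda i: y_true[i])
    print(len(chosen), w)
```

Only `n_init + n_rounds` samples are ever passed to `oracle`, which is the point of the approach: the labeling budget is spent where the current models are least certain.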

AI · Feb 20
WorkflowPerturb: Calibrated Stress Tests for Evaluating Multi-Agent Workflow Metrics

Madhav Kanda, Pedro Las-Casas, Alok Gautam Kumbhare et al.

LLM-based systems increasingly generate structured workflows for complex tasks. In practice, automatic evaluation of these workflows is difficult, because metric scores are often not calibrated, and score changes do not directly communicate the severity of workflow degradation. We introduce WorkflowPerturb, a controlled benchmark for studying workflow evaluation metrics. It works by applying realistic, controlled perturbations to golden workflows. WorkflowPerturb contains 4,973 golden workflows and 44,757 perturbed variants across three perturbation types (Missing Steps, Compressed Steps, and Description Changes), each applied at severity levels of 10%, 30%, and 50%. We benchmark multiple metric families and analyze their sensitivity and calibration using expected score trajectories and residuals. Our results characterize systematic differences across metric families and support severity-aware interpretation of workflow evaluation scores. Our dataset will be released upon acceptance.
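As an illustration of what a severity-controlled perturbation might look like, here is a hedged sketch of the Missing Steps type. The function `drop_steps`, its rounding rule, and the list-of-strings workflow representation are assumptions, not the benchmark's actual procedure. The final check notes that 4,973 golden workflows times 3 perturbation types times 3 severity levels equals the reported 44,757 variants, which is consistent with one variant per (type, severity) pair.

```python
import random

SEVERITIES = (0.10, 0.30, 0.50)
PERTURBATION_TYPES = ("Missing Steps", "Compressed Steps", "Description Changes")

def drop_steps(steps, severity, seed=0):
    """Remove roughly severity * len(steps) steps, preserving the order of the rest."""
    rng = random.Random(seed)
    # Assumed rule: drop at least one step, and never more than exist.
    n_drop = min(len(steps), max(1, round(severity * len(steps))))
    dropped = set(rng.sample(range(len(steps)), n_drop))
    return [s for i, s in enumerate(steps) if i not in dropped]

if __name__ == "__main__":
    golden = [f"step {i}" for i in range(10)]
    for sev in SEVERITIES:
        print(sev, drop_steps(golden, sev))
    # Variant count implied by the figures in the abstract.
    print(4973 * len(PERTURBATION_TYPES) * len(SEVERITIES))
```

A calibrated metric would be expected to fall progressively as `severity` rises from 10% to 50%; comparing observed scores against such an expected trajectory is the kind of analysis the abstract describes.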

LG · Sep 1, 2025
REFINESTAT: Efficient Exploration for Probabilistic Program Synthesis

Madhav Kanda, Shubham Ugare, Sasa Misailovic

Probabilistic programming offers a powerful framework for modeling uncertainty, yet statistical model discovery in this domain entails navigating an immense search space under strict domain-specific constraints. When small language models are tasked with generating probabilistic programs, they frequently produce outputs that suffer from both syntactic and semantic errors, such as flawed inference constructs. Motivated by probabilistic programmers' domain expertise and debugging strategies, we introduce RefineStat, a language-model-driven framework that enforces semantic constraints ensuring synthesized programs contain valid distributions and well-formed parameters, and then applies diagnostic-aware refinement by resampling prior or likelihood components whenever reliability checks fail. We evaluate RefineStat on multiple probabilistic-programming code-generation tasks using small language models (SLMs) and find that it produces programs that are both syntactically sound and statistically reliable, often matching or surpassing those from closed-source large language models (e.g., OpenAI o3).
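The refinement loop described above might be organized roughly as follows. This is a schematic sketch only, not RefineStat's actual algorithm or API: `refine` and its four callables are hypothetical placeholders for a language-model sampler (`sample_prior`, `sample_likelihood`), a semantic-constraint check (`is_valid`), and a reliability diagnostic (`diagnose`) that reports which component, if any, to resample.

```python
def refine(sample_prior, sample_likelihood, is_valid, diagnose, max_rounds=5):
    """Resample one program component at a time until the reliability checks pass."""
    prior, likelihood = sample_prior(), sample_likelihood()
    for _ in range(max_rounds):
        if not is_valid(prior, likelihood):
            # Semantic constraint violated: draw a fresh candidate program.
            prior, likelihood = sample_prior(), sample_likelihood()
            continue
        failing = diagnose(prior, likelihood)  # None, "prior", or "likelihood"
        if failing is None:
            return prior, likelihood
        # Diagnostic-aware step: keep the healthy component, redraw the other.
        if failing == "prior":
            prior = sample_prior()
        else:
            likelihood = sample_likelihood()
    return None  # budget exhausted without a reliable program

if __name__ == "__main__":
    priors = iter(["bad_prior", "normal(0, 1)"])
    result = refine(
        sample_prior=lambda: next(priors),
        sample_likelihood=lambda: "normal(mu, sigma)",
        is_valid=lambda p, l: True,
        diagnose=lambda p, l: "prior" if p == "bad_prior" else None,
    )
    print(result)
```

Resampling only the failing component, rather than the whole program, is what keeps the search cheap: work already done on the component that passed its checks is not thrown away.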