K. Sudhir

LG
4papers
43citations
Novelty55%
AI Score28

4 Papers

CLAug 13, 2024
Beyond Mimicry to Contextual Guidance: Knowledge Distillation for Interactive AI

Tong Wang, K. Sudhir

As large language models increasingly mediate firm - customer interactions, firms face a tradeoff: the most capable models perform well but are costly and difficult to control at scale. Existing knowledge distillation methods address this challenge by training weaker, deployable models to imitate frontier outputs; however, such open-loop approaches are poorly suited to interactive, multi-turn settings where responses must be sequenced coherently across conversational states. We propose a shift in what knowledge is distilled - from output imitation to contextual guidance. We develop a framework in which a superior teacher model constructs a reusable library of strategic textual guidance for particular scenarios likely to be encountered by the student. When deployed, the student retrieves the context-specific guidance at inference time, enabling adaptive behavior without retraining. Using customer-service interactions, we show that this approach improves service quality and customer satisfaction relative to standard fine-tuning while maintaining alignment with firm policies. The results position inference-time textual guidance as a scalable and controllable approach to distillation for interactive AI agents in marketing settings.

SDAug 13, 2024
A Theory-Based Explainable Deep Learning Architecture for Music Emotion

Hortense Fong, Vineet Kumar, K. Sudhir

This paper paper develops a theory-based, explainable deep learning convolutional neural network (CNN) classifier to predict the time-varying emotional response to music. We design novel CNN filters that leverage the frequency harmonics structure from acoustic physics known to impact the perception of musical features. Our theory-based model is more parsimonious, but provides comparable predictive performance to atheoretical deep learning models, while performing better than models using handcrafted features. Our model can be complemented with handcrafted features, but the performance improvement is marginal. Importantly, the harmonics-based structure placed on the CNN filters provides better explainability for how the model predicts emotional response (valence and arousal), because emotion is closely related to consonance--a perceptual feature defined by the alignment of harmonics. Finally, we illustrate the utility of our model with an application involving digital advertising. Motivated by YouTube mid-roll ads, we conduct a lab experiment in which we exogenously insert ads at different times within videos. We find that ads placed in emotionally similar contexts increase ad engagement (lower skip rates, higher brand recall rates). Ad insertion based on emotional similarity metrics predicted by our theory-based, explainable model produces comparable or better engagement relative to atheoretical models.

LGOct 28, 2021
Coresets for Time Series Clustering

Lingxiao Huang, K. Sudhir, Nisheeth K. Vishnoi

We study the problem of constructing coresets for clustering problems with time series data. This problem has gained importance across many fields including biology, medicine, and economics due to the proliferation of sensors facilitating real-time measurement and rapid drop in storage costs. In particular, we consider the setting where the time series data on $N$ entities is generated from a Gaussian mixture model with autocorrelations over $k$ clusters in $\mathbb{R}^d$. Our main contribution is an algorithm to construct coresets for the maximum likelihood objective for this mixture model. Our algorithm is efficient, and under a mild boundedness assumption on the covariance matrices of the underlying Gaussians, the size of the coreset is independent of the number of entities $N$ and the number of observations for each entity, and depends only polynomially on $k$, $d$ and $1/\varepsilon$, where $\varepsilon$ is the error parameter. We empirically assess the performance of our coreset with synthetic data.

LGNov 2, 2020
Coresets for Regressions with Panel Data

Lingxiao Huang, K. Sudhir, Nisheeth K. Vishnoi

This paper introduces the problem of coresets for regression problems to panel data settings. We first define coresets for several variants of regression problems with panel data and then present efficient algorithms to construct coresets of size that depend polynomially on 1/$\varepsilon$ (where $\varepsilon$ is the error parameter) and the number of regression parameters - independent of the number of individuals in the panel data or the time units each individual is observed for. Our approach is based on the Feldman-Langberg framework in which a key step is to upper bound the "total sensitivity" that is roughly the sum of maximum influences of all individual-time pairs taken over all possible choices of regression parameters. Empirically, we assess our approach with synthetic and real-world datasets; the coreset sizes constructed using our approach are much smaller than the full dataset and coresets indeed accelerate the running time of computing the regression objective.