Haricharan Balasundaram

IT
4 papers
2 citations
Novelty 55%
AI Score 47

4 Papers

72.3 IT May 8
Learning to Transmit Over Unknown Erasure Channels with Empirical Erasure Rate Feedback

Haricharan Balasundaram, Krishna Jagannathan

We address the problem of reliable data transmission within a finite time horizon $T$ over a binary erasure channel with unknown erasure probability. We consider a feedback model in which the transmitter can query the receiver infrequently and obtain the empirical erasure rate experienced by the latter. We aim to minimize a regret quantity, i.e., how much worse a strategy performs compared to an oracle that knows the probability of erasure, while operating at the same block error rate. This scenario exhibits a learning-versus-exploitation dilemma: we need to balance between (i) learning the erasure probability with reasonable accuracy and (ii) utilizing the channel to transmit as many information bits as possible. We propose two strategies: (i) a two-phase approach using rate estimation followed by transmission that achieves $O(T^{2/3})$ regret using only one query, and (ii) a windowing strategy using geometrically increasing window sizes that achieves $O(\sqrt{T})$ regret using $O(\log T)$ queries.
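As a rough illustration of the two-phase idea, the following is a minimal sketch and not the authors' algorithm. It assumes an exploration length of about $T^{2/3}$ channel uses, a back-off margin of `c / sqrt(m)` on the estimated rate, and an idealized model in which transmitting above capacity delivers nothing; the function name `two_phase_regret` and the parameter `c` are hypothetical, and block error rate is not modeled.

```python
import math
import random


def two_phase_regret(T, p, seed=0, c=1.0):
    """Simulate a hypothetical two-phase scheme on a BEC(p) with horizon T.

    Returns (regret_in_bits, exploration_length, estimated_erasure_rate).
    """
    rng = random.Random(seed)

    # Phase 1 (assumed length ~ T^(2/3)): send pilot symbols, then make one query.
    m = max(1, round(T ** (2 / 3)))
    erasures = sum(rng.random() < p for _ in range(m))
    p_hat = erasures / m  # empirical erasure rate returned by the single query

    # Phase 2: transmit at the estimated capacity minus an assumed safety margin.
    margin = c / math.sqrt(m)
    rate = max(0.0, 1.0 - p_hat - margin)

    # Idealization (assumption): a rate above the true capacity 1 - p fails entirely.
    bits_sent = rate * (T - m) if rate <= 1.0 - p else 0.0

    # The oracle knows p and uses all T channel uses at rate 1 - p.
    oracle_bits = (1.0 - p) * T
    return oracle_bits - bits_sent, m, p_hat


if __name__ == "__main__":
    regret, m, p_hat = two_phase_regret(10**6, 0.3, seed=1)
    print(f"exploration={m}, p_hat={p_hat:.4f}, regret={regret:.1f}")
```

Under these assumptions the exploration phase costs on the order of $m$ bits and the back-off costs on the order of $T/\sqrt{m}$ bits, which is one heuristic way to see why $m \approx T^{2/3}$ balances the two terms.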

37.0 LG May 8
Convex Optimization with Nested Evolving Feasible Sets

Karthick Krishna M., Haricharan Balasundaram, Rahul Vaze

Convex Optimization with Nested Evolving Feasible Sets (CONES) is considered, where the objective function $f$ remains fixed but the feasible region evolves over time as a nested sequence $S_1 \supseteq S_2 \supseteq \cdots \supseteq S_T$. The goal of an online algorithm is to simultaneously minimize the regret with respect to the hindsight static optimal benchmark and the total movement cost, while ensuring feasibility at all times. CONES is an optimization-oriented generalization of the well-known nested convex body chasing problem. When the loss function is convex, we propose a lazy algorithm and show that it simultaneously achieves regret $O(T^{1-\beta})$ and movement cost $O(T^{\beta})$ for any $\beta \in (0,1]$, over a time horizon of $T$. When the loss function is strongly convex or $\alpha$-sharp, we propose an algorithm, Frugal, that simultaneously achieves zero regret and a movement cost of $O(\log T)$. To complement this, we show that any online algorithm with $o(T)$ regret has a movement cost of $\Omega(\log T)$ in both cases, proving the optimality of Frugal.

11.5 LG Mar 21
Breaking the $O(\sqrt{T})$ Cumulative Constraint Violation Barrier while Achieving $O(\sqrt{T})$ Static Regret in Constrained Online Convex Optimization

Haricharan Balasundaram, Karthick Krishna Mahendran, Rahul Vaze

The problem of constrained online convex optimization is considered, where at each round, once a learner commits to an action $x_t \in \mathcal{X} \subset \mathbb{R}^d$, a convex loss function $f_t$ and a convex constraint function $g_t$ that defines the constraint $g_t(x)\le 0$ are revealed. The objective is to simultaneously minimize the static regret and cumulative constraint violation (CCV) compared to the benchmark that knows the loss functions $f_t$ and constraint functions $g_t$ for all $t$ ahead of time, and chooses a static optimal action that is feasible with respect to all $g_t(x)\le 0$. In recent prior work (Sinha and Vaze [2024]), algorithms with simultaneous regret of $O(\sqrt{T})$ and CCV of $O(\sqrt{T})$, or CCV of $O(1)$ in specific cases (Vaze and Sinha [2025], e.g., when $d=1$), have been proposed. It is widely believed that CCV is $\Omega(\sqrt{T})$ for all algorithms that ensure that regret is $O(\sqrt{T})$ with the worst-case input for any $d\ge 2$. In this paper, we refute this and show that the algorithm of Vaze and Sinha [2025] simultaneously achieves regret of $O(\sqrt{T})$ and CCV of $O(T^{1/3})$ when $d=2$.

27.1 IT May 8
Semantic Smoothing for Language Models via Distribution Estimation and Embeddings

Haricharan Balasundaram, Swathi Shree Narashiman, Pranay Mathur et al.

We propose semantic smoothing, a smoothing method for language models that uses embeddings to share statistical observations across semantically similar contexts. The starting point is a decomposition of log-perplexity that motivates smoothing as a collection of distribution-estimation problems under Kullback-Leibler (KL) loss. We then show that, under a Lipschitz-logit model for embedding-based language generation, proximity of context embeddings implies proximity of the corresponding next-word distributions in KL divergence. Combining these observations, we formulate semantic smoothing as distribution estimation in KL loss with KL-proximity side information. For $n$ samples on a $d$-symbol alphabet with a side-information distribution at KL distance $\Delta$, we give an interpolation estimator with worst-case KL risk $O(\min\{\Delta, d/n\})$, and prove a matching-order lower bound for uniform side information. We extend the estimator to multiple and empirically estimated synonymous distributions. Experiments on synthetic Markov data and WikiText-103 bigram models using Word2Vec, GloVe, and GPT-2 embeddings show that semantic smoothing consistently reduces test perplexity when applied to add-constant and Kneser-Ney estimates.
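To make the interpolation idea concrete, here is a minimal sketch that mixes an add-one estimate with a side-information distribution. It is an illustration under assumptions, not the estimator from the paper: the function names and the weight rule `delta / (delta + d / n)` are hypothetical, chosen only so that the estimate leans on the side information when $\Delta$ is small relative to $d/n$.

```python
import math


def kl(p, q):
    """KL divergence D(p || q) in nats; assumes q[i] > 0 wherever p[i] > 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def hypothetical_weight(delta, d, n):
    """Assumed mixing weight on the empirical estimate (not from the paper)."""
    return delta / (delta + d / n)


def interpolate(counts, side, lam):
    """Mix an add-one estimate of `counts` with the side distribution `side`.

    lam = 1 returns the add-one estimate; lam = 0 returns `side` unchanged.
    """
    n, d = sum(counts), len(counts)
    add_one = [(c + 1) / (n + d) for c in counts]
    return [lam * e + (1.0 - lam) * s for e, s in zip(add_one, side)]


if __name__ == "__main__":
    counts = [3, 1, 0]        # n = 4 observations on a d = 3 symbol alphabet
    side = [0.5, 0.3, 0.2]    # distribution of a semantically similar context
    lam = hypothetical_weight(delta=0.05, d=len(counts), n=sum(counts))
    est = interpolate(counts, side, lam)
    print(est, kl(side, est))
```

With few samples the weight is close to zero and the estimate stays near the side distribution; as $n$ grows with $\Delta$ fixed, the weight moves toward one and the add-one estimate dominates.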