Kunal Dutta

DS
h-index2
4papers
6citations
Novelty63%
AI Score43

4 Papers

44.5DSApr 14
Near-Optimal Constructive Bounds for $\ell_2$ Prefix Discrepancy and Steinitz Problems via Affine Spectral Independence

Kunal Dutta, Agastya Vibhuti Jha, Haotian Jiang

A classical result of Steinitz from 1913 \cite{Ste13}, answering an earlier question of Riemann and Lévy (e.g., \cite{Lev05}), states that for any norm $\|\cdot\|$ in $\mathbb{R}^d$ and any set of vectors $v_1, \cdots, v_n \in \R^d$ satisfying $\sum_{i=1}^n v_i = 0$, there exists an ordering $π: [n] \rightarrow [n]$ such that every partial sum along this order is bounded by $O(d)$, i.e., $\big\| \sum_{i=1}^t v_{π(i)} \big\| \leq O(d)$ for all $t \in [n]$. Steinitz's bound is tight up to constants in general, but for the $\ell_2$ norm $\|\cdot\|_2$, it has been conjectured that the best bound is $O(\sqrt{d})$. Almost a century later, a breakthrough work of Banaszczyk \cite{Ban12} gave a bound of $O(\sqrt{d} + \sqrt{\log n})$ for the $\ell_2$ Steinitz problem, matching the conjecture under the mild assumption that $d \geq Ω(\log n)$. Banaszczyk's result is non-constructive, and the previous best algorithmic bound was $O(\sqrt{d \log n})$, due to Bansal and Garg \cite{BG17}. In this work, we give an efficient algorithm that matches the conjectured $O(\sqrt{d})$ bound for the $\ell_2$ Steinitz problem under the slightly worse, yet still polylogarithmic, condition of $d \geq Ω(\log^7 n)$. As in prior work, our result extends to the harder problem of $\ell_2$ prefix discrepancy. We employ the framework of obtaining the desired ordering via a discrete Brownian motion, guided by a semidefinite program (SDP). To obtain our results, we use the new technique of ``Decoupling via Affine Spectral Independence'', proposed by Bansal and Jiang \cite{BJ26} to achieve substantial progress on the Beck-Fiala and Komlós conjectures, together with a ``Global Interval Tree'' data structure that simultaneously controls the deviations for all prefixes.

LGMar 9, 2024
DiffRed: Dimensionality Reduction guided by stable rank

Prarabdh Shukla, Gagan Raj Gupta, Kunal Dutta

In this work, we propose a novel dimensionality reduction technique, DiffRed, which first projects the data matrix, A, along first $k_1$ principal components and the residual matrix $A^{*}$ (left after subtracting its $k_1$-rank approximation) along $k_2$ Gaussian random vectors. We evaluate M1, the distortion of mean-squared pair-wise distance, and Stress, the normalized value of RMS of distortion of the pairwise distances. We rigorously prove that DiffRed achieves a general upper bound of $O\left(\sqrt{\frac{1-p}{k_2}}\right)$ on Stress and $O\left(\frac{(1-p)}{\sqrt{k_2*ρ(A^{*})}}\right)$ on M1 where $p$ is the fraction of variance explained by the first $k_1$ principal components and $ρ(A^{*})$ is the stable rank of $A^{*}$. These bounds are tighter than the currently known results for Random maps. Our extensive experiments on a variety of real-world datasets demonstrate that DiffRed achieves near zero M1 and much lower values of Stress as compared to the well-known dimensionality reduction techniques. In particular, DiffRed can map a 6 million dimensional dataset to 10 dimensions with 54% lower Stress than PCA.

DSNov 19, 2021
Uniform Brackets, Containers, and Combinatorial Macbeath Regions

Kunal Dutta, Arijit Ghosh, Shay Moran

We study the connections between three seemingly different combinatorial structures - "uniform" brackets in statistics and probability theory, "containers" in online and distributed learning theory, and "combinatorial Macbeath regions", or Mnets in discrete and computational geometry. We show that these three concepts are manifestations of a single combinatorial property that can be expressed under a unified framework along the lines of Vapnik-Chervonenkis type theory for uniform convergence. These new connections help us to bring tools from discrete and computational geometry to prove improved bounds for these objects. Our improved bounds help to get an optimal algorithm for distributed learning of halfspaces, an improved algorithm for the distributed convex set disjointness problem, and improved regret bounds for online algorithms against a smoothed adversary for a large class of semi-algebraic threshold functions.

DMDec 12, 2014
Size sensitive packing number for Hamming cube and its consequences

Kunal Dutta, Arijit Ghosh

We prove a size-sensitive version of Haussler's Packing lemma~\cite{Haussler92spherepacking} for set-systems with bounded primal shatter dimension, which have an additional {\em size-sensitive property}. This answers a question asked by Ezra~\cite{Ezra-sizesendisc-soda-14}. We also partially address another point raised by Ezra regarding overcounting of sets in her chaining procedure. As a consequence of these improvements, we get an improvement on the size-sensitive discrepancy bounds for set systems with the above property. Improved bounds on the discrepancy for these special set systems also imply an improvement in the sizes of {\em relative $(\varepsilon, δ)$-approximations} and $(ν, α)$-samples.