Kshiteej Sheth

h-index: 1

Papers: 3

Citations: 5

Novelty: 42%

AI Score: 34

Ranked #109,992 of 194,257 authors (top 57%); #36,749 in CV (top 62%)

3 Papers

11.4 | LG | Jul 31, 2025

Improved Algorithms for Kernel Matrix-Vector Multiplication Under Sparsity Assumptions

Piotr Indyk, Michael Kapralov, Kshiteej Sheth et al.

Motivated by the problem of fast processing of attention matrices, we study fast algorithms for computing matrix-vector products for asymmetric Gaussian kernel matrices $K\in \mathbb{R}^{n\times n}$. $K$'s columns are indexed by a set of $n$ keys $k_1,k_2,\ldots,k_n\in \mathbb{R}^d$, rows by a set of $n$ queries $q_1,q_2,\ldots,q_n\in \mathbb{R}^d$, and its $(i,j)$ entry is $K_{ij} = e^{-\|q_i-k_j\|_2^2/(2\sigma^2)}$ for some bandwidth parameter $\sigma>0$. Given a vector $x\in \mathbb{R}^n$ and error parameter $\varepsilon>0$, our task is to output a $y\in \mathbb{R}^n$ such that $\|Kx-y\|_2\leq \varepsilon\|x\|_2$ in time subquadratic in $n$ and linear in $d$. Our algorithms rely on the following modelling assumption about the matrices $K$: the sum of the entries of $K$ scales linearly in $n$, as opposed to worst case quadratic growth. We validate this assumption experimentally, for Gaussian kernel matrices encountered in various settings such as fast attention computation in LLMs. We obtain the first subquadratic-time algorithm that works under this assumption, for unrestricted vectors.
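
For orientation, the quantity defined in the abstract can be computed directly in quadratic time. The sketch below is only that naive baseline, not the subquadratic algorithm of the paper; the function names `gaussian_kernel_matvec` and `kernel_entry_sum` are hypothetical and chosen for illustration. The second function evaluates the sum of the entries of $K$, the quantity the modelling assumption says should grow linearly in $n$.

```python
import math


def gaussian_kernel_matvec(queries, keys, x, sigma):
    """Naive O(n^2 * d) baseline (NOT the paper's algorithm).

    Returns y with y_i = sum_j exp(-||q_i - k_j||^2 / (2 * sigma^2)) * x_j.
    """
    y = []
    for q in queries:
        acc = 0.0
        for k, x_j in zip(keys, x):
            sq_dist = sum((a - b) ** 2 for a, b in zip(q, k))
            acc += math.exp(-sq_dist / (2.0 * sigma ** 2)) * x_j
        y.append(acc)
    return y


def kernel_entry_sum(queries, keys, sigma):
    """Sum of all entries of K, obtained by applying K to the all-ones vector."""
    ones = [1.0] * len(keys)
    return sum(gaussian_kernel_matvec(queries, keys, ones, sigma))


# Tiny usage example with n = 2 points in d = 1 dimension.
queries = [[0.0], [1.0]]
keys = [[0.0], [1.0]]
y = gaussian_kernel_matvec(queries, keys, [1.0, 0.0], sigma=1.0)
total = kernel_entry_sum(queries, keys, sigma=1.0)
```

An approximate output $y$ is acceptable under the abstract's criterion whenever its Euclidean distance from this exact product is at most $\varepsilon\|x\|_2$.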

1.0 | ML | Nov 30, 2017

Improved Linear Embeddings via Lagrange Duality

Kshiteej Sheth, Dinesh Garg, Anirban Dasgupta

Near-isometric orthogonal embeddings to lower dimensions are a fundamental tool in data science and machine learning. In this paper, we present the construction of such embeddings that minimize the maximum distortion for a given set of points. We formulate the problem as a nonconvex constrained optimization problem. We first construct a primal relaxation and then use the theory of Lagrange duality to create a dual relaxation. We also suggest a polynomial-time algorithm based on the theory of convex optimization to solve the dual relaxation provably. We provide a theoretical upper bound on the approximation guarantees for our algorithm, which depends only on the spectral properties of the dataset. We experimentally demonstrate the superiority of our algorithm compared to baselines in terms of scalability and the ability to achieve lower distortion.
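
The abstract does not spell out its distortion measure. As a rough illustration only, the sketch below assumes one common convention: the distortion of a linear map $P$ on a pair of points is $\left|\,\|P(u-v)\|_2^2/\|u-v\|_2^2 - 1\,\right|$, and the objective is the maximum of this over all pairs. The helper names `apply_linear_map` and `max_distortion` are hypothetical, and the code only evaluates a given map; it does not perform the optimization described in the paper.

```python
from itertools import combinations


def apply_linear_map(P, v):
    """Multiply the matrix P (a list of rows) by the vector v."""
    return [sum(p * a for p, a in zip(row, v)) for row in P]


def max_distortion(P, points):
    """Worst-case pairwise distortion of P under an ASSUMED definition:
    | ||P(u - v)||^2 / ||u - v||^2 - 1 |, maximized over all point pairs.
    """
    worst = 0.0
    for u, v in combinations(points, 2):
        diff = [a - b for a, b in zip(u, v)]
        orig_sq = sum(c * c for c in diff)
        if orig_sq == 0.0:
            continue  # identical points carry no distance information
        emb_sq = sum(c * c for c in apply_linear_map(P, diff))
        worst = max(worst, abs(emb_sq / orig_sq - 1.0))
    return worst


# Usage: project three points in R^3 onto the first two coordinates.
# The rows of P are orthonormal, so P is an orthogonal embedding.
P = [[1.0, 0.0, 0.0],
     [0.0, 1.0, 0.0]]
points = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [1.0, 1.0, 1.0]]
d = max_distortion(P, points)
```

Under this assumed definition, a value of 0 would mean every pairwise distance is preserved exactly, and the problem described in the abstract amounts to choosing $P$ to make this value as small as possible.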

2.1 | CV | Sep 4, 2016

Deep Neural Networks for HDR imaging

Kshiteej Sheth

We propose novel methods for solving two tasks using Convolutional Neural Networks: first, generating an HDR map of a static scene from differently exposed LDR images of the scene captured with conventional cameras; and second, finding an optimal tone mapping operator that achieves a better score on the TMQI metric than existing methods. We quantitatively evaluate the performance of our networks and illustrate the cases where they perform well and where they perform poorly.