Back to Explore
cs.DSComputer Science

Data Structures & Algorithms

Algorithm design, complexity, data structures

40.2CLMay 29Code
Incremental BPE Tokenization

Shenghu Jiang, Ruihao Gong

This work addresses the problem of inefficient tokenization in streaming settings for large language model pipelines, offering practical latency benefits for developers and users.

31.7LGMay 11
Mistake-Bounded Language Generation

Jon Kleinberg, Charlotte Peale, Omer Reingold

This work addresses the problem of cumulative errors in language generation for learning theorists, offering a formal framework and trade-off analysis that advances the theoretical understanding of mistake-bounded generation.

29.5DSMay 13
Non-Redundancy of Low-Arity Symmetric Boolean CSPs

Amatya Sharma, Santhoshini Velusamy

For CSP theorists, this provides a near-complete classification of a structural parameter governing kernelization and sparsification for a natural class of constraints.

28.4DSMar 24
Algorithmic warm starts for Hamiltonian Monte Carlo

Matthew S. Zhang, Jason M. Altschuler, Sinho Chewi

This resolves the computational bottleneck of finding warm starts for HMC, which is crucial for practitioners in statistics, engineering, and sciences who rely on HMC for high-dimensional sampling, though it is incremental as it builds on prior theoretical work.

26.3DSMar 29
An Optimal Algorithm for Stochastic Vertex Cover

Jan van den Brand, Inge Li Gørtz, Chirag Pabbaraju et al.

This solves the central open question for stochastic vertex cover, providing an optimal algorithm that matches the lower bound for any constant-factor approximation.

24.5PRMar 24
The Localization Method for High-Dimensional Inequalities

Yunbum Kook, Santosh S. Vempala

This is an incremental survey that reviews an existing method with broad applications in areas like isoperimetric inequalities, optimization, and Markov chains, but does not introduce new results.

24.4DSApr 2
Single-Pass Streaming CSPs via Two-Tier Sampling

Amir Azarmehr, Soheil Behnezhad, Shane Ferrante

This resolves a key open problem in streaming algorithms for constraint satisfaction, with implications for applications like Max-DiCut.

24.4DSMay 7
Sampling Tree-Weighted Partitions Without Sampling Trees

Sarah Cannon, Topher Pankow, Wesley Pegden et al.

For computational redistricting and other applications requiring balanced partitions, this algorithm removes a major computational bottleneck, enabling faster sampling on large planar graphs.

24.0DSJun 3
A General Framework for Dynamic Consistent Submodular Maximization

Paul Dütting, Federico Fusco, Silvio Lattanzi et al.

This work addresses the problem of maintaining near-optimal solutions in fully dynamic submodular maximization, providing the first algorithms with provable guarantees for consistency in the presence of deletions.

23.8DSApr 29
Online Monotone Metric Embeddings

Christian Coester, Yichen Huang

This work provides a new framework for online metric embeddings that overcomes fundamental lower bounds, benefiting algorithm designers who rely on HST embeddings for online problems.

23.7CCApr 2
Linear Space Streaming Lower Bounds for Approximating CSPs

Chi-Ning Chou, Alexander Golovnev, Madhu Sudan et al.

This work provides foundational streaming lower bounds for CSPs, extending prior results to general parameters and addressing a gap in linear space bounds for approximation factors less than 1/2.

23.2LGMay 7
Contrastive Identification and Generation in the Limit

Xiaoyu Li, Andi Han, Jiaojiao Jiang et al.

This work provides foundational theoretical results for learning from relational (contrastive) supervision in the limit, which is relevant to computational learning theory and may impact practical contrastive learning methods.

22.3DSMay 23
Covering vertices by sequential stars

Mengyuan Hu, An Zhang, Yong Chen et al.

For researchers in graph algorithms and approximation, this work offers improved theoretical guarantees and hardness results for a natural vertex-covering problem with interval constraints.