ITDSITMay 7

An Additive Approximation Scheme for Generating Dyadic Codings for the Outputs of an LLM

arXiv:2605.0583710.2h-index: 16
Predicted impact top 95% in IT · last 90 daysOriginality Incremental advance
AI Analysis

For researchers in information theory and LLM security, this provides a principled framework for embedding hidden information with provable statistical undetectability, though the problem is niche and the approach is incremental.

The paper develops a polynomial-time additive approximation scheme for generating dyadic codings of discrete distributions (e.g., LLM next-token distributions) under rate constraints, minimizing total variation distance. The method provides provable near-optimal guarantees and is applied to LLM steganography, where rate corresponds to hidden bits per token and total variation bounds detectability.

We study the problem of approximating a discrete probability distribution, such as the next-token distribution of a large language model, by a dyadic distribution induced by a binary tree under encoding rate constraints. The objective is to partition the support of the distribution and assign dyadic probabilities to minimize total variation distance while achieving a prescribed rate. We formulate this task as a tree-based partitioning problem and develop a polynomial-time additive approximation scheme for the rate-constrained setting in the constant-rate regime. Our results provide provable guarantees for near-optimal dyadic approximations and, as an application, yield a principled framework for LLM-based steganography, where the rate maps to bits of hidden information embedded per token and the total variation bound controls statistical detectability.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes