Songtao Mao

2papers

2 Papers

45.8CRMay 19
High-Rate Public-Key Pseudorandom Codes for Edit Errors

Shengtang Huang, Xin Li, Songtao Mao et al.

Pseudorandom codes (PRCs), introduced by Christ and Gunn (CRYPTO '2024), are error-correcting codes whose codewords are computationally indistinguishable from uniformly random strings, while still being decodable by someone holding the key. They provide a natural primitive for robust and undetectable watermarking, particularly in applications to AI-generated content. Although recent works have obtained strong results for substitution errors, the edit-error setting remains much less understood, especially in the high-rate regime and over small alphabets. We study public-key pseudorandom codes against edit errors. First, we give a new reduction showing that binary zero-bit PRCs robust against a constant fraction of substitution errors can be transformed into binary zero-bit PRCs robust against edit errors. Consequently, under any assumption that yields zero-bit Hamming-robust PRCs, one also obtains zero-bit PRCs for edit channels, albeit only for the weaker class of sublinear polynomial edit channels, namely channels with edit error rate $1/n^γ$ for any constant $γ>0$. In the high-rate regime, we construct public-key PRCs with rate arbitrarily close to $1$ over sufficiently large constant alphabets, and with rate arbitrarily close to $1/2$ over the binary alphabet. Moreover, if we allow the alphabet size to be $\mathrm{poly}(λ)$, where $λ$ is the security parameter, then our public-key PRCs can attain the Singleton bound for insertion-deletion channels. Taken together, these results yield the first high-rate public-key binary PRC constructions for edit channels, under the same assumption that yields zero-bit Hamming-robust PRCs.

23.7CCApr 12
Near Optimal Algorithms for Noisy $k$-XOR under Low-Degree Heuristic

Songtao Mao

Noisy $k$-XOR is a basic average-case inference problem in which one observes random noisy $k$-ary parity constraints and seeks to recover, or more weakly, detect, a hidden Boolean assignment. A central question is to characterize the tradeoff among sample complexity, noise level, and running time. We give a recovery algorithm, and hence also a detection algorithm, for noisy $k$-XOR in the high-noise regime. For every parameter $D$, our algorithm runs in time $n^{D+O(1)}$ and succeeds whenever $$ m \ge C_k \frac{n^{k/2}}{D^{\,k/2-1}δ^2}, $$ where $C_k$ is an explicit constant depending only on $k$, and $δ$ is the noise bias. Our result matches the best previously known time--sample tradeoff for detection, while simultaneously yielding recovery guarantees. In addition, the dependence on the noise bias $δ$ is optimal up to constant factors, matching the information-theoretic scaling. We also prove matching low-degree lower bounds. In particular, we show that the degree-$D$ low-degree likelihood ratio has bounded $L^2$-norm below the same threshold, up to the same factor $D^{k/2-1}$. Under the low-degree heuristic, this implies that our algorithm is near-optimal over a broad range of parameters. Our approach combines a refined second-moment analysis with color coding and dynamic programming for structured hypergraph embedding statistics. These techniques may be of independent interest for other average-case inference problems.