Ryan Jeong

32.3ITApr 8

The Random Subsequence Model and Uniform Codes for the Deletion Channel

Ryan Jeong, Francisco Pernice

We introduce the Random Subsequence Model, a spin glass model on pairs of random strings $(X,Y) \in \{0,1\}^N \times \{0,1\}^M$ whose partition function counts subsequence embeddings of $Y$ into $X$. We study two variants: the null model, where $X$ and $Y$ are independent and uniform, and the planted model, where $X$ is uniform and $Y$ is a uniformly-random length-$M$ subsequence of $X$. We connect the Random Subsequence Model to longstanding problems in various fields, including the best rate achievable by uniformly-random codes in the deletion channel, the longest common subsequence problem between two random strings, and models of directed polymers in statistical physics. In the regime where $N,M\to\infty$ at a fixed ratio $Î±= M/N \in (0,1)$, we exhibit strict asymptotic separations between the null annealed free energy and the quenched free energies of the null and planted models at all values of the density parameter $Î±$. This suggests that these models are in a spin glass phase at zero temperature throughout the entire dense regime. As a consequence, we show that uniformly-random codes achieve a positive rate in the deletion channel for all deletion probabilities $p\in [0,1),$ settling multiple conjectures of the second author, Isik and Weissman (2024) and proving the first such positive rate result for the regime $p \geq 1/2$. We also give an exact analytic formula for the annealed free energy of the planted model for all values of the density parameter. This implies a corresponding analytic upper bound on the best rate achievable by uniformly-random codes in the deletion channel, complementing the lower bound from our first result. Our upper and lower bounds for the capacity of the deletion channel under uniform codes are far closer to each other than the best known upper and lower bounds for the capacity of the deletion channel.

MLFeb 21, 2021

Deep ReLU Networks Preserve Expected Length

Boris Hanin, Ryan Jeong, David Rolnick

Assessing the complexity of functions computed by a neural network helps us understand how the network will learn and generalize. One natural measure of complexity is how the network distorts length - if the network takes a unit-length curve as input, what is the length of the resulting curve of outputs? It has been widely believed that this length grows exponentially in network depth. We prove that in fact this is not the case: the expected length distortion does not grow with depth, and indeed shrinks slightly, for ReLU networks with standard random initialization. We also generalize this result by proving upper bounds both for higher moments of the length distortion and for the distortion of higher-dimensional volumes. These theoretical results are corroborated by our experiments.

Ryan Jeong

2 Papers