Donald Kougang-Yombi

h-index5
2papers

2 Papers

1.0ITApr 3
Weight distribution bounds to relate minimum distance, list decoding, and symmetric channel performance

Donald Kougang-Yombi, Jan Hązła

We study relationships between worst-case and random-noise properties of error correcting codes. More concretely, we consider connections between minimum distance, list decoding radius, and block error probability on noisy channels. A recent result of Pernice, Sprumont, and Wootters established the tight connection between list decoding radius and symmetric channel performance for linear codes. We extend this result to general codes. The proof proceeds by directly bounding the weight distribution rather than by sharp threshold techniques. We then turn to the relation between minimum distance and symmetric channel performance. The results we just described imply that a $q$-ary code of relative distance $δ$ has vanishing error probability on the symmetric channel up to the Johnson radius $J_q(δ)$. We improve upon this bound in the case of linear codes, for a range $δ$, for $q\ge 4$. In our proof we consider the \emph{erasure} properties of codes, and bound their weight distribution through inequalities introduced by Samorodnitsky. The latter part of the proof gives a more general technique that bounds the symmetric channel performance of a linear code with constant relative distance and good erasure channel performance.

LGDec 6, 2024
Learning High-Degree Parities: The Crucial Role of the Initialization

Emmanuel Abbe, Elisabetta Cornacchia, Jan Hązła et al.

Parities have become a standard benchmark for evaluating learning algorithms. Recent works show that regular neural networks trained by gradient descent can efficiently learn degree $k$ parities on uniform inputs for constant $k$, but fail to do so when $k$ and $d-k$ grow with $d$ (here $d$ is the ambient dimension). However, the case where $k=d-O_d(1)$ (almost-full parities), including the degree $d$ parity (the full parity), has remained unsettled. This paper shows that for gradient descent on regular neural networks, learnability depends on the initial weight distribution. On one hand, the discrete Rademacher initialization enables efficient learning of almost-full parities, while on the other hand, its Gaussian perturbation with large enough constant standard deviation $σ$ prevents it. The positive result for almost-full parities is shown to hold up to $σ=O(d^{-1})$, pointing to questions about a sharper threshold phenomenon. Unlike statistical query (SQ) learning, where a singleton function class like the full parity is trivially learnable, our negative result applies to a fixed function and relies on an initial gradient alignment measure of potential broader relevance to neural networks learning.