95.5 IT Apr 16
Reed--Muller Codes Achieve the Symmetric Capacity on Finite-State Channels
Henry D. Pfister, Navin Kashyap, Jean-Francois Chamberland et al.
We study reliable communication over finite-state channels (FSCs) using Reed--Muller (RM) codes. Building on recent symmetry-based analyses for memoryless channels, we show that a sequence of binary RM codes (with some random scrambling) can achieve the symmetric capacity (or uniform-input information rate) of a binary-input indecomposable FSC. Our approach has three components. First, we establish a capacity-via-symmetry theorem for doubly-transitive group codes on discrete memoryless channels (DMCs) with non-binary inputs, under some symmetry and puncturing conditions. Then, we reduce a binary-input FSC to an almost memoryless non-binary channel by grouping adjacent input bits into blocks and interleaving non-binary codes onto the channel. Finally, we show that the interleaved non-binary codes can be constructed from a single binary RM code.
IT Sep 6, 2023
Data-Driven Neural Polar Codes for Unknown Channels With and Without Memory
Ziv Aharoni, Bashar Huleihel, Henry D. Pfister et al.
In this work, a novel data-driven methodology for designing polar codes for channels with and without memory is proposed. The methodology is suitable for the case where the channel is given as a "black-box" and the designer has access to the channel for generating observations of its inputs and outputs, but does not have access to the explicit channel model. The proposed method leverages the structure of the successive cancellation (SC) decoder to devise a neural SC (NSC) decoder. The NSC decoder uses neural networks (NNs) to replace the core elements of the original SC decoder: the check-node, the bit-node, and the soft decision. Along with the NSC, we devise an additional NN that embeds the channel outputs into the input space of the SC decoder. The proposed method is supported by theoretical guarantees that include the consistency of the NSC. Also, the NSC has computational complexity that does not grow with the channel memory size. This is its main advantage over the successive cancellation trellis (SCT) decoder for finite-state channels (FSCs), which has complexity $O(|\mathcal{S}|^3 N\log N)$, where $|\mathcal{S}|$ denotes the number of channel states. We demonstrate the performance of the proposed algorithms on memoryless channels and on channels with memory. The empirical results are compared with the optimal polar decoder, given by the SC and SCT decoders. We further show that our algorithms are applicable in cases where the SC and SCT decoders are not.
11.1 QUANT-PH May 22
Towards Scalable Quaternary Message-Passing Decoding for Quantum Error Correction
Boqing Zhang, Henry D. Pfister, Hanwen Yao et al.
The scalability and interpretability of message-passing (MP) decoding, such as (quaternary) Belief Propagation, remain open challenges in quantum error correction. Even for surface codes, arguably the first testbed for decoding methods, studies of improved MP decoders have mostly been restricted to small distances ($d \lesssim 19$). Moreover, the mismatch with established message-passing theory limits the decoder's interpretability, making it unclear whether MP decoding can sustain its effectiveness at large system sizes. This work takes a step toward a more principled and interpretable MP decoding framework, with the goal of making MP-based decoding more reliable and bridging theory and practice. We introduce a dilution method, which allows a quaternary Min-Sum (MS) decoder to exhibit an apparent depolarizing threshold of $16\%$ up to distance $20$, outperforming Minimum-Weight Perfect Matching in finite-length regimes. Notably, for $X$-noise, the standard MS decoder under dilution has worst-case complexity $O(N \log^2 d)$ and outperforms BP-OSD at $d=65$. The observed $\sim 9\%$ threshold may correspond to a true asymptotic threshold. Finally, we give a graph-dilution argument that interprets the success of the dilution method and offers insight into when MP algorithms can genuinely scale. Taken together, these results provide encouraging progress toward scalable and interpretable MP decoding in quantum error correction.
76.2 QUANT-PH Apr 14
Quantum Message Passing for Factor Graphs over Finite Abelian Groups
Avijit Mandal, Henry D. Pfister
We develop a quantum message-passing framework for factor graphs over finite abelian groups. Our starting point is the task of discriminating between a collection of quantum states indexed by the elements of a finite abelian group $\mathcal{G}$ whose overlaps respect the structure of a group-covariant pure-state channel (PSC). For such channels, we show that the Gram matrix constructed from the output states is diagonalized by the character basis of the dual group $\widehat{\mathcal{G}}$. Hence, the channel is characterized, up to isometric equivalence, by its character-indexed eigen list. Based on this representation, we analyze the induced classical-quantum channels associated with check, equality, homomorphism, marginalization, and automorphism factors. For each factor, we derive explicit update rules showing that if the incoming messages are heralded mixtures of group-covariant PSCs, then the outgoing message remains in the same class. This provides a closed quantum message-passing framework for tree-structured factor graphs assembled from these primitives. The framework applies directly to several standard code families over finite abelian groups, including polar codes, LDPC codes, and convolutional and turbo codes. It recovers the previously studied $q$-ary formulation as the special case $(\mathcal{G}=\mathbb{Z}_q)$, while extending the belief propagation with quantum messages (BPQM) framework introduced by Renes to non-cyclic alphabets and more general factor-graph constraints described by homomorphisms between products of abelian groups.
ML Feb 4, 2025
Information-Theoretic Proofs for Diffusion Sampling
Galen Reeves, Henry D. Pfister
This paper provides an elementary, self-contained analysis of diffusion-based sampling methods for generative modeling. In contrast to existing approaches that rely on continuous-time processes and then discretize, our treatment works directly with discrete-time stochastic processes and yields precise non-asymptotic convergence guarantees under broad assumptions. The key insight is to couple the sampling process of interest with an idealized comparison process that has an explicit Gaussian-convolution structure. We then leverage simple identities from information theory, including the I-MMSE relationship, to bound the discrepancy (in terms of the Kullback-Leibler divergence) between these two discrete-time processes. In particular, we show that, if the diffusion step sizes are chosen sufficiently small and one can approximate certain conditional mean estimators well, then the sampling distribution is provably close to the target distribution. Our results also provide a transparent view on how to accelerate convergence by using additional randomness in each step to match higher-order moments in the comparison process.
SP Oct 3, 2025
A Study of Neural Polar Decoders for Communication
Rom Hirsch, Ziv Aharoni, Henry D. Pfister et al.
In this paper, we adapt and analyze Neural Polar Decoders (NPDs) for end-to-end communication systems. While prior work demonstrated the effectiveness of NPDs on synthetic channels, this study extends the NPD to real-world communication systems. The NPD was adapted to complete OFDM and single-carrier communication systems. To satisfy practical system requirements, the NPD is extended to support any code length via rate matching, higher-order modulations, and robustness across diverse channel conditions. The NPD operates directly on channels with memory, exploiting their structure to achieve higher data rates without requiring pilots and a cyclic prefix. Although NPD entails higher computational complexity than the standard 5G polar decoder, its neural network architecture enables an efficient representation of channel statistics, resulting in manageable complexity suitable for practical systems. Experimental results over 5G channels demonstrate that the NPD consistently outperforms the 5G polar decoder in terms of BER, BLER, and throughput. These improvements are particularly significant for low-rate and short-block configurations, which are prevalent in 5G control channels. Furthermore, NPDs applied to single-carrier systems offer performance comparable to OFDM with lower PAPR, enabling effective single-carrier transmission over 5G channels. These results position the NPD as a high-performance, pilotless, and robust decoding solution.
IT Jul 16, 2025
Neural Polar Decoders for Deletion Channels
Ziv Aharoni, Henry D. Pfister
This paper introduces a neural polar decoder (NPD) for deletion channels with a constant deletion rate. Existing polar decoders for deletion channels exhibit high computational complexity of $O(N^4)$, where $N$ is the block length. This limits the application of polar codes for deletion channels to short-to-moderate block lengths. In this work, we demonstrate that employing NPDs for deletion channels can reduce the computational complexity. First, we extend the architecture of the NPD to support deletion channels. Specifically, the NPD architecture consists of four neural networks (NNs), each replicating fundamental successive cancellation (SC) decoder operations. To support deletion channels, we change the architecture of only one of them. The computational complexity of the NPD is $O(AN\log N)$, where the parameter $A$ represents a computational budget determined by the user and is independent of the channel. We evaluate the extended NPD for deletion channels with deletion rates $\delta\in\{0.01, 0.1\}$ and verify it against the ground truth given by the trellis decoder by Tal et al. We further show that, due to the reduced complexity of the NPD, we are able to incorporate list decoding and further improve performance. We believe that the extended NPD presented here could have applications in future technologies like DNA storage.
IT Jun 20, 2025
Neural Polar Decoders for DNA Data Storage
Ziv Aharoni, Henry D. Pfister
Synchronization errors, such as insertions and deletions, present a fundamental challenge in DNA-based data storage systems, arising from both synthesis and sequencing noise. These channels are often modeled as insertion-deletion-substitution (IDS) channels, for which designing maximum-likelihood decoders is computationally expensive. In this work, we propose a data-driven approach based on neural polar decoders (NPDs) to design low-complexity decoders for channels with synchronization errors. The proposed architecture enables decoding over IDS channels with reduced complexity $O(AN\log N)$, where $A$ is a tunable parameter independent of the channel. NPDs require only sample access to the channel and can be trained without an explicit channel model. Additionally, NPDs provide mutual information (MI) estimates that can be used to optimize input distributions and code design. We demonstrate the effectiveness of NPDs on both synthetic deletion and IDS channels. For deletion channels, we show that NPDs achieve near-optimal decoding performance and accurate MI estimation, with significantly lower complexity than trellis-based decoders. We also provide numerical estimates of the channel capacity for the deletion channel. We extend our evaluation to realistic DNA storage settings, including channels with multiple noisy reads and real-world Nanopore sequencing data. Our results show that NPDs match or surpass the performance of existing methods while using significantly fewer parameters than the state-of-the-art. These findings highlight the promise of NPDs for robust and efficient decoding in DNA data storage systems.
SP Oct 27, 2020
Physics-Based Deep Learning for Fiber-Optic Communication Systems
Christian Häger, Henry D. Pfister
We propose a new machine-learning approach for fiber-optic communication systems whose signal propagation is governed by the nonlinear Schrödinger equation (NLSE). Our main observation is that the popular split-step method (SSM) for numerically solving the NLSE has essentially the same functional form as a deep multi-layer neural network; in both cases, one alternates linear steps and pointwise nonlinearities. We exploit this connection by parameterizing the SSM and viewing the linear steps as general linear functions, similar to the weight matrices in a neural network. The resulting physics-based machine-learning model has several advantages over "black-box" function approximators. For example, it allows us to examine and interpret the learned solutions in order to understand why they perform well. As an application, low-complexity nonlinear equalization is considered, where the task is to efficiently invert the NLSE. This is commonly referred to as digital backpropagation (DBP). Rather than employing neural networks, the proposed algorithm, dubbed learned DBP (LDBP), uses the physics-based model with trainable filters in each step and its complexity is reduced by progressively pruning filter taps during gradient descent. Our main finding is that the filters can be pruned to remarkably short lengths (as few as 3 taps/step) without sacrificing performance. As a result, the complexity can be reduced by orders of magnitude in comparison to prior work. By inspecting the filter responses, an additional theoretical justification for the learned parameter configurations is provided. Our work illustrates that combining data-driven optimization with existing domain knowledge can generate new insights into old communications problems.
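The alternation of linear steps and pointwise nonlinearities described in this abstract can be sketched in a few lines. This is only an illustrative toy, not the authors' LDBP implementation: the function names, the circular boundary handling, the sign convention of the phase rotation, and the tap values are all assumptions made for the example. Each step applies a short FIR filter (the "trainable" linear part) followed by a Kerr-type phase rotation that depends on the instantaneous power.

```python
import cmath


def fir_circular(signal, taps):
    """Linear step: circular convolution with a short, centered FIR filter."""
    n, half = len(signal), len(taps) // 2
    return [
        sum(taps[k] * signal[(i + half - k) % n] for k in range(len(taps)))
        for i in range(n)
    ]


def kerr_phase(signal, gamma):
    """Pointwise nonlinearity: phase rotation proportional to |x|^2."""
    return [x * cmath.exp(1j * gamma * abs(x) ** 2) for x in signal]


def split_step_model(signal, taps_per_step, gamma_per_step):
    """Alternate linear filtering and pointwise nonlinearity, one pair per step.

    taps_per_step and gamma_per_step play the role of the layer parameters;
    in a learned model they would be optimized by gradient descent.
    """
    for taps, gamma in zip(taps_per_step, gamma_per_step):
        signal = kerr_phase(fir_circular(signal, taps), gamma)
    return signal
```

With the identity filter `[0, 1, 0]` in every step, the model reduces to pure phase rotations, so the magnitude of each sample is unchanged; this is a convenient sanity check before experimenting with other (hypothetical) 3-tap filters.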
SP Oct 23, 2020
Model-Based Machine Learning for Joint Digital Backpropagation and PMD Compensation
Rick M. Bütler, Christian Häger, Henry D. Pfister et al.
In this paper, we propose a model-based machine-learning approach for dual-polarization systems by parameterizing the split-step Fourier method for the Manakov-PMD equation. The resulting method combines hardware-friendly time-domain nonlinearity mitigation via the recently proposed learned digital backpropagation (LDBP) with distributed compensation of polarization-mode dispersion (PMD). We refer to the resulting approach as LDBP-PMD. We train LDBP-PMD on multiple PMD realizations and show that it converges within 1% of its peak dB performance after 428 training iterations on average, yielding a peak effective signal-to-noise ratio of only 0.30 dB below the PMD-free case. Similar to state-of-the-art lumped PMD compensation algorithms in practical systems, our approach does not assume any knowledge about the particular PMD realization along the link, nor any knowledge about the total accumulated PMD. This is a significant improvement compared to prior work on distributed PMD compensation, where knowledge about the accumulated PMD is typically assumed. We also compare different parameterization choices in terms of performance, complexity, and convergence behavior. Lastly, we demonstrate that the learned models can be successfully retrained after an abrupt change of the PMD realization along the fiber.
SP Jan 25, 2020
Model-Based Machine Learning for Joint Digital Backpropagation and PMD Compensation
Christian Häger, Henry D. Pfister, Rick M. Bütler et al.
We propose a model-based machine-learning approach for polarization-multiplexed systems by parameterizing the split-step method for the Manakov-PMD equation. This approach performs hardware-friendly DBP and distributed PMD compensation with performance close to the PMD-free case.
IT Jan 21, 2020
Pruning Neural Belief Propagation Decoders
Andreas Buchberger, Christian Häger, Henry D. Pfister et al.
We consider near maximum-likelihood (ML) decoding of short linear block codes based on neural belief propagation (BP) decoding recently introduced by Nachmani et al. While this method significantly outperforms conventional BP decoding, the underlying parity-check matrix may still limit the overall performance. In this paper, we introduce a method to tailor an overcomplete parity-check matrix to (neural) BP decoding using machine learning. We consider the weights in the Tanner graph as an indication of the importance of the connected check nodes (CNs) to decoding and use them to prune unimportant CNs. As the pruning is not tied over iterations, the final decoder uses a different parity-check matrix in each iteration. For Reed-Muller and short low-density parity-check codes, we achieve performance within 0.27 dB and 1.5 dB of the ML performance while reducing the complexity of the decoder.
IT Jun 11, 2019
Reinforcement Learning for Channel Coding: Learned Bit-Flipping Decoding
Fabrizio Carpi, Christian Häger, Marco Martalò et al.
In this paper, we use reinforcement learning to find effective decoding strategies for binary linear codes. We start by reviewing several iterative decoding algorithms that involve a decision-making process at each step, including bit-flipping (BF) decoding, residual belief propagation, and anchor decoding. We then illustrate how such algorithms can be mapped to Markov decision processes allowing for data-driven learning of optimal decision strategies, rather than basing decisions on heuristics or intuition. As a case study, we consider BF decoding for both the binary symmetric and additive white Gaussian noise channel. Our results show that learned BF decoders can offer a range of performance-complexity trade-offs for the considered Reed-Muller and BCH codes, and achieve near-optimal performance in some cases. We also demonstrate learning convergence speed-ups when biasing the learning process towards correct decoding decisions, as opposed to relying only on random explorations and past knowledge.
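To make the per-step decision in bit-flipping decoding concrete, here is a minimal sketch of a conventional, heuristic BF decoder for the binary symmetric channel. It is not the learned decoder from the paper: the (7,4) Hamming parity-check matrix, the function names, and the rule "flip the bit involved in the most unsatisfied checks, lowest index on ties" are assumptions chosen for illustration. The flip rule is exactly the kind of hand-designed decision that the abstract proposes to replace with a learned policy.

```python
# Parity-check matrix of the (7,4) Hamming code (an illustrative choice;
# the paper considers Reed-Muller and BCH codes).
H = [
    [1, 0, 1, 0, 1, 0, 1],
    [0, 1, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 1],
]


def syndrome(H, word):
    """Return the list of parity-check results (1 = unsatisfied check)."""
    return [sum(h * w for h, w in zip(row, word)) % 2 for row in H]


def bit_flip_decode(H, word, max_iters=10):
    """Greedy BF decoding: flip one bit per iteration until all checks pass."""
    word = list(word)
    for _ in range(max_iters):
        s = syndrome(H, word)
        if not any(s):
            break  # all parity checks satisfied
        # For each bit, count the unsatisfied checks it participates in.
        votes = [
            sum(s[r] for r in range(len(H)) if H[r][i])
            for i in range(len(word))
        ]
        # Heuristic decision: flip the bit with the most votes
        # (ties are broken by the lowest index).
        word[votes.index(max(votes))] ^= 1
    return word
```

For example, `bit_flip_decode(H, [1, 1, 1, 0, 0, 0, 1])` flips the last bit, which is the only one involved in all three unsatisfied checks. For other error positions the vote can be tied, and the outcome then depends on the tie-breaking rule.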
SP Apr 22, 2019
Revisiting Multi-Step Nonlinearity Compensation with Machine Learning
Christian Häger, Henry D. Pfister, Rick M. Bütler et al.
For the efficient compensation of fiber nonlinearity, one of the guiding principles appears to be: fewer steps are better and more efficient. We challenge this assumption and show that carefully designed multi-step approaches can lead to better performance-complexity trade-offs than their few-step counterparts.
IT Jan 24, 2019
Learned Belief-Propagation Decoding with Simple Scaling and SNR Adaptation
Mengke Lian, Fabrizio Carpi, Christian Häger et al.
We consider the weighted belief-propagation (WBP) decoder recently proposed by Nachmani et al., where different weights are introduced for each Tanner graph edge and optimized using machine learning techniques. Our focus is on simple-scaling models that use the same weights across certain edges to reduce the storage and computational burden. The main contribution is to show that simple scaling with few parameters often achieves the same gain as the full parameterization. Moreover, several training improvements for WBP are proposed. For example, it is shown that minimizing average binary cross-entropy is suboptimal in general in terms of bit error rate (BER) and a new "soft-BER" loss is proposed which can lead to better performance. We also investigate parameter adapter networks (PANs) that learn the relation between the signal-to-noise ratio and the WBP parameters. As an example, for the (32,16) Reed-Muller code with a highly redundant parity-check matrix, training a PAN with soft-BER loss gives near-maximum-likelihood performance assuming simple scaling with only three parameters.
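The contrast between binary cross-entropy and a "soft-BER" style loss can be illustrated with a small sketch. The abstract does not give the exact definition, so the following is an assumed reading: the decoder outputs log-likelihood ratios $\ell = \log\frac{P(b=0)}{P(b=1)}$, the soft error probability of a bit is the sigmoid of the LLR with the "wrong" sign, the soft-BER loss averages that probability directly, and cross-entropy averages its negative log-complement. Function names are hypothetical.

```python
import math


def _signed(llr, bit):
    """LLR oriented so that positive values favor the transmitted bit."""
    return llr if bit == 0 else -llr


def _sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def soft_ber_loss(llrs, bits):
    """Average soft probability of a bit error (bounded between 0 and 1)."""
    return sum(_sigmoid(-_signed(l, b)) for l, b in zip(llrs, bits)) / len(bits)


def bce_loss(llrs, bits):
    """Average binary cross-entropy, computed as softplus(-s) for stability."""
    total = 0.0
    for l, b in zip(llrs, bits):
        s = _signed(l, b)
        total += max(-s, 0.0) + math.log1p(math.exp(-abs(s)))
    return total / len(bits)
```

Under this reading, a single confidently wrong bit contributes at most 1 to the soft-BER sum, whereas its cross-entropy contribution grows linearly with the LLR magnitude. That difference is one way to see why the two objectives can lead to different trained decoders.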
IT Jan 22, 2019
What Can Machine Learning Teach Us about Communications?
Mengke Lian, Christian Häger, Henry D. Pfister
Rapid improvements in machine learning over the past decade are beginning to have far-reaching effects. For communications, engineers with limited domain expertise can now use off-the-shelf learning packages to design high-performance systems based on simulations. Prior to the current revolution in machine learning, the majority of communication engineers were quite aware that system parameters (such as filter coefficients) could be learned using stochastic gradient descent. It was not at all clear, however, that more complicated parts of the system architecture could be learned as well. In this paper, we discuss the application of machine-learning techniques to two communications problems and focus on what can be learned from the resulting systems. We were pleasantly surprised that the observed gains in one example have a simple explanation that only became clear in hindsight. In essence, deep learning discovered a simple and effective strategy that had not been considered earlier.
IT Jul 4, 2018
Wideband Time-Domain Digital Backpropagation via Subband Processing and Deep Learning
Christian Häger, Henry D. Pfister
We propose a low-complexity sub-banded DSP architecture for digital backpropagation where the walk-off effect is compensated using simple delay elements. For a simulated 96-Gbaud signal and 2500 km optical link, our method achieves a 2.8 dB SNR improvement over linear equalization.
IT Jun 19, 2018
ASIC Implementation of Time-Domain Digital Backpropagation with Deep-Learned Chromatic Dispersion Filters
Christoffer Fougstedt, Christian Häger, Lars Svensson et al.
We consider time-domain digital backpropagation with chromatic dispersion filters jointly optimized and quantized using machine-learning techniques. Compared to the baseline implementations, we show improved BER performance and >40% power dissipation reductions in 28-nm CMOS.
IT Apr 9, 2018
Deep Learning of the Nonlinear Schrödinger Equation in Fiber-Optic Communications
Christian Häger, Henry D. Pfister
An important problem in fiber-optic communications is to invert the nonlinear Schrödinger equation in real time to reverse the deterministic effects of the channel. Interestingly, the popular split-step Fourier method (SSFM) leads to a computation graph that is reminiscent of a deep neural network. This observation allows one to leverage tools from machine learning to reduce complexity. In particular, the main disadvantage of the SSFM is that its complexity using M steps is at least M times larger than a linear equalizer. This is because the linear SSFM operator is a dense matrix. In previous work, truncation methods such as frequency sampling, wavelets, or least-squares have been used to obtain "cheaper" operators that can be implemented using filters. However, a large number of filter taps are typically required to limit truncation errors. For example, Ip and Kahn showed that for a 10 Gbaud signal and 2000 km optical link, a truncated SSFM with 25 steps would require 70-tap filters in each step and 100 times more operations than linear equalization. We find that, by jointly optimizing all filters with deep learning, the complexity can be reduced significantly for similar accuracy. Using optimized 5-tap and 3-tap filters in an alternating fashion, one requires only around 2-6 times the complexity of linear equalization, depending on the implementation.
IT Oct 17, 2017
Nonlinear Interference Mitigation via Deep Neural Networks
Christian Häger, Henry D. Pfister
A neural-network-based approach is presented to efficiently implement digital backpropagation (DBP). For a 32x100 km fiber-optic link, the resulting "learned" DBP significantly reduces the complexity compared to conventional DBP implementations.