Cross-Attention Transformer for Joint Multi-Receiver Uplink Neural Decoding
This work addresses the challenge of reliable multi-receiver decoding in Wi-Fi systems and offers a practical solution with low computational cost, though the contribution is incremental: it builds on existing Transformer and neural decoding methods.
The paper tackles joint decoding of uplink OFDM signals received by multiple access points. It proposes a cross-attention Transformer that fuses the receivers' data without explicit channel estimates and, in realistic Wi-Fi channels, frequently matches (and in some cases surpasses) a baseline that assumes perfect channel knowledge.
We propose a cross-attention Transformer for joint decoding of uplink OFDM signals received by multiple coordinated access points. A shared per-receiver encoder learns time-frequency structure within each received grid, and a token-wise cross-attention module fuses the receivers to produce soft log-likelihood ratios for a standard channel decoder, without requiring explicit per-receiver channel estimates. Trained with a bit-metric objective, the model adapts its fusion to per-receiver reliability, tolerates missing or degraded links, and remains robust when pilots are sparse. Across realistic Wi-Fi channels, it consistently outperforms classical pipelines and strong convolutional baselines, frequently matching (and in some cases surpassing) a powerful baseline that assumes perfect channel knowledge per access point. Despite its expressiveness, the architecture is compact, has low computational cost (low GFLOPs), and achieves low latency on GPUs, making it a practical building block for next-generation Wi-Fi receivers.
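To make the fusion step concrete, the following is a minimal sketch of token-wise cross-attention over receivers, followed by a bit-metric loss on the resulting LLRs. It is an illustration under stated assumptions, not the architecture from the paper. The function names (`fuse_token`, `llr_head`, `bit_metric_loss`), the single attention head, the way the query is supplied, the linear LLR head, and the sign convention LLR = log P(b=1)/P(b=0) are all hypothetical choices that the abstract does not specify. The availability mask is one simple way to model a missing link.

```python
import math


def softmax(scores):
    """Numerically stable softmax; scores of -inf receive exactly zero weight."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_token(query, keys, values, available):
    """Fuse one time-frequency token across receivers (hypothetical helper).

    query:     d-dim list for this token (how it is formed is an assumption)
    keys:      one d-dim list per receiver
    values:    one feature list per receiver
    available: one bool per receiver; False models a missing link
    """
    if not any(available):
        raise ValueError("at least one receiver must be available")
    scale = math.sqrt(len(query))
    # Scaled dot-product scores; unavailable receivers are masked with -inf.
    scores = [
        sum(q * k for q, k in zip(query, key)) / scale if ok else -math.inf
        for key, ok in zip(keys, available)
    ]
    weights = softmax(scores)
    # Weighted sum of receiver values, computed independently for each token.
    fused = [
        sum(w * value[j] for w, value in zip(weights, values))
        for j in range(len(values[0]))
    ]
    return fused, weights


def llr_head(fused, weight_rows, bias):
    """Linear map from fused features to one LLR per coded bit (assumed head)."""
    return [
        sum(w * x for w, x in zip(row, fused)) + b
        for row, b in zip(weight_rows, bias)
    ]


def bit_metric_loss(llrs, bits):
    """Mean binary cross-entropy on LLRs, assuming LLR = log P(b=1)/P(b=0)."""
    total = 0.0
    for llr, bit in zip(llrs, bits):
        # Stable form of softplus(llr) - bit * llr.
        total += max(llr, 0.0) - bit * llr + math.log1p(math.exp(-abs(llr)))
    return total / len(llrs)


# Toy example: three receivers, the third one is missing.
query = [1.0, 0.0]
keys = [[2.0, 0.0], [0.0, 0.0], [5.0, 0.0]]
values = [[1.0, 1.0], [3.0, -1.0], [100.0, 100.0]]
available = [True, True, False]

fused, weights = fuse_token(query, keys, values, available)
llrs = llr_head(fused, weight_rows=[[1.0, 0.0], [0.0, 1.0]], bias=[0.0, 0.0])
loss = bit_metric_loss(llrs, bits=[1, 1])
```

In this sketch the attention weights sum to one over the available receivers, so a masked receiver contributes nothing to the fused token, and a receiver whose key aligns better with the query receives more weight. In a trained model the keys and values would come from the shared per-receiver encoder, and the weights would be learned rather than fixed by hand.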