Efficient Learned Data Compression via Dual-Stream Feature Decoupling

arXiv:2604.07239 · 78.8 · Has Code
AI Analysis

This work addresses efficiency bottlenecks in learned data compression (LDC), which matter for applications that require high-speed data processing. The contribution appears incremental, since it builds on existing LDC methods.

The paper tackles the challenge of balancing precise probability modeling with system efficiency in learned data compression. It proposes a dual-stream architecture that disentangles local and global contexts, and it reports state-of-the-art compression ratio and throughput together with the lowest latency and memory usage.

While Learned Data Compression (LDC) has achieved superior compression ratios, balancing precise probability modeling with system efficiency remains challenging. Crucially, uniform single-stream architectures struggle to simultaneously capture micro-syntactic and macro-semantic features, necessitating deep serial stacking that exacerbates latency. Compounding this, heterogeneous systems are constrained by device speed mismatches, where throughput is capped by Amdahl's Law due to serial processing. To this end, we propose a Dual-Stream Multi-Scale Decoupler that disentangles local and global contexts to replace deep serial processing with shallow parallel streams, and incorporate a Hierarchical Gated Refiner for adaptive feature refinement and precise probability modeling. Furthermore, we design a Concurrent Stream-Parallel Pipeline, which overcomes systemic bottlenecks to achieve full-pipeline parallelism. Extensive experiments demonstrate that our method achieves state-of-the-art performance in both compression ratio and throughput, while maintaining the lowest latency and memory usage. The code is available at https://github.com/huidong-ma/FADE.
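The abstract's appeal to Amdahl's Law can be made concrete with a small calculation. The sketch below is not taken from the paper or its code: the function names and the serial fractions are hypothetical, chosen only to illustrate why a pipeline with a serial stage has a hard throughput ceiling regardless of how many parallel workers are added.

```python
def amdahl_speedup(serial_fraction: float, workers: int) -> float:
    """Amdahl's Law: speedup when only the non-serial part runs on `workers` units."""
    if not 0.0 <= serial_fraction <= 1.0:
        raise ValueError("serial_fraction must be in [0, 1]")
    if workers < 1:
        raise ValueError("workers must be >= 1")
    parallel_fraction = 1.0 - serial_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)


def speedup_ceiling(serial_fraction: float) -> float:
    """Limit of the speedup as the number of workers grows without bound."""
    if not 0.0 <= serial_fraction <= 1.0:
        raise ValueError("serial_fraction must be in [0, 1]")
    return float("inf") if serial_fraction == 0.0 else 1.0 / serial_fraction


if __name__ == "__main__":
    # Hypothetical serial fractions, for illustration only (not measured values).
    for s in (0.5, 0.2, 0.05):
        print(
            f"serial fraction {s:.2f}: "
            f"8 workers -> {amdahl_speedup(s, 8):.2f}x, "
            f"ceiling -> {speedup_ceiling(s):.1f}x"
        )
```

Under this model, shrinking the serial fraction raises the ceiling far more than adding workers does, which is consistent with the abstract's motivation for replacing serial processing with parallel streams.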

