Linear Diffusion Networks
The paper addresses the limitations of recurrent and transformer models for sequence modeling, though the contribution appears incremental because it builds on existing diffusion concepts.
The authors tackle sequential data processing by introducing Linear Diffusion Networks (LDNs), which reinterpret it as a diffusion process, and they report competitive performance on the ImageNet and LRA benchmarks.
We present Linear Diffusion Networks (LDNs), a novel architecture that reinterprets sequential data processing as a unified diffusion process. Our model integrates adaptive diffusion modules with localized nonlinear updates and a diffusion-inspired attention mechanism. This design enables efficient global information propagation while preserving fine-grained temporal details. LDNs overcome the limitations of conventional recurrent and transformer models by allowing full parallelization across time steps and supporting robust multi-scale temporal representations. Experiments on benchmark sequence modeling tasks demonstrate that LDNs deliver competitive performance on ImageNet and LRA tasks.
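
As a rough illustration of the idea described above, the following sketch shows one way a linear diffusion update along the time axis can be computed for all time steps at once and then followed by a localized nonlinear update. It is a toy under stated assumptions, not the authors' implementation: the three-point Laplacian stencil, the reflecting boundary, the fixed coefficient `alpha`, the residual tanh update, and all function names are hypothetical choices. The adaptive diffusion modules and the diffusion-inspired attention mechanism are not modeled here.

```python
import numpy as np


def diffusion_step(h, alpha):
    """One explicit linear diffusion step along the time axis.

    h has shape (T, d): T time steps, d features. Every time step is
    updated simultaneously, so there is no sequential recurrence.
    alpha is a hypothetical fixed diffusion coefficient; values in
    (0, 0.5] keep this explicit scheme stable.
    """
    # Repeat the first and last rows so the boundaries are reflecting.
    padded = np.pad(h, ((1, 1), (0, 0)), mode="edge")
    # Discrete Laplacian: h[t-1] - 2*h[t] + h[t+1] for all t at once.
    laplacian = padded[:-2] - 2.0 * h + padded[2:]
    return h + alpha * laplacian


def local_update(h, W, b):
    """Per-time-step nonlinear residual update (an assumed form)."""
    return h + np.tanh(h @ W + b)


def ldn_like_layer(h, alpha, W, b, n_steps=4):
    """Apply several diffusion steps, then one local nonlinear update.

    More diffusion steps spread information over a wider temporal
    range; the local update then acts on each time step independently.
    """
    for _ in range(n_steps):
        h = diffusion_step(h, alpha)
    return local_update(h, W, b)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    T, d = 16, 4
    x = rng.standard_normal((T, d))
    W = 0.1 * rng.standard_normal((d, d))
    b = np.zeros(d)
    y = ldn_like_layer(x, alpha=0.25, W=W, b=b)
    print(y.shape)
```

In this sketch, a unit impulse at a single time step spreads to its two neighbors after one step with `alpha=0.25` (values 0.25, 0.5, 0.25), and the reflecting boundary keeps the per-feature sum over time unchanged. Stacking steps widens the temporal range over which information is mixed, which is one simple way to picture global propagation without a sequential loop over time.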