Frozen in Time: Parameter-Efficient Time Series Transformers via Reservoir-Induced Feature Expansion and Fixed Random Dynamics
The work addresses efficient long-term time-series prediction for applications that need lower computational cost. Its contribution is incremental, since it builds on existing Transformer and reservoir-computing methods.
The paper tackles expensive and brittle long-range forecasting in Transformers by introducing FreezeTST, a hybrid model that interleaves frozen random-feature blocks with trainable Transformer layers. On seven benchmarks, it matches or surpasses specialised variants such as Informer, Autoformer, and PatchTST with substantially lower compute.
Transformers are the de facto choice for sequence modelling, yet their quadratic self-attention and weak temporal bias can make long-range forecasting both expensive and brittle. We introduce FreezeTST, a lightweight hybrid that interleaves frozen random-feature (reservoir) blocks with standard trainable Transformer layers. The frozen blocks endow the network with rich nonlinear memory at no optimisation cost; the trainable layers learn to query this memory through self-attention. The design cuts trainable parameters and lowers wall-clock training time, while leaving inference complexity unchanged. On seven standard long-term forecasting benchmarks, FreezeTST consistently matches or surpasses specialised variants such as Informer, Autoformer, and PatchTST, with substantially lower compute. Our results show that embedding reservoir principles within Transformers offers a simple, principled route to efficient long-term time-series prediction.
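The pairing of a frozen reservoir block with a trainable attention layer can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the form of the frozen block, so the sketch assumes an echo-state-style recurrence h_t = tanh(W_in x_t + W_rec h_{t-1}) with fixed random weights, followed by one single-head, non-causal self-attention layer. All names (`make_frozen_reservoir`, `reservoir_states`, `self_attention`, `radius_bound`) and all sizes are hypothetical and chosen only for illustration.

```python
import math
import random


def dot(a, b):
    """Inner product of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))


def make_frozen_reservoir(input_dim, state_dim, radius_bound=0.9, seed=0):
    """Draw fixed random weights; nothing here is ever trained (assumed design)."""
    rng = random.Random(seed)
    w_in = [[rng.uniform(-1.0, 1.0) for _ in range(input_dim)]
            for _ in range(state_dim)]
    w_rec = [[rng.uniform(-1.0, 1.0) for _ in range(state_dim)]
             for _ in range(state_dim)]
    # Rescale so the largest absolute row sum equals radius_bound. The
    # spectral radius cannot exceed this norm, so old inputs fade over time.
    norm = max(sum(abs(v) for v in row) for row in w_rec)
    w_rec = [[v * radius_bound / norm for v in row] for row in w_rec]
    return w_in, w_rec


def reservoir_states(series, w_in, w_rec):
    """Run h_t = tanh(W_in x_t + W_rec h_{t-1}) over a list of input vectors."""
    state = [0.0] * len(w_rec)
    states = []
    for x in series:
        state = [math.tanh(dot(row_in, x) + dot(row_rec, state))
                 for row_in, row_rec in zip(w_in, w_rec)]
        states.append(state)
    return states


def self_attention(states, w_q, w_k, w_v):
    """Single-head scaled dot-product attention over the reservoir states."""
    queries = [[dot(row, s) for row in w_q] for s in states]
    keys = [[dot(row, s) for row in w_k] for s in states]
    values = [[dot(row, s) for row in w_v] for s in states]
    scale = math.sqrt(len(w_q))
    outputs = []
    for q in queries:
        scores = [dot(q, k) / scale for k in keys]
        peak = max(scores)  # subtract the max for a numerically stable softmax
        weights = [math.exp(s - peak) for s in scores]
        total = sum(weights)
        outputs.append([sum(w / total * v[d] for w, v in zip(weights, values))
                        for d in range(len(w_v))])
    return outputs


# Toy univariate series: 24 steps of a sine wave.
series = [[math.sin(0.3 * t)] for t in range(24)]

# Frozen part: random weights drawn once and left untouched.
w_in, w_rec = make_frozen_reservoir(input_dim=1, state_dim=8)
states = reservoir_states(series, w_in, w_rec)

# Trainable part: the attention projections (randomly initialised here).
rng = random.Random(1)
w_q, w_k, w_v = ([[rng.gauss(0.0, 0.1) for _ in range(8)] for _ in range(4)]
                 for _ in range(3))
mixed = self_attention(states, w_q, w_k, w_v)

frozen_params = sum(len(r) for r in w_in) + sum(len(r) for r in w_rec)
trainable_params = sum(len(r) for m in (w_q, w_k, w_v) for r in m)
print(frozen_params, trainable_params)
```

In a training loop built on this sketch, only `w_q`, `w_k`, and `w_v` would receive gradient updates; `w_in` and `w_rec` stay fixed, which is where the reduction in trainable parameters comes from. The forward pass still evaluates every block, consistent with the statement that inference complexity is unchanged.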