LG, AI, PF | Aug 28, 2023

Fast Feedforward Networks

arXiv:2308.14711v2 | 8 citations | h-index: 25
Originality: Highly original
AI Analysis

This paper addresses the computational-efficiency bottleneck of neural networks in applications that require fast inference, though the contribution is incremental in that it builds on existing feedforward and mixture-of-experts methods.

The paper tackles the problem of high inference cost in feedforward networks by introducing the fast feedforward (FFF) architecture, which achieves up to 220x faster inference than feedforward networks and preserves 94.2% of predictive performance in vision transformers using only 1% of layer neurons.

We break the linear link between the layer size and its inference cost by introducing the fast feedforward (FFF) architecture, a log-time alternative to feedforward networks. We demonstrate that FFFs are up to 220x faster than feedforward networks, up to 6x faster than mixture-of-experts networks, and exhibit better training properties than mixtures of experts thanks to noiseless conditional execution. Pushing FFFs to the limit, we show that they can use as little as 1% of layer neurons for inference in vision transformers while preserving 94.2% of predictive performance.
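To make the "log-time" claim concrete, here is a minimal sketch of tree-routed inference. It is an illustration, not the authors' implementation. It assumes a balanced binary tree of single-neuron routing nodes, each making a hard left/right decision, and small ReLU leaf blocks of which only one is evaluated per input. The names `make_fff` and `fff_forward`, the heap-style node indexing, and the random initialisation are all hypothetical.

```python
import random


def dot(a, b):
    """Plain dot product of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))


def make_fff(input_dim, depth, leaf_width, output_dim, seed=0):
    """Build a toy tree-routed layer with random weights (hypothetical layout).

    The tree has 2**depth - 1 routing nodes stored in heap order and
    2**depth leaves, each a tiny ReLU block with `leaf_width` hidden neurons.
    """
    rng = random.Random(seed)

    def rand_vec(n):
        return [rng.gauss(0.0, 1.0) for _ in range(n)]

    nodes = [(rand_vec(input_dim), rng.gauss(0.0, 1.0))
             for _ in range(2 ** depth - 1)]
    leaves = [
        {
            "w_in": [rand_vec(input_dim) for _ in range(leaf_width)],
            "w_out": [rand_vec(leaf_width) for _ in range(output_dim)],
        }
        for _ in range(2 ** depth)
    ]
    return {"depth": depth, "nodes": nodes, "leaves": leaves}


def fff_forward(layer, x):
    """Route x down the tree with hard decisions, then evaluate one leaf.

    Returns (output, leaf_index, neurons_evaluated).
    """
    depth = layer["depth"]
    idx = 0  # root of the heap-ordered tree
    for _ in range(depth):
        w, b = layer["nodes"][idx]
        go_right = dot(w, x) + b > 0.0       # assumed hard routing rule
        idx = 2 * idx + 1 + int(go_right)    # left child: 2i+1, right: 2i+2
    leaf_index = idx - (2 ** depth - 1)      # convert heap index to leaf number
    leaf = layer["leaves"][leaf_index]
    hidden = [max(0.0, dot(w, x)) for w in leaf["w_in"]]
    output = [dot(w, hidden) for w in leaf["w_out"]]
    neurons_evaluated = depth + len(leaf["w_in"])
    return output, leaf_index, neurons_evaluated


if __name__ == "__main__":
    layer = make_fff(input_dim=4, depth=3, leaf_width=2, output_dim=5)
    out, leaf, touched = fff_forward(layer, [0.5, -1.0, 0.25, 2.0])
    total_leaf_neurons = len(layer["leaves"]) * 2
    print(f"leaf {leaf}: evaluated {touched} neurons "
          f"(layer holds {total_leaf_neurons} leaf neurons)")
```

In this sketch the per-input cost is `depth + leaf_width` neuron evaluations, while the number of leaf neurons in the layer is `2**depth * leaf_width`. Adding one level of depth doubles the layer but adds only one routing step. This is the sense in which inference cost is decoupled from layer size.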

Code Implementations: 4 repos
