Sustainable Transformer Neural Network Acceleration with Stochastic Photonic Computing
ASTRA addresses the high computation and memory demands of transformers to enable efficient and sustainable inference.
ASTRA is the first silicon-photonic accelerator using stochastic computing for transformers, achieving at least 7.6x speedup and 1.3x lower energy overheads compared to state-of-the-art accelerators.
Transformers achieve state-of-the-art performance in natural language processing, vision, and scientific computing, but demand high computation and memory. To address these challenges, we present ASTRA, the first silicon-photonic accelerator leveraging stochastic computing for transformers. ASTRA employs novel optical stochastic multipliers and unary/analog homodyne accumulation in a crosstalk-minimal organization to efficiently process dynamic tensor computations. Evaluations show at least 7.6x speedup and 1.3x lower energy overheads compared to state-of-the-art accelerators, highlighting ASTRA's potential for efficient, scalable, and sustainable transformer inference.
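For readers unfamiliar with stochastic computing, the following minimal Python sketch illustrates the generic textbook idea behind a stochastic multiplier: a value in [0, 1] is encoded as a random bitstream whose fraction of ones equals the value, a bitwise AND of two independent streams approximates their product, and counting ones accumulates the result. This is a software illustration only and does not model ASTRA's optical multipliers or its homodyne accumulation; the function names, stream length, and seeding are assumptions made for the example.

```python
import random


def to_bitstream(p, length, rng):
    """Encode a value p in [0, 1] as a unipolar stochastic bitstream.

    Each bit is 1 with probability p, so the fraction of ones approximates p.
    """
    return [1 if rng.random() < p else 0 for _ in range(length)]


def stochastic_multiply(a, b, length=4096, seed=0):
    """Estimate a * b by AND-ing two independent bitstreams and counting ones."""
    rng = random.Random(seed)  # hypothetical fixed seed, for reproducibility
    stream_a = to_bitstream(a, length, rng)
    stream_b = to_bitstream(b, length, rng)
    ones = sum(x & y for x, y in zip(stream_a, stream_b))
    return ones / length


def stochastic_dot(xs, ws, length=4096, seed=0):
    """Estimate a dot product by summing the ones of all AND-ed stream pairs."""
    rng = random.Random(seed)
    total_ones = 0
    for x, w in zip(xs, ws):
        stream_x = to_bitstream(x, length, rng)
        stream_w = to_bitstream(w, length, rng)
        total_ones += sum(a & b for a, b in zip(stream_x, stream_w))
    return total_ones / length


if __name__ == "__main__":
    print(stochastic_multiply(0.5, 0.5))               # close to 0.25
    print(stochastic_dot([0.2, 0.8], [0.5, 0.25]))     # close to 0.30
```

The estimate is approximate: its error shrinks roughly with the square root of the stream length, which is the usual precision-versus-latency trade-off in stochastic computing.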