AR | CL | LG | Jan 18, 2024

A Survey on Hardware Accelerators for Large Language Models

arXiv:2401.09890v1 | 49 citations | Appl Sci
Originality: Synthesis-oriented
AI Analysis

The paper addresses the computational bottlenecks that researchers and engineers face when deploying LLMs. As a survey, its contribution is an incremental synthesis of existing work rather than a novel solution.

This paper tackles the computational challenges of scaling Large Language Models by surveying hardware accelerators like GPUs, FPGAs, and custom architectures, providing insights into performance and energy efficiency for real-world deployment.

Large Language Models (LLMs) have emerged as powerful tools for natural language processing tasks, revolutionizing the field with their ability to understand and generate human-like text. As the demand for more sophisticated LLMs continues to grow, there is a pressing need to address the computational challenges associated with their scale and complexity. This paper presents a comprehensive survey on hardware accelerators designed to enhance the performance and energy efficiency of Large Language Models. By examining a diverse range of accelerators, including GPUs, FPGAs, and custom-designed architectures, we explore the landscape of hardware solutions tailored to meet the unique computational demands of LLMs. The survey encompasses an in-depth analysis of architecture, performance metrics, and energy efficiency considerations, providing valuable insights for researchers, engineers, and decision-makers aiming to optimize the deployment of LLMs in real-world applications.

Code Implementations: 1 repo
