AR Jun 27, 2023
A Survey on Deep Learning Hardware Accelerators for Heterogeneous HPC Platforms
Cristina Silvano, Daniele Ielmini, Fabrizio Ferrandi et al.
Recent trends in deep learning (DL) have made hardware accelerators essential for various high-performance computing (HPC) applications, including image classification, computer vision, and speech recognition. This survey summarizes and classifies the most recent developments in DL accelerators, focusing on their role in meeting the performance demands of HPC applications. We explore cutting-edge approaches to DL acceleration, covering not only GPU- and TPU-based platforms but also specialized hardware such as FPGA- and ASIC-based accelerators, Neural Processing Units, open hardware RISC-V-based accelerators, and co-processors. This survey also describes accelerators leveraging emerging memory technologies and computing paradigms, including 3D-stacked Processor-In-Memory, non-volatile memories like Resistive RAM and Phase Change Memories used for in-memory computing, as well as Neuromorphic Processing Units, and Multi-Chip Module-based accelerators. Furthermore, we provide insights into emerging quantum-based accelerators and photonics. Finally, this survey categorizes the most influential architectures and technologies from recent years, offering readers a comprehensive perspective on the rapidly evolving field of deep learning acceleration.
AR Nov 29, 2023
A Survey on Design Methodologies for Accelerating Deep Learning on Heterogeneous Architectures
Serena Curzel, Fabrizio Ferrandi, Leandro Fiorin et al.
Given the increasing size and complexity of deep neural networks, the need for their efficient execution has become ever more pressing in the design of heterogeneous High-Performance Computing (HPC) and edge platforms, leading to a wide variety of proposals for specialized deep learning architectures and hardware accelerators. The design of such architectures and accelerators requires a multidisciplinary approach combining expertise from several areas, from machine learning to computer architecture, low-level hardware design, and approximate computing. Several methodologies and tools have been proposed to improve the process of designing accelerators for deep learning, aimed at maximizing parallelism and minimizing data movement to achieve high performance and energy efficiency. This paper critically reviews influential tools and design methodologies for deep learning accelerators, offering a wide perspective on this rapidly evolving field. This work complements surveys on architectures and accelerators by covering hardware-software co-design, automated synthesis, domain-specific compilers, design space exploration, modeling, and simulation, providing insights into technical challenges and open research directions.
25.4 ET Mar 27
A new approach to rating scale definition with quantum-inspired optimization
Patrizio Spada, Laura Cappelli, Francesca Cibrario et al.
In finance, assessing the creditworthiness of loan applicants requires lenders to cluster borrowers using rating scales. Financial institutions must define these scales in compliance with strict institutional constraints, which requires solving a complex constrained combinatorial optimization problem. This contribution studies how to solve this problem using a Quadratic Unconstrained Binary Optimization (QUBO) model, a formulation suitable for quantum hardware. We validate this approach by testing the proposed formulation with classical heuristics. We then benchmark the results against a brute-force method to demonstrate consistent solution quality and highlight the framework's suitability for more complex scenarios.
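
As a generic illustration of the QUBO idea mentioned in this abstract, the sketch below encodes a toy constrained problem (pick exactly one item at minimum cost) as a QUBO by folding the constraint into a quadratic penalty, then solves it by brute force. This is not the paper's rating-scale formulation: the problem, the function names `one_hot_qubo`, `qubo_energy`, and `brute_force`, and the penalty weight are all assumptions chosen for illustration.

```python
from itertools import product


def one_hot_qubo(costs, penalty):
    """Upper-triangular QUBO for: minimize sum(c_i * x_i) with exactly one x_i = 1.

    The constraint is folded in as penalty * (sum(x) - 1)**2. Using x_i**2 == x_i
    for binary variables, this expands to -penalty on each diagonal entry,
    +2 * penalty on each pair (i < j), plus a constant `penalty` that is dropped.
    """
    n = len(costs)
    Q = [[0.0] * n for _ in range(n)]
    for i in range(n):
        Q[i][i] = costs[i] - penalty
        for j in range(i + 1, n):
            Q[i][j] = 2.0 * penalty
    return Q


def qubo_energy(Q, x):
    """Evaluate x^T Q x for an upper-triangular Q and a binary vector x."""
    n = len(x)
    return sum(Q[i][j] * x[i] * x[j] for i in range(n) for j in range(i, n))


def brute_force(Q):
    """Enumerate all 2^n binary assignments and return the lowest-energy one."""
    n = len(Q)
    return min(product((0, 1), repeat=n), key=lambda x: qubo_energy(Q, x))


# Hypothetical toy data: three candidate items with made-up costs.
costs = [3.0, 1.0, 2.0]
penalty = 10.0  # assumed large enough that violating the constraint never pays off
Q = one_hot_qubo(costs, penalty)
best = brute_force(Q)
print(best, qubo_energy(Q, best) + penalty)  # add back the dropped constant
```

Brute force is only practical for a handful of variables, since the search space doubles with each one. It serves here as a ground-truth baseline of the kind a heuristic or quantum solver could be compared against.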