LG Sep 7, 2025
An Improved Template for Approximate Computing
Morteza Rezaalipour, Francesco Costa, Marco Biasion et al.
Deploying neural networks on edge devices entails a careful balance between the energy required for inference and the accuracy of the resulting classification. One technique for navigating this tradeoff is approximate computing: the process of reducing energy consumption by slightly reducing the accuracy of arithmetic operators. In this context, we propose a methodology to reduce the area of the small arithmetic operators used in neural networks (i.e., adders and multipliers) at the cost of a small loss in accuracy, and show that we improve area savings for the same accuracy loss w.r.t. the state of the art. To achieve our goal, we improve on a recently proposed Boolean rewriting technique, called XPAT, in which the use of a parametrisable template to rewrite circuits has proved to be highly beneficial. In particular, XPAT was able to produce smaller circuits than comparable approaches while utilising a naive sum-of-products template structure. In this work, we show that template parameters can act as proxies for chosen metrics, and we propose a novel template based on parametrisable product sharing that acts as a close proxy to synthesised area. We demonstrate experimentally that our methodology converges better to low-area solutions and that it can find better approximations than both the original XPAT and two other state-of-the-art approaches.
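To make the idea of a parametrised sum-of-products template concrete, the following is a minimal sketch, not the XPAT implementation. It assumes a toy encoding in which each output is an OR of products and each product is a list of `(input_index, polarity)` literals; the names `eval_sop`, `worst_case_error`, and `literal_count` are hypothetical. The literal count is used here only as a crude stand-in for area, and the product sharing proposed in the abstract is not modelled.

```python
from itertools import product

def eval_sop(products, inputs):
    """Evaluate one sum-of-products output (OR of ANDs of literals).

    products: list of products; each product is a list of
              (input_index, polarity) literals (assumed toy encoding).
    """
    return int(any(all(inputs[i] == pol for i, pol in p) for p in products))

def worst_case_error(exact, template, n_inputs):
    """Maximum absolute difference between exact and template outputs."""
    worst = 0
    for bits in product((0, 1), repeat=n_inputs):
        out_bits = [eval_sop(p, bits) for p in template]  # LSB first
        approx = sum(b << k for k, b in enumerate(out_bits))
        worst = max(worst, abs(exact(bits) - approx))
    return worst

def literal_count(template):
    """Total number of literals: a rough, assumed proxy for area."""
    return sum(len(p) for output in template for p in output)

def exact_add(bits):
    """Reference behaviour: 1-bit addition of inputs a and b."""
    return bits[0] + bits[1]

# Exact half adder: sum = a&~b | ~a&b, carry = a&b
exact_template = [
    [[(0, 1), (1, 0)], [(0, 0), (1, 1)]],
    [[(0, 1), (1, 1)]],
]
# Approximation: sum = a | b (fewer literals), carry unchanged
approx_template = [
    [[(0, 1)], [(1, 1)]],
    [[(0, 1), (1, 1)]],
]

print(worst_case_error(exact_add, exact_template, 2), literal_count(exact_template))
print(worst_case_error(exact_add, approx_template, 2), literal_count(approx_template))
```

In this toy case the approximate template trades a worst-case error of 1 for a drop from 6 to 4 literals, which illustrates how template parameters can be read as a proxy for a cost metric.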
AR Nov 29, 2021
A Graph Deep Learning Framework for High-Level Synthesis Design Space Exploration
Lorenzo Ferretti, Andrea Cini, Georgios Zacharopoulos et al.
The design of efficient hardware accelerators for high-throughput data-processing applications, e.g., deep neural networks, is a challenging task in computer architecture design. In this regard, High-Level Synthesis (HLS) emerges as a solution for fast prototyping of application-specific hardware starting from a behavioral description of the application's computational flow. The associated Design-Space Exploration (DSE) aims at identifying Pareto-optimal synthesis configurations, whose exhaustive search is often infeasible due to the design-space dimensionality and the prohibitive computational cost of the synthesis process. Within this framework, we effectively and efficiently address the design problem by proposing, for the first time in the literature, graph neural networks that jointly predict acceleration performance and hardware costs of a synthesized behavioral specification given optimization directives. The learned model can be used to rapidly approach the Pareto curve by guiding the DSE, taking into account performance and cost estimates. The proposed method outperforms traditional HLS-driven DSE approaches by accounting for arbitrary length of computer programs and the invariant properties of the input. We propose a novel hybrid control and data flow graph representation that enables training the graph neural network on specifications of different hardware accelerators; the methodology naturally transfers to unseen data-processing applications too. Moreover, we show that our approach achieves prediction accuracy comparable with that of commonly used simulators without having access to analytical models of the HLS compiler and the target FPGA, while being orders of magnitude faster. Finally, the learned representation can be exploited for DSE in unexplored configuration spaces by fine-tuning on a small number of samples from the new target domain.
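As an illustration of what identifying Pareto-optimal configurations means in this setting, here is a minimal sketch that filters a set of (latency, area) estimates down to the non-dominated ones. The function name `pareto_front` and the numbers are hypothetical; in the approach described above the estimates would come from the learned model rather than being hard-coded, and the graph neural network itself is not shown.

```python
def pareto_front(points):
    """Return the points not dominated by any other point.

    Each point is a (latency, area) tuple and both objectives are minimised.
    A point q dominates p if q is no worse in both objectives and differs from p.
    """
    front = []
    for p in points:
        dominated = any(
            q[0] <= p[0] and q[1] <= p[1] and q != p for q in points
        )
        if not dominated:
            front.append(p)
    return front

# Hypothetical (latency, area) estimates for five synthesis configurations.
configs = [(10, 500), (12, 300), (15, 350), (20, 200), (11, 600)]
print(pareto_front(configs))
```

This quadratic scan is adequate for small candidate sets; a sort-based sweep would be a natural choice if the number of predicted configurations grows large.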