Xuanle Ren

2 Papers

38.2 · AR · Mar 30
AXON: An Automated Netlist Optimization Framework for High-Speed Adders

Tiantian Yang, Xuanle Ren, Qingdian Wan et al.

Adders are fundamental building blocks in modern digital systems, and their performance, power, and area (PPA) directly impact system efficiency. Contemporary adders typically use parallel-prefix architectures with established PPA trade-offs, but these often fail to deliver globally optimal PPA for specific design goals. Prior work lacks netlist- and cell-level awareness, and general synthesis heuristics are not adder-specific, resulting in suboptimal PPA. To address this, we propose AXON, an automated netlist optimization framework for adders. It performs design space exploration from the architectural level down to the netlist level, integrating prefix topology search with standard-cell-aware mapping via a hierarchical approach to quickly converge to near-optimal PPA solutions. We also introduce a hybrid ultra-high-speed adder combining parallel-prefix and Ling architectures to shorten the critical path. Experiments on a TSMC 28nm library show that AXON improves delay, area-delay product, and energy-delay product by up to 10.3%, 12.6%, and 32.1%, respectively, compared to commercial synthesis tools.
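For readers unfamiliar with the parallel-prefix architectures the abstract refers to, the following is a minimal bit-level sketch of one classic prefix topology (Kogge-Stone). It is an illustration only and is not AXON's search or mapping method; the function name `kogge_stone_add` and the `width` parameter are assumptions made for this example.

```python
def kogge_stone_add(a: int, b: int, width: int = 8) -> int:
    """Add two unsigned integers modulo 2**width using a Kogge-Stone prefix network.

    Illustrative sketch only: bit vectors are packed into Python integers,
    and the carry-in is assumed to be 0.
    """
    mask = (1 << width) - 1
    a &= mask
    b &= mask

    # Per-bit generate (both inputs are 1) and propagate (exactly one input is 1).
    g = a & b
    p = a ^ b
    half_sum = p  # kept for the final sum stage

    # Prefix stages: merge (g, p) pairs at distances 1, 2, 4, ...
    dist = 1
    while dist < width:
        # The group generate uses the propagate value from before this stage.
        g = (g | (p & (g << dist))) & mask
        p = (p & (p << dist)) & mask
        dist <<= 1

    # Bit i of g is now the carry out of bit i; the carry into bit i is g[i-1].
    carries = (g << 1) & mask
    return half_sum ^ carries
```

The loop runs about log2(width) times, which is the logarithmic logic depth that makes prefix adders attractive for high-speed designs. Other prefix topologies trade that depth against wiring and cell count, which is the kind of design space the abstract describes exploring.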

AR · Nov 12, 2020 · Code
Customizing Trusted AI Accelerators for Efficient Privacy-Preserving Machine Learning

Peichen Xie, Xuanle Ren, Guangyu Sun

The use of trusted hardware has become a promising solution to enable privacy-preserving machine learning. In particular, users can upload their private data and models to a hardware-enforced trusted execution environment (e.g., an enclave in Intel SGX-enabled CPUs) and run machine learning tasks in it with confidentiality and integrity guaranteed. To improve performance, AI accelerators have been widely employed for modern machine learning tasks. However, how to protect privacy on an AI accelerator remains an open question. To address this question, we propose a solution for efficient privacy-preserving machine learning based on an unmodified trusted CPU and a customized trusted AI accelerator. We carefully leverage cryptographic primitives to establish trust and protect the channel between the CPU and the accelerator. As a case study, we demonstrate our solution based on the open-source Versatile Tensor Accelerator (VTA). The evaluation results show that the proposed solution provides efficient privacy-preserving machine learning at a small design cost and moderate performance overhead.
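As a rough illustration of what protecting a CPU-to-accelerator channel can involve, the sketch below authenticates messages with HMAC-SHA256 and a message counter. It is not the paper's protocol, which the abstract does not specify. It covers integrity and replay detection only, not confidentiality, and it assumes a shared key has already been established. The names `protect`, `verify`, and `TAG_LEN` are hypothetical.

```python
import hashlib
import hmac

TAG_LEN = 32  # size of an HMAC-SHA256 tag in bytes


def protect(key: bytes, counter: int, payload: bytes) -> bytes:
    """Frame a payload as counter || payload || tag (hypothetical layout)."""
    header = counter.to_bytes(8, "big")
    tag = hmac.new(key, header + payload, hashlib.sha256).digest()
    return header + payload + tag


def verify(key: bytes, expected_counter: int, message: bytes) -> bytes:
    """Check the tag and the counter, then return the payload."""
    if len(message) < 8 + TAG_LEN:
        raise ValueError("message too short")
    header = message[:8]
    payload = message[8:-TAG_LEN]
    tag = message[-TAG_LEN:]
    expected_tag = hmac.new(key, header + payload, hashlib.sha256).digest()
    # Constant-time comparison avoids leaking how many tag bytes matched.
    if not hmac.compare_digest(tag, expected_tag):
        raise ValueError("authentication failed")
    if int.from_bytes(header, "big") != expected_counter:
        raise ValueError("unexpected counter (possible replay)")
    return payload
```

A sender would call `protect(key, n, data)` for the n-th message and the receiver `verify(key, n, message)`. A real design would also encrypt the payload, for example with an authenticated encryption mode, and would derive the key through an attestation step.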