Towards Efficient Neural Networks On-a-chip: Joint Hardware-Algorithm Approaches
This work addresses hardware efficiency challenges in AI chip implementation, though it appears incremental because it builds on existing crossbar architectures.
The paper tackles device variation and insufficient interconnections in crossbar hardware for AI algorithms by exploiting the statistical redundancy of neural networks, and it reports robust and efficient performance on datasets such as MNIST and CIFAR-10.
Machine learning algorithms have made significant advances in many applications. However, their hardware implementation on state-of-the-art platforms still faces several challenges and is limited by factors such as memory volume, memory bandwidth and interconnection overhead. Adopting the crossbar architecture with emerging memory technology partially solves the problem but introduces process variation and other concerns. In this paper, we present novel solutions to two fundamental issues in the crossbar implementation of Artificial Intelligence (AI) algorithms: device variation and insufficient interconnections. These solutions are inspired by the statistical properties of the algorithms themselves, especially the redundancy in neural network nodes and connections. By applying Random Sparse Adaptation and by pruning connections according to the Small-World model, we demonstrate robust and efficient performance on representative datasets such as MNIST and CIFAR-10. Moreover, we present the Continuous Growth and Pruning algorithm for future learning and adaptation on hardware.
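To make the idea of pruning connections toward a small-world topology concrete, the following is a minimal sketch and not the method used in the paper. It assumes a Watts-Strogatz-style construction: each input neuron keeps k connections to its nearest output neurons on a ring, and each of those is rewired to a random output with probability p. The function names `small_world_mask` and `apply_mask` and the parameters `k`, `p` and `seed` are hypothetical, chosen only for illustration.

```python
import random


def small_world_mask(n_in, n_out, k, p, seed=0):
    """Build a 0/1 connectivity mask for an n_in x n_out layer.

    Assumption (illustrative only): Watts-Strogatz-style wiring.
    Each input keeps k local connections; each one is rewired to a
    random, not-yet-connected output with probability p.
    Requires k <= n_out.
    """
    rng = random.Random(seed)
    mask = [[0] * n_out for _ in range(n_in)]
    for i in range(n_in):
        # Output index closest to input i when both layers lie on a ring.
        center = round(i * n_out / n_in) % n_out
        # k nearest outputs around that position (mostly local wiring).
        targets = {(center + d - k // 2) % n_out for d in range(k)}
        for j in list(targets):
            if rng.random() < p:
                free = [t for t in range(n_out) if t not in targets]
                if free:
                    # Replace a local link with a long-range shortcut.
                    targets.remove(j)
                    targets.add(rng.choice(free))
        for j in targets:
            mask[i][j] = 1
    return mask


def apply_mask(weights, mask):
    """Zero out pruned connections in a dense weight matrix (list of rows)."""
    return [[w * m for w, m in zip(w_row, m_row)]
            for w_row, m_row in zip(weights, mask)]
```

Under these assumptions every input retains exactly k connections, so the layer density is k / n_out regardless of p; p only controls how many of the retained connections are long-range shortcuts rather than local links.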