LG, CV · Aug 12, 2020

FATNN: Fast and Accurate Ternary Neural Networks

arXiv:2008.05101v4 · 23 citations
AI Analysis

This work addresses efficiency and accuracy issues for deploying TNNs in real applications, representing an incremental improvement.

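The paper tackles two challenges of ternary neural networks (TNNs). First, it shows that, under mild constraints, the computational complexity of the ternary inner product can be reduced by a factor of 2. Second, it designs an implementation-dependent ternary quantization algorithm to narrow the accuracy gap with full-precision networks. The resulting framework, FATNN, surpasses state-of-the-art accuracy in image classification and is accompanied by speedup benchmarks on several platforms.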

Ternary Neural Networks (TNNs) have received much attention because they are potentially orders of magnitude faster in inference, as well as more power efficient, than their full-precision counterparts. However, 2 bits are required to encode the ternary representation, even though only 3 quantization levels are used. As a result, conventional TNNs have memory consumption and speed similar to standard 2-bit models but worse representational capability. Moreover, there is still a significant accuracy gap between TNNs and full-precision networks, which hampers their deployment in real applications. To tackle these two challenges, we first show that, under some mild constraints, the computational complexity of the ternary inner product can be reduced by a factor of 2. Second, to mitigate the performance gap, we carefully design an implementation-dependent ternary quantization algorithm. The proposed framework is termed Fast and Accurate Ternary Neural Networks (FATNN). Experiments on image classification demonstrate that FATNN surpasses the state of the art by a significant margin in accuracy. More importantly, we analyze the speedup relative to various precisions on several platforms, which serves as a strong benchmark for further research.
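
To make the encoding cost described in the abstract concrete, the following is a minimal Python sketch of a conventional 2-bit ternary inner product. It assumes a simple layout with one "nonzero" bit-plane and one "negative" bit-plane per vector; this layout and the helper names `pack_ternary`, `popcount`, and `ternary_dot` are illustrative assumptions, not taken from the paper. It is not FATNN's reduced-complexity scheme, whose details the abstract does not give; it only shows why a baseline ternary inner product needs two bit-planes and two popcounts.

```python
def pack_ternary(values):
    """Pack values in {-1, 0, +1} into two integer bit-planes.

    Returns (nonzero, negative): bit i of `nonzero` is set when values[i] != 0,
    and bit i of `negative` is set when values[i] == -1.
    This layout is an assumption chosen for illustration.
    """
    nonzero = 0
    negative = 0
    for i, v in enumerate(values):
        if v not in (-1, 0, 1):
            raise ValueError(f"not a ternary value: {v!r}")
        if v != 0:
            nonzero |= 1 << i
        if v < 0:
            negative |= 1 << i
    return nonzero, negative


def popcount(x):
    """Count the set bits of a non-negative integer."""
    return bin(x).count("1")


def ternary_dot(a, b):
    """Inner product of two packed ternary vectors using bitwise operations."""
    nz_a, neg_a = a
    nz_b, neg_b = b
    both = nz_a & nz_b                 # positions where both entries are nonzero
    differ = (neg_a ^ neg_b) & both    # of those, positions with opposite signs
    # Each same-sign pair contributes +1 and each opposite-sign pair -1.
    return popcount(both) - 2 * popcount(differ)


x = [1, -1, 0, 1, -1, 0]
w = [-1, -1, 1, 0, 1, 1]
result = ternary_dot(pack_ternary(x), pack_ternary(w))
reference = sum(xi * wi for xi, wi in zip(x, w))
print(result, reference)
```

In this baseline, each ternary element occupies 2 bits and the inner product takes two popcounts, which is the kind of cost that the factor-of-2 reduction claimed in the abstract targets.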

