LG ML Jul 26, 2020

WrapNet: Neural Net Inference with Ultra-Low-Resolution Arithmetic

Renkun Ni, Hong-min Chu, Oscar Castañeda, Ping-yeh Chiang, Christoph Studer, Tom Goldstein

arXiv:2007.13242v19.015 citations

Originality: Incremental advance

AI Analysis

The work targets an inference-efficiency bottleneck in low-resolution neural networks, the high-resolution accumulator, and could speed up hardware implementations.

In low-resolution neural networks, high-resolution additions dominate inference complexity. The paper adapts such networks to accumulate with 8-bit additions while keeping classification accuracy comparable to that of their 32-bit counterparts.

Low-resolution neural networks represent both weights and activations with few bits, drastically reducing the multiplication complexity. Nonetheless, these products are accumulated using high-resolution (typically 32-bit) additions, an operation that dominates the arithmetic complexity of inference when using extreme quantization (e.g., binary weights). To further optimize inference, we propose a method that adapts neural networks to use low-resolution (8-bit) additions in the accumulators, achieving classification accuracy comparable to their 32-bit counterparts. We achieve resilience to low-resolution accumulation by inserting a cyclic activation layer, as well as an overflow penalty regularizer. We demonstrate the efficacy of our approach on both software and hardware platforms.
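To make the accumulator issue concrete, the following is a minimal pure-Python sketch of what happens when products are summed in a signed 8-bit register that wraps on overflow. It is not the authors' implementation: the helper names (`wrap_int8`, `accumulate_int8`, `triangle_activation`, `overflow_penalty`), the triangle-shaped activation, and the hinge-style penalty are all assumptions chosen for illustration. The only point it relies on is arithmetic: an activation that is periodic in 2^8 gives the same output on the wrapped sum as on the full-precision sum, and a penalty on out-of-range sums discourages wraparound during training.

```python
def wrap_int8(x):
    """Map an integer into the signed 8-bit range [-128, 127] with wraparound."""
    return ((x + 128) % 256) - 128


def accumulate_int8(weights, activations):
    """Dot product whose running sum is kept in a wrapping 8-bit accumulator."""
    acc = 0
    for w, a in zip(weights, activations):
        acc = wrap_int8(acc + w * a)  # every addition may overflow and wrap
    return acc


def triangle_activation(x):
    """Hypothetical cyclic activation: a triangle wave with period 256.

    Illustrative assumption only; the source does not define the layer's shape.
    """
    return 64 - abs(wrap_int8(x))


def overflow_penalty(true_sum, limit=127):
    """Hypothetical regularizer term: how far the exact sum exceeds the range."""
    return max(0, abs(true_sum) - limit)


# Binary-style weights (+1/-1) with small integer activations.
weights = [1, -1, 1, 1]
activations = [100, 50, 100, 27]

true_sum = sum(w * a for w, a in zip(weights, activations))  # exact sum
wrapped = accumulate_int8(weights, activations)              # 8-bit result

print(true_sum, wrapped)
print(triangle_activation(true_sum), triangle_activation(wrapped))
print(overflow_penalty(true_sum))
```

In this sketch the wrapped accumulator differs from the exact sum, yet the periodic activation returns the same value for both, which is the intuition behind pairing a cyclic activation with low-resolution accumulation. The penalty term is nonzero only when the exact sum leaves the 8-bit range.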
