Improving Binary Neural Networks through Fully Utilizing Latent Weights
This work targets the accuracy gap between floating-point networks and BNNs in efficient deep learning applications, and represents an incremental improvement in binary network training.
The paper addresses the problem that Binary Neural Networks (BNNs) do not fully utilize their real-valued auxiliary weights during training. It proposes a method that incorporates these weights as real-valued feature extractors and reports state-of-the-art Top-1 accuracy on ImageNet: 63.4% with ResNet-18 and 67.0% with ResNet-34.
Binary Neural Networks (BNNs) rely on a real-valued auxiliary variable W to aid binary training. However, pioneering binary works use W only to accumulate gradient updates during backward propagation, which cannot fully exploit its power and may hinder further advances in BNNs. In this work, we explore the role of W in training beyond acting as a latent variable. Notably, we propose to add W into the computation graph, so that it acts as a real-valued feature extractor that aids the binary training. We explore several ways of utilizing the real-valued weights and propose a specialized supervision. Visualization experiments qualitatively verify that our approach makes it easier to distinguish between different categories. Quantitative experiments show that our approach outperforms current state-of-the-art methods, further closing the performance gap between floating-point networks and BNNs. Evaluations on ImageNet with ResNet-18 (Top-1 63.4%) and ResNet-34 (Top-1 67.0%) achieve a new state of the art.
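To make the distinction concrete, here is a toy sketch of the two roles W can play. It is not the paper's method: the abstract does not say how W is wired into the computation graph or what the specialized supervision looks like. The sketch assumes a single linear unit, a squared-error loss on each branch, a clipped straight-through estimator for the sign function, and a hypothetical mixing factor `aux_weight`; all function and parameter names are invented for illustration.

```python
def sign(v):
    """Binarize a real value to +1.0 or -1.0 (zero maps to +1.0)."""
    return 1.0 if v >= 0 else -1.0


def forward(W, x):
    """Run one linear unit twice over the same latent weights W.

    Returns (binary_out, real_out): the usual BNN output computed with
    sign(W), and an auxiliary output computed with W itself.
    """
    binary_out = sum(sign(w) * xi for w, xi in zip(W, x))
    real_out = sum(w * xi for w, xi in zip(W, x))
    return binary_out, real_out


def latent_grad(W, x, target, aux_weight=0.5, clip=1.0):
    """Gradient of the combined loss with respect to the latent weights.

    Assumed loss (illustrative only):
        0.5 * (binary_out - target)**2
        + aux_weight * 0.5 * (real_out - target)**2
    With aux_weight = 0, W only accumulates gradients from the binary
    path, which is the conventional latent-weight setup.
    """
    binary_out, real_out = forward(W, x)
    g_bin = binary_out - target   # dLoss/d(binary_out)
    g_real = real_out - target    # dLoss/d(real_out), before aux_weight
    grad = []
    for w, xi in zip(W, x):
        # Clipped straight-through estimator: treat d sign(w)/dw as 1
        # inside [-clip, clip] and 0 outside.
        ste = 1.0 if abs(w) <= clip else 0.0
        grad.append(g_bin * xi * ste + aux_weight * g_real * xi)
    return grad


if __name__ == "__main__":
    W, x = [0.3, -0.2], [1.0, 2.0]
    print(latent_grad(W, x, target=0.0, aux_weight=0.0))  # binary path only
    print(latent_grad(W, x, target=0.0, aux_weight=0.5))  # plus real-valued branch
```

With `aux_weight=0.0` the update depends on W only through its signs (and the clipping range), whereas a nonzero `aux_weight` lets the magnitudes of W influence the gradient as well. That is the general idea of letting W take part in the forward computation.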