Fighting over-fitting with quantization for learning deep neural networks on noisy labels
This work addresses the cost of data annotation and model deployment for practitioners, though the contribution is incremental: it applies an existing compression technique to a new context.
The paper tackles overfitting in deep neural networks trained on noisy labels by using quantization-aware training as a regularization method, showing that it significantly improves results compared to baselines, including on a controlled test and Facial Action Unit detection.
The rising performance of deep neural networks is often empirically attributed to an increase in the available computational power, which allows complex models to be trained on large amounts of annotated data. However, increased model complexity makes modern neural networks costly to deploy, while gathering such amounts of data without label noise is very expensive. In this work, we study the ability of compression methods to tackle both of these problems at once. We hypothesize that quantization-aware training, by restricting the expressivity of neural networks, behaves as a regularizer. Thus, it may help fight overfitting on noisy data while also allowing the model to be compressed at inference. We first validate this claim on a controlled test with manually introduced label noise. Furthermore, we test the proposed method on Facial Action Unit detection, where labels are typically noisy due to the subtlety of the task. In all cases, our results suggest that quantization significantly improves the results compared with existing baselines, regularization methods, and other compression methods.
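The two ingredients named in the abstract, manually introduced label noise and a quantizer that restricts the values weights can take, can be illustrated with a minimal sketch. The abstract does not specify the noise model, the bit width, or the quantization scheme, so everything here is an assumption: the function names `inject_symmetric_label_noise` and `fake_quantize` are hypothetical, the noise is symmetric (uniform over the other classes), and the quantizer is a symmetric uniform grid scaled by the largest absolute weight.

```python
import random


def inject_symmetric_label_noise(labels, num_classes, noise_rate, seed=0):
    """Flip each label to a different, uniformly chosen class with
    probability noise_rate (assumed noise model, not from the paper)."""
    rng = random.Random(seed)
    noisy = []
    for y in labels:
        if rng.random() < noise_rate:
            # Choose uniformly among all classes except the true one.
            candidates = [c for c in range(num_classes) if c != y]
            noisy.append(rng.choice(candidates))
        else:
            noisy.append(y)
    return noisy


def fake_quantize(weights, num_bits=4):
    """Round weights onto a symmetric uniform grid (assumed scheme).

    The grid has 2 ** (num_bits - 1) - 1 positive levels, so the weights
    can take at most 2 ** num_bits - 1 distinct values.
    """
    max_abs = max(abs(w) for w in weights)
    if max_abs == 0.0:
        return list(weights)
    levels = 2 ** (num_bits - 1) - 1
    scale = max_abs / levels
    return [round(w / scale) * scale for w in weights]
```

In quantization-aware training, a rounding step of this kind is typically applied to the weights in the forward pass, while gradients are passed through it unchanged (a straight-through estimator). The reduced set of representable weight values is one plausible reading of the "restricted expressivity" the abstract refers to.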