Utilizing Excess Resources in Training Neural Networks
This work addresses the challenge of improving training efficiency and model performance for machine learning practitioners, though it appears incremental, since it builds on existing overparameterization techniques.
The authors tackle the problem of improving neural network performance without increasing inference complexity. They propose Kernel Filtering Linear Overparameterization (KFLO), which uses a linear cascade of filtering layers during training and merges them into a single layer at test time, and they demonstrate its advantages on various models and datasets.
In this work, we propose Kernel Filtering Linear Overparameterization (KFLO), in which a linear cascade of filtering layers is used during training to improve network performance at test time. We implement this cascade in a kernel filtering fashion, which prevents the trained architecture from becoming unnecessarily deep. This also allows our approach to be used with almost any network architecture and allows the filtering layers to be combined into a single layer at test time. Thus, our approach does not add computational complexity during inference. We demonstrate the advantage of KFLO on various network models and datasets in supervised learning.
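The reason a linear cascade can be merged at no inference cost is that a chain of linear filters with no nonlinearity between them is equivalent to a single filter. The following is a minimal sketch of that fact only, not the authors' implementation: it assumes 1D signals, full (zero-padded) convolution, and hypothetical helper names (`conv_full`, `merge_cascade`, `apply_cascade`) and kernel values chosen purely for illustration. The exact way KFLO filters kernels is not described in this summary.

```python
def conv_full(signal, kernel):
    """Full 1D convolution: output length is len(signal) + len(kernel) - 1."""
    out = [0.0] * (len(signal) + len(kernel) - 1)
    for i, s in enumerate(signal):
        for j, k in enumerate(kernel):
            out[i + j] += s * k
    return out


def apply_cascade(signal, kernels):
    """Training-time view: pass the signal through each linear filter in turn."""
    out = list(signal)
    for kernel in kernels:
        out = conv_full(out, kernel)
    return out


def merge_cascade(kernels):
    """Test-time view: collapse the cascade into one equivalent kernel.

    Convolution is associative, so convolving the kernels with each other
    yields a single kernel that reproduces the whole cascade.
    """
    merged = [1.0]  # identity element for convolution
    for kernel in kernels:
        merged = conv_full(merged, kernel)
    return merged


# Illustrative values (assumptions, not taken from the paper).
x = [1.0, 2.0, -1.0, 3.0]
kernels = [[1.0, 0.5], [2.0, -1.0, 0.25]]

train_out = apply_cascade(x, kernels)   # two filtering steps
merged = merge_cascade(kernels)         # one kernel of length 2 + 3 - 1
test_out = conv_full(x, merged)         # a single filtering step
```

Under these assumptions, `train_out` and `test_out` agree up to floating-point error, so the extra parameters used during training leave only one filter to evaluate at inference.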