State-driven Implicit Modeling for Sparsity and Robustness in Neural Networks
This work addresses training efficiency and model performance, both of which matter to machine learning practitioners, but the contribution is incremental because it builds on existing implicit models.
The paper tackles the problem of expensive training in implicit models by introducing State-driven Implicit Modeling (SIM), which constrains the internal states and outputs to match those of a baseline model. This makes the training problem convex and improves the sparsity and robustness of baseline models trained on the FashionMNIST and CIFAR-100 datasets.
Implicit models are a general class of learning models that forgo the hierarchical layer structure typical in neural networks and instead define the internal states based on an ``equilibrium'' equation, offering competitive performance and reduced memory consumption. However, training such models usually relies on expensive implicit differentiation for backward propagation. In this work, we present a new approach to training implicit models, called State-driven Implicit Modeling (SIM), where we constrain the internal states and outputs to match those of a baseline model, circumventing costly backward computations. The training problem becomes convex by construction and can be solved in a parallel fashion, thanks to its decomposable structure. We demonstrate how the SIM approach can be applied to significantly improve sparsity (parameter reduction) and robustness of baseline models trained on the FashionMNIST and CIFAR-100 datasets.
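To make the state-matching idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes the commonly used implicit-model form x = phi(Ax + Bu), y = Cx + Du, which the abstract does not spell out, and it swaps in a plain ridge least-squares objective as a stand-in for whatever convex objective and constraints the paper actually uses. The names `fit_row`, `fit_state_driven`, and `demo` are hypothetical. Given states X and pre-activations Z recorded from a baseline network, each row of [A B] and [C D] is fitted independently, which illustrates why such a problem is convex and parallelizable without any backward pass.

```python
import numpy as np

def fit_row(S, target, ridge=1e-8):
    """Fit one weight row w by ridge least squares:
    minimize ||w @ S - target||^2 + ridge * ||w||^2 (a convex problem)."""
    G = S @ S.T + ridge * np.eye(S.shape[0])
    return np.linalg.solve(G, S @ target)

def fit_state_driven(X, U, Z, Y, ridge=1e-8):
    """Fit (A, B, C, D) so that Z ~ A X + B U and Y ~ C X + D U.

    X: (n, m) baseline post-activation states, U: (p, m) inputs,
    Z: (n, m) baseline pre-activations,       Y: (q, m) baseline outputs.
    Every row is an independent subproblem, so the loops could run in parallel.
    """
    n = X.shape[0]
    S = np.vstack([X, U])  # stacked regressors, shape (n + p, m)
    AB = np.stack([fit_row(S, z, ridge) for z in Z])
    CD = np.stack([fit_row(S, y, ridge) for y in Y])
    return AB[:, :n], AB[:, n:], CD[:, :n], CD[:, n:]

def demo(seed=0):
    """Record states from a toy one-hidden-layer ReLU baseline (assumed here
    purely for illustration) and return the worst-case matching errors."""
    rng = np.random.default_rng(seed)
    p, n, q, m = 3, 4, 2, 50            # input, state, output dims; sample count
    W1 = rng.standard_normal((n, p))    # baseline hidden-layer weights
    W2 = rng.standard_normal((q, n))    # baseline output weights
    U = rng.standard_normal((p, m))
    Z = W1 @ U                          # pre-activation states
    X = np.maximum(Z, 0.0)              # post-activation states (ReLU)
    Y = W2 @ X                          # baseline outputs
    A, B, C, D = fit_state_driven(X, U, Z, Y)
    state_err = np.max(np.abs(A @ X + B @ U - Z))
    output_err = np.max(np.abs(C @ X + D @ U - Y))
    return state_err, output_err

if __name__ == "__main__":
    print(demo())
```

Replacing the ridge penalty with a sparsity-promoting convex penalty in `fit_row` would keep the same row-wise structure. That substitution, like any constraint needed to keep the equilibrium equation well-posed, is left out of this sketch.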