LG · CV · Jul 16, 2023

Revisiting Implicit Models: Sparsity Trade-offs Capability in Weight-tied Model for Vision Tasks

arXiv:2307.08013v2 · h-index: 14
AI Analysis

This work targets researchers and practitioners who use implicit models for vision tasks. It addresses model inefficiency and optimization instability, offering incremental improvements through sparse masks and design guidelines.

The paper tackles the inefficiency and instability of implicit models such as Deep Equilibrium Models (DEQs) in vision tasks by revisiting the weight-tied models they derive from. It finds weight-tied models more effective, stable, and efficient than DEQ variants, proposes distinct sparse masks to increase model capacity, and gives practitioners design guidelines on depth, width, and sparsity.

Implicit models such as Deep Equilibrium Models (DEQs) have garnered significant attention in the community for their ability to train infinite layer models with elegant solution-finding procedures and constant memory footprint. However, despite several attempts, these methods are heavily constrained by model inefficiency and optimization instability. Furthermore, fair benchmarking across relevant methods for vision tasks is missing. In this work, we revisit the line of implicit models and trace them back to the original weight-tied models. Surprisingly, we observe that weight-tied models are more effective, stable, as well as efficient on vision tasks, compared to the DEQ variants. Through the lens of these simple-yet-clean weight-tied models, we further study the fundamental limits in the model capacity of such models and propose the use of distinct sparse masks to improve the model capacity. Finally, for practitioners, we offer design guidelines regarding the depth, width, and sparsity selection for weight-tied models, and demonstrate the generalizability of our insights to other learning paradigms.
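The abstract does not spell out the architecture, so the following is only a minimal sketch of the general idea: one weight matrix is reused at every depth step, and each step applies its own binary mask to that shared matrix. The helper names (`make_mask`, `masked_matvec`, `weight_tied_forward`), the fixed random masks, the `tanh` nonlinearity, and the input injection at each step are all assumptions made for illustration, not the paper's actual setup.

```python
import math
import random


def make_mask(rows, cols, sparsity, rng):
    """Hypothetical helper: binary mask where each entry is zero with probability `sparsity`."""
    return [[0.0 if rng.random() < sparsity else 1.0 for _ in range(cols)]
            for _ in range(rows)]


def masked_matvec(weight, mask, x):
    """Compute (weight * mask) @ x, with the mask applied elementwise."""
    return [sum(w * m * xi for w, m, xi in zip(w_row, m_row, x))
            for w_row, m_row in zip(weight, mask)]


def weight_tied_forward(weight, masks, x):
    """Apply the SAME weight matrix once per mask; only the mask changes per step.

    Assumption: the input x is re-injected at every step and tanh is the
    nonlinearity. Neither choice is taken from the paper.
    """
    h = x
    for mask in masks:
        pre = masked_matvec(weight, mask, h)
        h = [math.tanh(p + xi) for p, xi in zip(pre, x)]
    return h


# Illustrative usage: depth 3, width 4, roughly 50% sparsity per step.
rng = random.Random(0)
dim, depth = 4, 3
weight = [[rng.uniform(-0.5, 0.5) for _ in range(dim)] for _ in range(dim)]
masks = [make_mask(dim, dim, 0.5, rng) for _ in range(depth)]
output = weight_tied_forward(weight, masks, [1.0, -1.0, 0.5, 0.0])
```

In this sketch the number of stored weights stays at `dim * dim` regardless of depth; only the masks differ between steps, which is the sense in which distinct masks could let a weight-tied model compute a different function at each step.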
