Learning Sparse Visual Representations with Leaky Capped Norm Regularizers
This work addresses the need for more effective sparsity-inducing regularizers in visual representation learning. It offers a novel approach with a proven convergence guarantee for 3D recovery, though the contribution is incremental in that it builds on existing non-convex regularization methods.
The paper tackles the problem of learning sparse visual representations by proposing leaky capped norm regularization (LCNR), a non-convex method that imposes strong sparsity with controllable bias. It reports state-of-the-art performance in monocular 3D shape recovery and in neural networks, with faster convergence and better results than existing regularizers.
Sparsity-inducing regularization is an important part of learning over-complete visual representations. Despite the popularity of $\ell_1$ regularization, in this paper we investigate the use of non-convex regularizations for this problem. Our contribution consists of three parts. First, we propose the leaky capped norm regularization (LCNR), which regularizes model weights below a certain threshold more strongly than those above it, thereby imposing strong sparsity while introducing only controllable estimation bias. We propose a majorization-minimization algorithm to optimize the joint objective function. Second, in our study of monocular 3D shape recovery and neural networks, LCNR outperforms $\ell_1$ and other non-convex regularizations, achieving state-of-the-art performance and faster convergence. Third, we prove a global convergence rate for the 3D recovery problem. To the best of our knowledge, this is the first convergence analysis of the 3D recovery problem.
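To make the idea of a threshold-dependent penalty concrete, the following is a minimal sketch under stated assumptions, not the formulation from the paper. The abstract does not give the functional form of LCNR, so the piecewise-linear penalty below (slope `lam` below the threshold `theta`, a smaller "leaky" slope `alpha * lam` above it) is an assumed form chosen to match the description. The names `leaky_capped_penalty`, `mm_denoise`, `lam`, `theta`, and `alpha` are hypothetical, and the toy objective $\frac{1}{2}(w - y)^2 + r(w)$ stands in for the joint objective, which the abstract does not specify. Because the assumed penalty is concave in $|w|$, each majorization-minimization step can replace it with a tangent line, which reduces to a weighted $\ell_1$ problem solved by soft-thresholding.

```python
def leaky_capped_penalty(w, lam, theta, alpha):
    """Assumed penalty: slope lam for |w| < theta, slope alpha*lam beyond it."""
    a = abs(w)
    return lam * min(a, theta) + alpha * lam * max(a - theta, 0.0)


def soft_threshold(x, t):
    """Closed-form minimizer of 0.5*(w - x)**2 + t*|w|."""
    if x > t:
        return x - t
    if x < -t:
        return x + t
    return 0.0


def mm_denoise(y, lam, theta, alpha, iters=50):
    """Majorization-minimization on the toy objective
    sum_i 0.5*(w_i - y_i)**2 + leaky_capped_penalty(w_i).

    Each iteration linearizes the concave penalty at the current iterate
    (majorization), then solves the resulting weighted l1 problem exactly
    with soft-thresholding (minimization).
    """
    w = list(y)
    for _ in range(iters):
        # Slope of the tangent line at the current |w_i|.
        slopes = [lam if abs(wi) < theta else alpha * lam for wi in w]
        w = [soft_threshold(yi, si) for yi, si in zip(y, slopes)]
    return w


if __name__ == "__main__":
    y = [0.05, 3.0]
    print(mm_denoise(y, lam=0.5, theta=1.0, alpha=0.1))
```

In this toy setting the small entry is driven to zero, while the large entry is shrunk only by `alpha * lam` rather than by `lam` as plain $\ell_1$ soft-thresholding would do, which illustrates the "strong sparsity with controllable bias" behavior described above.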