CV · Jun 16, 2020

Cogradient Descent for Bilinear Optimization

arXiv:2006.09142v1 · 15 citations
AI Analysis

This paper addresses a bottleneck in bilinear optimization for machine learning practitioners, offering a method to improve training efficiency and performance in tasks with sparsity constraints.

The paper tackles bilinear optimization, where conventional methods treat coupled variables independently, leading to vanishing gradients and insufficient training. The authors propose the Cogradient Descent algorithm (CoGD), which synchronizes gradient descent for the coupled variables. Their experiments show that it improves on the state of the art by a significant margin in applications such as image reconstruction, inpainting, and network pruning.

Conventional learning methods simplify the bilinear model by regarding two intrinsically coupled factors independently, which degrades the optimization procedure. One reason lies in the insufficient training due to the asynchronous gradient descent, which results in vanishing gradients for the coupled variables. In this paper, we introduce a Cogradient Descent algorithm (CoGD) to address the bilinear problem, based on a theoretical framework to coordinate the gradient of hidden variables via a projection function. We solve one variable by considering its coupling relationship with the other, leading to a synchronous gradient descent to facilitate the optimization procedure. Our algorithm is applied to solve problems with one variable under the sparsity constraint, which is widely used in the learning paradigm. We validate our CoGD considering an extensive set of applications including image reconstruction, inpainting, and network pruning. Experiments show that it improves the state-of-the-art by a significant margin.
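The coupling described in the abstract can be made concrete with a small NumPy sketch. It assumes a dictionary-learning style objective, f(A, x) = 0.5 * ||b - A x||^2 with an L1 penalty on x, chosen here purely for illustration. The names `bilinear_grads`, `soft_threshold`, `lr`, and `lam` are hypothetical. This is not the paper's CoGD update. It only shows the failure mode the abstract refers to: once the sparsity constraint drives a coefficient of x to zero, the gradient for the matching column of A vanishes, so that column stops being trained.

```python
import numpy as np

def bilinear_grads(A, x, b):
    """Gradients of f(A, x) = 0.5 * ||b - A @ x||^2 with respect to A and x."""
    r = A @ x - b              # residual, shape (m,)
    grad_A = np.outer(r, x)    # df/dA, shape (m, n); column j is r * x[j]
    grad_x = A.T @ r           # df/dx, shape (n,)
    return grad_A, grad_x

def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1, a common way to impose sparsity."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

rng = np.random.default_rng(0)
A = rng.normal(size=(5, 3))        # illustrative "dictionary" factor
b = rng.normal(size=5)             # illustrative observation
x = np.array([0.7, 0.0, -0.4])     # x[1] already zeroed by the sparsity constraint

grad_A, grad_x = bilinear_grads(A, x, b)

# Column 1 of A receives no gradient because it is multiplied by x[1] == 0.
col1_grad_norm = np.linalg.norm(grad_A[:, 1])

# One independent (asynchronous) proximal step on x, ignoring its coupling with A.
lr, lam = 0.1, 0.05                # assumed step size and L1 weight
x_next = soft_threshold(x - lr * grad_x, lr * lam)
```

Here `col1_grad_norm` is exactly zero, while `grad_x[1]` is generally nonzero: x can still move, but the coupled column of A cannot respond until x[1] leaves zero. The abstract's proposal is to coordinate the two updates through a projection function rather than stepping each variable independently as this sketch does.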

