LG · AI · Jun 17, 2022

Debugging using Orthogonal Gradient Descent

arXiv:2206.08489v1 · 1 citation · h-index: 43
Originality: Incremental advance
AI Analysis

This paper addresses the practical problem of debugging trained neural networks, but the advance is incremental because it builds on existing continual learning methods.

The paper tackles the problem of correcting faulty behavior in a trained neural network without retraining it from scratch. Two simple experiments on MNIST show that the method can unlearn undesirable behavior and relearn appropriate behavior while retaining the model's general performance.

In this report we consider the following problem: given a trained model that is partially faulty, can we correct its behaviour without having to train the model from scratch? In other words, can we "debug" neural networks in the way we address bugs in our mathematical models and standard computer code? We base our approach on the hypothesis that debugging can be treated as a two-task continual learning problem. In particular, we employ a modified version of a continual learning algorithm called Orthogonal Gradient Descent (OGD) to demonstrate, via two simple experiments on the MNIST dataset, that we can in fact unlearn the undesirable behaviour while retaining the general performance of the model, and that we can additionally relearn the appropriate behaviour, both without having to train the model from scratch.
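As a rough illustration of the idea behind OGD as it is commonly described, the sketch below projects a new-task gradient onto the orthogonal complement of gradients stored from the first task, so that an update on the second task perturbs the first task's behaviour as little as possible. It is a minimal sketch under stated assumptions, not the report's modified algorithm: the helper names (`orthonormalize`, `project_orthogonal`, `ogd_step`), the flat-list parameter representation, and the learning rate are all hypothetical choices made for illustration.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def orthonormalize(vectors, tol=1e-10):
    """Gram-Schmidt: build an orthonormal basis from stored first-task gradients.

    Vectors that are (numerically) linearly dependent on earlier ones are skipped.
    """
    basis = []
    for v in vectors:
        w = [float(x) for x in v]
        for b in basis:
            c = dot(w, b)
            w = [wi - c * bi for wi, bi in zip(w, b)]
        norm = math.sqrt(dot(w, w))
        if norm > tol:
            basis.append([wi / norm for wi in w])
    return basis


def project_orthogonal(grad, basis):
    """Remove from grad every component lying in the span of the orthonormal basis."""
    g = [float(x) for x in grad]
    for b in basis:
        c = dot(g, b)
        g = [gi - c * bi for gi, bi in zip(g, b)]
    return g


def ogd_step(params, grad, basis, lr=0.1):
    """One gradient-descent step using the projected gradient (hypothetical helper)."""
    g = project_orthogonal(grad, basis)
    return [p - lr * gi for p, gi in zip(params, g)]


# Toy usage: two stored gradients span the first two coordinates,
# so only the third coordinate of the new gradient survives projection.
stored = [[1.0, 0.0, 0.0], [1.0, 1.0, 0.0]]
basis = orthonormalize(stored)
new_params = ogd_step([0.0, 0.0, 0.0], [1.0, 2.0, 3.0], basis, lr=0.1)
```

In the two-task debugging framing, the first task would correspond to the behaviour to be retained and the second to the behaviour being unlearned or relearned; how the report modifies this projection is not specified in the abstract.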

