LG | CV | ML | Feb 10, 2021

Input Similarity from the Neural Network Perspective

arXiv:2102.05262v1 | 100 citations
AI Analysis

This work addresses the challenge of understanding how neural networks perceive input similarity. The contribution is incremental but useful for statistical analysis and optimization in machine learning.

The paper tackles the problem of defining input similarity from a neural network perspective, explaining an auto-denoising phenomenon in image registration where networks achieve high accuracy despite noisy labels, and proposes a method to estimate sample density and improve training speed.

We first exhibit a multimodal image registration task, for which a neural network trained on a dataset with noisy labels reaches almost perfect accuracy, far beyond noise variance. This surprising auto-denoising phenomenon can be explained as a noise averaging effect over the labels of similar input examples. This effect theoretically grows with the number of similar examples; the question is then to define and estimate the similarity of examples. We express a proper definition of similarity, from the neural network perspective, i.e. we quantify how undissociable two inputs $A$ and $B$ are, taking a machine learning viewpoint: how much a parameter variation designed to change the output for $A$ would impact the output for $B$ as well? We study the mathematical properties of this similarity measure, and show how to use it on a trained network to estimate sample density, in low complexity, enabling new types of statistical analysis for neural networks. We analyze data by retrieving samples perceived as similar by the network, and are able to quantify the denoising effect without requiring true labels. We also propose, during training, to enforce that examples known to be similar should also be seen as similar by the network, and notice speed-up training effects for certain datasets.
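The abstract does not state the similarity formula, so the following is only a minimal sketch under one natural reading of it. To first order, a gradient step that changes the output for $A$ changes the output for $B$ in proportion to the inner product of the two parameter gradients, so the sketch uses the normalized inner product (cosine) of those gradients as the similarity. The tiny one-hidden-layer tanh network with scalar output, and the names `init_params`, `param_gradient`, and `similarity`, are illustrative assumptions, not taken from the paper or its repositories.

```python
import math
import random


def init_params(n_in, n_hidden, seed=0):
    """Random parameters for a toy net f(x) = w2 . tanh(W1 x + b1) + b2 (hypothetical)."""
    rng = random.Random(seed)
    W1 = [[rng.gauss(0.0, 1.0) for _ in range(n_in)] for _ in range(n_hidden)]
    b1 = [0.0] * n_hidden
    w2 = [rng.gauss(0.0, 1.0) for _ in range(n_hidden)]
    b2 = 0.0
    return W1, b1, w2, b2


def param_gradient(params, x):
    """Gradient of the scalar output f(x) with respect to all parameters, flattened."""
    W1, b1, w2, _b2 = params
    # Hidden activations h_j = tanh(W1[j] . x + b1[j])
    h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
         for row, b in zip(W1, b1)]
    # df/d(pre-activation_j) = w2[j] * (1 - h_j^2)
    delta = [w * (1.0 - hj * hj) for w, hj in zip(w2, h)]
    grad = []
    for d in delta:
        grad.extend(d * xi for xi in x)  # df/dW1[j][i]
    grad.extend(delta)                   # df/db1[j]
    grad.extend(h)                       # df/dw2[j]
    grad.append(1.0)                     # df/db2
    return grad


def similarity(params, a, b):
    """Cosine of the parameter gradients at inputs a and b (assumed similarity measure)."""
    ga = param_gradient(params, a)
    gb = param_gradient(params, b)
    dot = sum(u * v for u, v in zip(ga, gb))
    norm_a = math.sqrt(sum(u * u for u in ga))
    norm_b = math.sqrt(sum(v * v for v in gb))
    # Norms are never zero here because df/db2 = 1 for every input.
    return dot / (norm_a * norm_b)


params = init_params(n_in=3, n_hidden=8)
a = [0.5, -1.0, 2.0]
b = [0.5, -1.0, 2.1]   # small perturbation of a
c = [-3.0, 4.0, 0.0]   # unrelated input

print("sim(a, a) =", similarity(params, a, a))
print("sim(a, b) =", similarity(params, a, b))
print("sim(a, c) =", similarity(params, a, c))
```

Under this reading, a value near 1 means that any first-order parameter change tuned for one input moves the other output almost identically, which is the sense in which the network cannot dissociate the two inputs.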

Code Implementations (2 repos)
