CV · Nov 7, 2022

Can neural networks extrapolate? Discussion of a theorem by Pedro Domingos

arXiv:2211.03566v1 · 11 citations · h-index: 67
Originality: Synthesis-oriented
AI Analysis

This work examines how well trained neural networks interpolate and extrapolate, a question on which current theory is limited, and offers machine learning researchers incremental insight into the kernel-like behavior of these networks.

The paper discusses a theorem by Domingos stating that every model learned by gradient descent is approximately a kernel machine, which limits its extrapolation capabilities. It extends this result to the discrete case and to vector-valued outputs, and illustrates the resulting limitation on a complex task: recovering a shape from its boundary.

Neural networks trained on large datasets by minimizing a loss have become the state-of-the-art approach for resolving data science problems, particularly in computer vision, image processing and natural language processing. In spite of their striking results, our theoretical understanding of how neural networks operate is limited. In particular, what are the interpolation capabilities of trained neural networks? In this paper we discuss a theorem of Domingos stating that "every machine learned by continuous gradient descent is approximately a kernel machine". According to Domingos, this fact leads to the conclusion that all machines trained on data are mere kernel machines. We first extend Domingos' result to the discrete case and to networks with vector-valued output. We then study its relevance and significance on simple examples. We find that in simple cases, the "neural tangent kernel" arising in Domingos' theorem does provide understanding of the networks' predictions. Furthermore, when the task given to the network grows in complexity, the interpolation capability of the network can be effectively explained by Domingos' theorem, and therefore is limited. We illustrate this fact on a classic perception theory problem: recovering a shape from its boundary.
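To make the kernel view concrete, here is a minimal sketch of the single-step identity that underlies results of this kind: after one small gradient descent step, the change in a prediction at a query point is, to first order, a weighted sum of tangent-kernel values between that point and the training points. The toy model, the squared loss, and every name below (`f`, `grad_f`, `tangent_kernel`, `x_query`, the step size) are assumptions chosen for illustration, not the paper's setup. Domingos' theorem concerns a kernel accumulated along the whole training path, which this one-step sketch does not reproduce.

```python
import numpy as np

rng = np.random.default_rng(0)
H = 8  # number of hidden units (illustrative choice)

# Toy model (assumption): f(x) = sum_j a_j * tanh(w_j * x + b_j), scalar input.
params = rng.normal(size=(3, H))  # rows are a, w, b


def f(params, x):
    a, w, b = params
    return np.sum(a * np.tanh(w * x + b))


def grad_f(params, x):
    """Gradient of f(x) with respect to all parameters, flattened as [a, w, b]."""
    a, w, b = params
    t = np.tanh(w * x + b)
    s = a * (1.0 - t**2)  # chain rule through tanh
    return np.concatenate([t, s * x, s])


def tangent_kernel(params, x1, x2):
    """Inner product of parameter gradients at two inputs."""
    return grad_f(params, x1) @ grad_f(params, x2)


# Hypothetical training data and a query point outside the training range.
x_train = np.linspace(-1.0, 1.0, 5)
y_train = np.sin(3.0 * x_train)
x_query = 1.5
lr = 1e-6  # small step so the first-order approximation is accurate

# One gradient descent step on L = 0.5 * sum_i (f(x_i) - y_i)^2.
residuals = np.array([f(params, xi) - yi for xi, yi in zip(x_train, y_train)])
grad_loss = sum(r * grad_f(params, xi) for r, xi in zip(residuals, x_train))
new_params = params - lr * grad_loss.reshape(params.shape)

# Actual change in the prediction versus the kernel-weighted sum.
actual_change = f(new_params, x_query) - f(params, x_query)
kernel_prediction = -lr * sum(
    r * tangent_kernel(params, x_query, xi) for r, xi in zip(residuals, x_train)
)

print("actual change:    ", actual_change)
print("kernel prediction:", kernel_prediction)
```

The two printed values should agree up to a second-order term in the step size. In this picture, the prediction at a point far from the data moves only through its kernel similarity to the training points, which is the intuition behind the limited extrapolation discussed above.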

