NE · LG · ML · Jun 19, 2017

Unsure When to Stop? Ask Your Semantic Neighbors

arXiv:1706.06195v1 · 18 citations
Originality: Incremental advance
AI Analysis

This paper addresses the risk of overfitting in iterative learning algorithms such as GSGP and SLM, offering an incremental improvement for high-dimensional regression tasks.

The paper tackles the problem of deciding when to stop iterative supervised learning before it overfits. It proposes two semantic stopping criteria based on information from the semantic neighborhood. On real-world regression datasets, these criteria achieve competitive generalization and keep the search computationally efficient: neural networks evolve in under 3 seconds on average, and GP trees in at most 10 seconds.

In iterative supervised learning algorithms it is common to reach a point in the search where no further induction seems to be possible with the available data. If the search is continued beyond this point, the risk of overfitting increases significantly. Following the recent developments in inductive semantic stochastic methods, this paper studies the feasibility of using information gathered from the semantic neighborhood to decide when to stop the search. Two semantic stopping criteria are proposed and experimentally assessed in Geometric Semantic Genetic Programming (GSGP) and in the Semantic Learning Machine (SLM) algorithm (the equivalent algorithm for neural networks). The experiments are performed on real-world high-dimensional regression datasets. The results show that the proposed semantic stopping criteria are able to detect stopping points that result in a competitive generalization for both GSGP and SLM. This approach also yields computationally efficient algorithms as it allows the evolution of neural networks in less than 3 seconds on average, and of GP trees in at most 10 seconds. The usage of the proposed semantic stopping criteria in conjunction with the computation of optimal mutation/learning steps also results in small trees and neural networks.
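
The abstract does not define the two proposed criteria, so the following is only a minimal sketch of the general idea of asking the semantic neighborhood whether to continue. It assumes a hypothetical rule: sample neighbors of the current solution in semantic space (the vector of outputs on the training data) and stop once the share of neighbors that still lower the training error falls below a threshold. The names `improving_fraction` and `search`, the fixed mutation step, and the threshold value are all illustrative assumptions, not the paper's criteria or the GSGP/SLM operators.

```python
import random


def mse(outputs, targets):
    """Mean squared error between two equal-length sequences."""
    return sum((o - t) ** 2 for o, t in zip(outputs, targets)) / len(targets)


def improving_fraction(parent_error, neighbor_errors):
    """Share of sampled neighbors with lower training error than the parent."""
    better = sum(1 for e in neighbor_errors if e < parent_error)
    return better / len(neighbor_errors)


def search(targets, n_neighbors=50, step=0.1, threshold=0.25,
           max_iters=1000, seed=0):
    """Hill-climb in semantic space; stop when few neighbors still improve.

    Hypothetical stopping rule for illustration only.
    """
    rng = random.Random(seed)
    current = [0.0] * len(targets)  # semantics = outputs on the training data
    for iteration in range(max_iters):
        current_error = mse(current, targets)
        # Sample the semantic neighborhood with fixed-step random perturbations
        # (an assumption; the paper pairs its criteria with optimal steps).
        neighbors = [
            [c + step * rng.uniform(-1.0, 1.0) for c in current]
            for _ in range(n_neighbors)
        ]
        errors = [mse(n, targets) for n in neighbors]
        if improving_fraction(current_error, errors) < threshold:
            return current, iteration  # the neighborhood suggests stopping
        best = min(range(n_neighbors), key=errors.__getitem__)
        if errors[best] < current_error:
            current = neighbors[best]
    return current, max_iters


if __name__ == "__main__":
    final, iterations = search([1.0, -2.0, 0.5])
    print(iterations, mse(final, [1.0, -2.0, 0.5]))
```

In this toy setting roughly half of the sampled neighbors improve while the search is far from the target, and that share shrinks as the search closes in, which is what triggers the stop. A real criterion would also need to account for noise in the data, which this sketch ignores.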

