LG, ML | May 27, 2020

Synthetic Petri Dish: A Novel Surrogate Model for Rapid Architecture Search

arXiv:2005.13092v1 | 6 citations
Originality: Incremental advance
AI Analysis

This work targets the compute cost of Neural Architecture Search (NAS), which matters to machine learning researchers and practitioners. The advance is incremental because it builds on existing surrogate-model concepts.

The paper tackles the compute-intensive process of Neural Architecture Search (NAS) by proposing the Synthetic Petri Dish model, which evaluates architectural motifs by training them in very small networks on a few learned synthetic data samples. According to the paper's experiments, this predicts the performance of new motifs with significantly higher accuracy than neural network-based models that parse motif structure, especially when little ground-truth data is available.

Neural Architecture Search (NAS) explores a large space of architectural motifs -- a compute-intensive process that often involves ground-truth evaluation of each motif by instantiating it within a large network, and training and evaluating the network with thousands of domain-specific data samples. Inspired by how biological motifs such as cells are sometimes extracted from their natural environment and studied in an artificial Petri dish setting, this paper proposes the Synthetic Petri Dish model for evaluating architectural motifs. In the Synthetic Petri Dish, architectural motifs are instantiated in very small networks and evaluated using very few learned synthetic data samples (to effectively approximate performance in the full problem). The relative performance of motifs in the Synthetic Petri Dish can substitute for their ground-truth performance, thus accelerating the most expensive step of NAS. Unlike other neural network-based prediction models that parse the structure of the motif to estimate its performance, the Synthetic Petri Dish predicts motif performance by training the actual motif in an artificial setting, thus deriving predictions from its true intrinsic properties. Experiments in this paper demonstrate that the Synthetic Petri Dish can therefore predict the performance of new motifs with significantly higher accuracy, especially when insufficient ground truth data is available. Our hope is that this work can inspire a new research direction in studying the performance of extracted components of models in an alternative controlled setting.
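The claim that relative performance in the Synthetic Petri Dish can substitute for ground-truth performance can be checked by comparing the two rankings of the same motifs. The sketch below is an illustration only, not the paper's code: the helper names `ranks` and `spearman` are hypothetical, the scores are made-up placeholders rather than results from the paper, and it assumes all scores are distinct (no tie handling).

```python
from typing import List, Sequence


def ranks(values: Sequence[float]) -> List[int]:
    """Return the 0-based rank of each value (assumes no ties)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, idx in enumerate(order):
        result[idx] = rank
    return result


def spearman(a: Sequence[float], b: Sequence[float]) -> float:
    """Spearman rank correlation for two equal-length, tie-free score lists."""
    n = len(a)
    if n != len(b) or n < 2:
        raise ValueError("need two score lists of equal length >= 2")
    ra, rb = ranks(a), ranks(b)
    d2 = sum((x - y) ** 2 for x, y in zip(ra, rb))
    return 1 - 6 * d2 / (n * (n * n - 1))


# Placeholder scores for four hypothetical motifs (NOT from the paper).
petri_dish_scores = [0.61, 0.55, 0.70, 0.58]    # cheap: tiny network, synthetic data
ground_truth_scores = [0.92, 0.90, 0.95, 0.91]  # expensive: full network, real data

rho = spearman(petri_dish_scores, ground_truth_scores)

# Index of the motif that the cheap evaluation would select.
best_motif = max(range(len(petri_dish_scores)), key=lambda i: petri_dish_scores[i])
```

A correlation near 1 means the cheap evaluation orders motifs the same way as the expensive one, which is all a search procedure needs in order to pick the next candidate. The absolute scores do not have to match.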

Code Implementations: 1 repo