CARL: Causality-guided Architecture Representation Learning for an Interpretable Performance Predictor
This work addresses distribution shift in NAS performance predictors, a known bottleneck for researchers and practitioners, and proposes a novel method rather than an incremental improvement.
The paper tackles the poor generalization of neural architecture search (NAS) performance predictors by proposing CARL, a causality-guided method that separates critical from redundant architecture features and achieves state-of-the-art accuracy, for example 97.67% top-1 on CIFAR-10 with DARTS.
Performance predictors have emerged as a promising method to accelerate the evaluation stage of neural architecture search (NAS). These predictors estimate the performance of unseen architectures by learning from the correlation between a small set of trained architectures and their performance. However, most existing predictors ignore the inherent distribution shift between limited training samples and diverse test samples. Hence, they tend to learn spurious correlations as shortcuts to predictions, leading to poor generalization. To address this, we propose a Causality-guided Architecture Representation Learning (CARL) method aiming to separate critical (causal) and redundant (non-causal) features of architectures for generalizable architecture performance prediction. Specifically, we employ a substructure extractor to split the input architecture into critical and redundant substructures in the latent space. Then, we generate multiple interventional samples by pairing critical representations with diverse redundant representations to prioritize critical features. Extensive experiments on five NAS search spaces demonstrate the state-of-the-art accuracy and superior interpretability of CARL. For instance, CARL achieves 97.67% top-1 accuracy on CIFAR-10 using DARTS.
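The interventional sampling step described above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that each architecture has already been encoded into a critical and a redundant latent vector, that redundant vectors are borrowed from other architectures in the same batch, and that the two parts are combined by concatenation; the abstract specifies none of these details. The function name `make_interventional_samples` and its parameters are hypothetical.

```python
import random


def make_interventional_samples(critical, redundant, num_interventions, seed=0):
    """Pair each critical vector with redundant vectors from other architectures.

    critical[i] and redundant[i] are the (hypothetical) latent vectors of
    architecture i. Returns a list of (i, j, combined) tuples, where i indexes
    the critical source and j the redundant source.
    """
    if len(critical) != len(redundant):
        raise ValueError("critical and redundant must have the same length")

    rng = random.Random(seed)
    n = len(critical)
    samples = []
    for i, crit in enumerate(critical):
        # Candidate redundant parts come from every other architecture.
        others = [j for j in range(n) if j != i]
        k = min(num_interventions, len(others))
        for j in rng.sample(others, k):
            # Assumption: the two parts are combined by concatenation.
            combined = list(crit) + list(redundant[j])
            samples.append((i, j, combined))
    return samples


# Toy batch of three architectures with 2-d critical and 1-d redundant parts.
critical = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
redundant = [[0.1], [0.2], [0.3]]
samples = make_interventional_samples(critical, redundant, num_interventions=2)
```

Under the causal reading given in the abstract, each sample would plausibly keep the performance label of architecture `i`, since only the redundant part has been swapped. Training a predictor to give consistent outputs across such samples is one way to prioritize critical features; the actual training objective used by CARL is not stated here.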