Weight-Entanglement Meets Gradient-Based Neural Architecture Search
This work bridges two sub-communities in NAS by offering a method that combines the efficiency of gradient-based search with the memory savings of weight-entanglement. The contribution is incremental, however, since it adapts existing techniques rather than introducing a new paradigm.
The paper tackles the incompatibility between weight-entanglement and gradient-based neural architecture search (NAS) by proposing a scheme that adapts gradient-based methods to weight-entangled spaces. This enables a comparative analysis, which shows that the integration preserves the memory efficiency of weight-entangled spaces while retaining the benefits of gradient-based methods.
Weight sharing is a fundamental concept in neural architecture search (NAS), enabling gradient-based methods to explore cell-based architectural spaces significantly faster than traditional black-box approaches. In parallel, weight-entanglement has emerged as a technique for more intricate parameter sharing among architectures in macro-architectural spaces. Since weight-entanglement is not directly compatible with gradient-based NAS methods, these two paradigms have largely developed independently in separate sub-communities. This paper aims to bridge the gap between these sub-communities by proposing a novel scheme to adapt gradient-based methods for weight-entangled spaces. This enables us to conduct an in-depth comparative assessment and analysis of the performance of gradient-based NAS in weight-entangled search spaces. Our findings reveal that this integration of weight-entanglement and gradient-based NAS retains the benefits of gradient-based methods while preserving the memory efficiency of weight-entangled spaces. The code for our work is openly accessible at https://github.com/automl/TangleNAS.
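
To make the memory argument concrete, the following is a minimal NumPy sketch of one way a gradient-based mixture could be computed over weight-entangled candidates. It is an illustration under stated assumptions, not necessarily the scheme proposed in the paper: the candidates are linear layers of different output widths that reuse the leading rows of a single shared matrix, and all names (`sliced_weight`, `mixture_per_candidate`, `mixture_superimposed`, `alpha`) are hypothetical. The sketch compares a per-candidate mixture, which runs one forward pass per candidate, with a superimposed mixture, which first combines the sliced weights and then runs a single forward pass.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over architecture parameters.
    e = np.exp(x - np.max(x))
    return e / e.sum()

def sliced_weight(shared_w, out_dim):
    # Weight-entanglement (assumed form): a smaller candidate reuses the
    # first `out_dim` rows of the shared matrix; the rest is zero-padded
    # so that all candidates have the same shape.
    w = np.zeros_like(shared_w)
    w[:out_dim] = shared_w[:out_dim]
    return w

def mixture_per_candidate(x, shared_w, out_dims, alpha):
    # One forward pass per candidate, then a softmax-weighted sum of the
    # outputs. The number of intermediate outputs grows with the number
    # of candidates.
    probs = softmax(alpha)
    return sum(p * (sliced_weight(shared_w, d) @ x)
               for p, d in zip(probs, out_dims))

def mixture_superimposed(x, shared_w, out_dims, alpha):
    # Combine the sliced weights first, then run a single forward pass.
    probs = softmax(alpha)
    mixed_w = sum(p * sliced_weight(shared_w, d)
                  for p, d in zip(probs, out_dims))
    return mixed_w @ x

rng = np.random.default_rng(0)
shared_w = rng.normal(size=(8, 4))    # largest candidate: 8 output units
x = rng.normal(size=4)
out_dims = [2, 4, 8]                  # hypothetical candidate widths
alpha = np.array([0.1, -0.3, 0.5])    # hypothetical architecture parameters

y_a = mixture_per_candidate(x, shared_w, out_dims, alpha)
y_b = mixture_superimposed(x, shared_w, out_dims, alpha)
```

Because a linear layer is linear in its weights, the two mixtures agree in this sketch; the equivalence would not hold in general once a nonlinearity is applied inside each candidate before mixing.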