TF-NAS: Rethinking Three Search Freedoms of Latency-Constrained Differentiable Neural Architecture Search
This work addresses the challenge of efficiently automating neural architecture design for latency-constrained applications, offering an incremental improvement over existing differentiable NAS methods.
The paper addresses two weaknesses of latency-constrained differentiable neural architecture search (NAS): suboptimal accuracy and large jitter around the target latency. It proposes TF-NAS, which rethinks the operation-level, depth-level, and width-level search freedoms to meet the latency constraint precisely while keeping accuracy high. On ImageNet, TF-NAS-A reaches 76.9% top-1 accuracy, a state-of-the-art result at lower latency, with a total search time of only 1.8 days on a single Titan RTX GPU.
With the flourishing of differentiable neural architecture search (NAS), automatically searching for latency-constrained architectures offers a new way to reduce human labor and expertise. However, the searched architectures are usually suboptimal in accuracy and may show large jitter around the target latency. In this paper, we rethink three freedoms of differentiable NAS, i.e., operation-level, depth-level, and width-level, and propose a novel method, named Three-Freedom NAS (TF-NAS), to achieve both good classification accuracy and a precise latency constraint. At the operation level, we present a bi-sampling search algorithm to moderate operation collapse. At the depth level, we introduce a sink-connecting search space that ensures mutual exclusion between skip and the other candidate operations and eliminates architecture redundancy. At the width level, we propose an elasticity-scaling strategy that achieves a precise latency constraint in a progressively fine-grained manner. Experiments on ImageNet demonstrate the effectiveness of TF-NAS. In particular, our searched TF-NAS-A obtains 76.9% top-1 accuracy, achieving state-of-the-art results with less latency. The total search time is only 1.8 days on 1 Titan RTX GPU. Code is available at https://github.com/AberHu/TF-NAS.
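To make the notion of a latency constraint in differentiable NAS concrete, the following is a minimal sketch of one common formulation, not the TF-NAS implementation. It assumes each layer has a vector of architecture logits over candidate operations and a lookup table of measured per-operation latencies. The function names, the softmax relaxation, and the relative-deviation penalty are illustrative assumptions and are not taken from the paper or its code.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def expected_latency(arch_logits, latency_table):
    """Expected network latency under a softmax relaxation (assumed form).

    arch_logits:   one list of logits per layer, one logit per candidate op.
    latency_table: one list of latencies (ms) per layer, aligned with the logits.
    """
    total = 0.0
    for logits, lats in zip(arch_logits, latency_table):
        probs = softmax(logits)
        # Each layer contributes the probability-weighted latency of its ops.
        total += sum(p * lat for p, lat in zip(probs, lats))
    return total


def latency_penalty(arch_logits, latency_table, target_ms):
    """Relative deviation from the target latency (hypothetical penalty).

    Penalizing deviation in both directions pushes the search toward the
    target instead of merely below it.
    """
    lat = expected_latency(arch_logits, latency_table)
    return abs(lat / target_ms - 1.0)


if __name__ == "__main__":
    logits = [[0.0, 0.0], [0.0, 0.0]]   # two layers, two candidate ops each
    table = [[1.0, 3.0], [2.0, 4.0]]    # made-up latencies in ms
    print(expected_latency(logits, table))
    print(latency_penalty(logits, table, target_ms=4.0))
```

In a real search, such a penalty would be weighted and added to the classification loss, and the logits would be tensors updated by gradient descent. Plain Python lists are used here only to keep the sketch self-contained.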