CV · Mar 25, 2021

AutoLoss-Zero: Searching Loss Functions from Scratch for Generic Tasks

Hao Li, Tianwen Fu, Jifeng Dai, Hongsheng Li, Gao Huang, Xizhou Zhu

arXiv:2103.14026v1 · 32 citations

Originality: Incremental advance

AI Analysis

This work addresses the need for automated loss function design in deep learning, reducing reliance on human expertise and task-specific heuristics. It is an incremental advance in that it builds on prior work in automated design of network components.

The paper tackles the problem of automatically designing loss functions for generic tasks with various evaluation metrics, proposing AutoLoss-Zero, a framework that searches loss functions from scratch and achieves performance on par with or superior to existing loss functions in computer vision tasks.

Significant progress has been achieved in automating the design of various components in deep networks. However, the automatic design of loss functions for generic tasks with various evaluation metrics remains under-investigated. Previous works on handcrafting loss functions heavily rely on human expertise, which limits their extendibility. Meanwhile, existing efforts on searching loss functions mainly focus on specific tasks and particular metrics, with task-specific heuristics. Whether such works can be extended to generic tasks is not verified and questionable. In this paper, we propose AutoLoss-Zero, the first general framework for searching loss functions from scratch for generic tasks. Specifically, we design an elementary search space composed only of primitive mathematical operators to accommodate the heterogeneous tasks and evaluation metrics. A variant of the evolutionary algorithm is employed to discover loss functions in the elementary search space. A loss-rejection protocol and a gradient-equivalence-check strategy are developed so as to improve the search efficiency, which are applicable to generic tasks. Extensive experiments on various computer vision tasks demonstrate that our searched loss functions are on par with or superior to existing loss functions, which generalize well to different datasets and networks. Code shall be released.
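To make the ideas in the abstract concrete, here is a minimal Python sketch of a search space built from primitive operators, with a mutation step, a rejection filter, and a gradient-based duplicate check. It is not the authors' implementation. The operator set, the probe points, the sign-based rejection rule, and all function names (`random_expr`, `mutate`, `passes_rejection`, `grad_signature`, `propose`) are assumptions made for illustration, and the fitness evaluation that an evolutionary search would need (training a network with each candidate loss) is left out.

```python
import math
import random

# Hypothetical primitive operators; the paper's actual operator set may differ.
UNARY = {
    "neg": lambda x: -x,
    "log": lambda x: math.log(abs(x) + 1e-12),
    "sqrt": lambda x: math.sqrt(abs(x) + 1e-12),
    "tanh": math.tanh,
}
BINARY = {
    "add": lambda a, b: a + b,
    "mul": lambda a, b: a * b,
    "div": lambda a, b: a / (b if abs(b) > 1e-12 else 1e-12),
}
LEAVES = ("y_hat", "y", "one")  # prediction, target, constant 1

# Fixed (prediction, target) probe points used by both checks below (assumed).
PROBES = [(0.1, 0.0), (0.3, 1.0), (0.7, 1.0), (0.9, 0.0)]


def random_expr(rng, depth=3):
    """Sample a random expression tree as nested tuples."""
    if depth == 0 or rng.random() < 0.3:
        return rng.choice(LEAVES)
    if rng.random() < 0.5:
        return (rng.choice(sorted(UNARY)), random_expr(rng, depth - 1))
    return (rng.choice(sorted(BINARY)),
            random_expr(rng, depth - 1), random_expr(rng, depth - 1))


def evaluate(expr, y_hat, y):
    """Evaluate an expression tree for one prediction/target pair."""
    if expr == "y_hat":
        return y_hat
    if expr == "y":
        return y
    if expr == "one":
        return 1.0
    op, *args = expr
    vals = [evaluate(a, y_hat, y) for a in args]
    return UNARY[op](*vals) if len(vals) == 1 else BINARY[op](*vals)


def mutate(rng, expr, depth=2):
    """Replace one randomly chosen subtree with a fresh random subtree."""
    if isinstance(expr, str) or rng.random() < 0.3:
        return random_expr(rng, depth)
    op, *args = expr
    i = rng.randrange(len(args))
    args[i] = mutate(rng, args[i], depth)
    return (op, *args)


def grad(expr, y_hat, y, eps=1e-6):
    """Central-difference gradient of the loss with respect to y_hat."""
    return (evaluate(expr, y_hat + eps, y) - evaluate(expr, y_hat - eps, y)) / (2 * eps)


def passes_rejection(expr):
    """Simplified stand-in for loss rejection: a descent step must move y_hat toward y."""
    return all(grad(expr, y_hat, y) * (y_hat - y) > 0 for y_hat, y in PROBES)


def grad_signature(expr, digits=4):
    """Rounded gradients at the probes; equal signatures are treated as duplicates."""
    return tuple(round(grad(expr, y_hat, y), digits) for y_hat, y in PROBES)


def propose(rng, parents, n_children=20, max_tries=2000):
    """Mutate parents, keeping only non-rejected children with unseen gradients."""
    seen = {grad_signature(p) for p in parents}
    children = []
    for _ in range(max_tries):
        if len(children) == n_children:
            break
        child = mutate(rng, rng.choice(parents))
        try:
            sig, ok = grad_signature(child), passes_rejection(child)
        except (OverflowError, ValueError, ZeroDivisionError):
            continue  # numerically invalid candidate
        if ok and sig not in seen:
            seen.add(sig)
            children.append(child)
    return children
```

In this sketch, a squared-error tree such as `("mul", ("add", "y_hat", ("neg", "y")), ("add", "y_hat", ("neg", "y")))` passes the rejection filter, while the bare leaf `"y_hat"` does not. Both filters are cheap because they only evaluate the candidate at a handful of points, which is the general motivation for screening candidates before any network training.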
