DRESS: Dynamic REal-time Sparse Subnets
This work addresses the need for efficient and adaptive neural networks on resource-limited edge devices, representing an incremental improvement over existing sub-network methods.
The paper tackles the problem of deploying deep neural networks on edge devices with varying resource constraints. It proposes DRESS, a training algorithm that samples multiple sparse sub-networks from a single backbone and trains them jointly, and it reports significantly higher accuracy than state-of-the-art sub-network methods on public vision datasets.
The limited and dynamically varying resources on edge devices motivate us to deploy an optimized deep neural network that can adapt its sub-networks to fit different resource constraints. However, existing works often build sub-networks by searching over different network architectures in a hand-crafted sampling space, which can not only result in subpar performance but also incur on-device re-configuration overhead. In this paper, we propose a novel training algorithm, Dynamic REal-time Sparse Subnets (DRESS). DRESS samples multiple sub-networks from the same backbone network through row-based unstructured sparsity and jointly trains these sub-networks in parallel with a weighted loss. DRESS also exploits strategies including parameter reuse and row-based fine-grained sampling to reduce storage consumption and enable efficient on-device adaptation. Extensive experiments on public vision datasets show that DRESS yields significantly higher accuracy than state-of-the-art sub-networks.
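The sampling and weighted-loss ideas in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions that the abstract does not state: weights are ranked by magnitude within each row, every row of a sub-network keeps the same number of non-zero entries, and "parameter reuse" is read as nesting, so that a sparser sub-network uses a subset of the weights of a denser one. The names `row_mask`, `nested_masks`, `apply_mask`, and `weighted_loss` are hypothetical.

```python
from typing import List, Sequence

Matrix = List[List[float]]
Mask = List[List[int]]


def row_mask(row: Sequence[float], sparsity: float) -> List[int]:
    """Build a 0/1 mask that keeps the largest-magnitude entries of one row.

    Magnitude ranking is an assumption made for this sketch.
    """
    keep = len(row) - int(round(sparsity * len(row)))
    # Rank column indices by descending magnitude, breaking ties by index.
    ranked = sorted(range(len(row)), key=lambda j: (-abs(row[j]), j))
    kept = set(ranked[:keep])
    return [1 if j in kept else 0 for j in range(len(row))]


def nested_masks(weights: Matrix, sparsities: Sequence[float]) -> List[Mask]:
    """Return one mask per sparsity level, built row by row.

    All levels share one per-row ranking, so the weights kept at a higher
    sparsity are always a subset of those kept at a lower sparsity.
    """
    return [[row_mask(row, s) for row in weights] for s in sparsities]


def apply_mask(weights: Matrix, mask: Mask) -> Matrix:
    """Zero out the weights that a sub-network does not use."""
    return [[w * m for w, m in zip(w_row, m_row)]
            for w_row, m_row in zip(weights, mask)]


def weighted_loss(losses: Sequence[float], coeffs: Sequence[float]) -> float:
    """Combine per-sub-network losses into a single training objective."""
    return sum(c * l for c, l in zip(coeffs, losses))


if __name__ == "__main__":
    backbone = [[0.5, -2.0, 0.1, 1.0],
                [3.0, 0.2, -0.4, 0.0]]
    masks = nested_masks(backbone, [0.25, 0.5, 0.75])
    for m in masks:
        print(apply_mask(backbone, m))
    # Illustrative loss values and coefficients, not taken from the paper.
    print(weighted_loss([1.0, 2.0, 3.0], [0.5, 0.25, 0.25]))
```

Under this reading, only the backbone weights and one small mask per sparsity level need to be stored, and switching sub-networks on the device amounts to selecting a different mask rather than loading a different architecture.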