A Survey on Dynamic Neural Networks: from Computer Vision to Multi-modal Sensor Fusion
This survey addresses the fragmented literature on Dynamic Neural Networks, providing a taxonomy and a curated repository to aid researchers and practitioners in Computer Vision and sensor fusion. Its contribution is incremental: it synthesizes existing work without introducing new methods.
The paper presents a comprehensive survey that synthesizes and unifies existing research on Dynamic Neural Networks in Computer Vision, highlighting their role in adapting computations to input complexity for model compression, and extends the discussion to multi-modal sensor fusion for benefits like adaptivity and noise reduction.
Model compression is essential for deploying large Computer Vision models on embedded devices. However, static optimization techniques (e.g. pruning and quantization) neglect the fact that different inputs have different complexities and therefore require different amounts of computation. Dynamic Neural Networks allow the amount of computation to be conditioned on the specific input. The current literature on the topic is extensive and fragmented. We present a comprehensive survey that synthesizes and unifies existing Dynamic Neural Networks research in the context of Computer Vision. Additionally, we provide a logical taxonomy based on which component of the network is adaptive: the output, the computation graph, or the input. Furthermore, we argue that Dynamic Neural Networks are particularly beneficial in the context of Sensor Fusion, where they offer better adaptivity, noise reduction, and information prioritization. We present preliminary works in this direction. We complement this survey with a curated repository listing all the surveyed papers, each with a brief summary of the solution and the code base when available: https://github.com/DTU-PAS/awesome-dynn-for-cv
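To make the idea of input-conditioned computation concrete, the following is a minimal sketch of confidence-based early exiting, one common way of making the output of a network adaptive. It is an illustration under stated assumptions and is not taken from the survey or its repository: the names `softmax`, `early_exit_predict`, `stages`, and `threshold`, as well as the toy scalar "network", are all hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def early_exit_predict(x, stages, threshold=0.9):
    """Run (block, exit_head) stages in order and stop at the first exit
    whose top class probability reaches `threshold`.

    Returns (predicted_class, number_of_stages_executed).
    The confidence-threshold rule is an assumed exit criterion.
    """
    if not stages:
        raise ValueError("at least one stage is required")
    features = x
    for depth, (block, exit_head) in enumerate(stages, start=1):
        features = block(features)             # backbone computation
        probs = softmax(exit_head(features))   # cheap intermediate classifier
        if max(probs) >= threshold:
            break                              # confident enough: skip the rest
    return probs.index(max(probs)), depth


# Toy stand-in for a real model: each block doubles a scalar feature and
# each exit head maps it to two logits. Purely illustrative.
stages = [(lambda f: 2.0 * f, lambda f: [f, -f]) for _ in range(3)]

easy_class, easy_depth = early_exit_predict(1.0, stages)  # large margin
hard_class, hard_depth = early_exit_predict(0.2, stages)  # small margin
```

In this toy setting the input with the large margin leaves at the first exit, while the input with the small margin runs through all three stages, which is the behavior static techniques such as pruning or quantization cannot provide.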