Balancing Specialization, Generalization, and Compression for Detection and Tracking
This work addresses the challenge of adapting deep learning models to specific domains without overfitting or losing general capabilities, which is important for improving efficiency and performance in computer vision tasks.
The paper tackles the problem of specializing deep detectors and trackers for restricted settings while balancing accuracy, generalization, and compression, resulting in improved detection on VIRAT and CAVIAR datasets with unprecedented compression rates and enhanced tracking performance on the OTB2015 benchmark.
We propose a method for specializing deep detectors and trackers to restricted settings. Our approach is designed with the following goals in mind: (a) improving accuracy in restricted domains; (b) preventing overfitting to new domains and forgetting of generalized capabilities; (c) aggressive model compression and acceleration. To this end, we propose a novel loss that balances compression and acceleration of a deep learning model against loss of generalization capabilities. We apply our method to existing tracker and detector models. We report detection results on the VIRAT and CAVIAR datasets. These results show that our method offers unprecedented compression rates along with improved detection. For tracking, we apply our loss to compress the tracker at test time, as it processes each video. Our tests on the OTB2015 benchmark show that applying compression at test time actually improves tracking performance.
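The abstract does not give the form of the proposed loss. As a rough illustration only, the sketch below shows one common way such a trade-off could be written: a task loss on the restricted domain, plus a term penalizing drift from the original (general) model's predictions, plus a sparsity penalty that encourages compression. The function names, the KL-divergence forgetting term, the L1 sparsity term, and the weights `lam_gen` and `lam_comp` are all assumptions made for this example and are not taken from the paper.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def balanced_loss(task_loss, student_logits, teacher_logits, weights,
                  lam_gen=1.0, lam_comp=1e-3):
    """Hypothetical three-term loss (an assumption, not the paper's definition).

    task_loss:       loss of the specialized model on the restricted domain
    student_logits:  outputs of the specialized (compressed) model
    teacher_logits:  outputs of the original general model on the same input
    weights:         flat list of model weights subject to compression
    """
    # Forgetting term: how far the specialized model drifts from the general one.
    forgetting = kl_divergence(softmax(teacher_logits), softmax(student_logits))
    # Compression term: L1 norm as a simple proxy that pushes weights toward zero.
    sparsity = sum(abs(w) for w in weights)
    return task_loss + lam_gen * forgetting + lam_comp * sparsity
```

In this sketch, raising `lam_comp` favors smaller models at the cost of accuracy, while raising `lam_gen` keeps the specialized model closer to the general one.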