Embedded Implementation of a Deep Learning Smile Detector
This work addresses efficient smile detection for embedded systems, but its contribution is incremental: it compares existing architectures rather than introducing new methods.
The paper tackles real-time deployment of deep learning for smile detection in low-resource environments, showing that low-complexity neural network architectures achieve performance nearly equal to that of larger ones with significantly less computation.
In this paper, we study the real-time deployment of deep learning algorithms in low-resource computational environments. As a use case, we compare the accuracy and speed of different neural network architectures for smile detection, together with their system-level implementation on the NVIDIA Jetson embedded platform. We also propose an asynchronous multithreading scheme for parallelizing the pipeline. Within this framework, we experimentally compare thirteen widely used network topologies. The experiments show that low-complexity architectures can achieve performance almost equal to that of larger ones at a fraction of the computation.
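The abstract does not describe how the asynchronous multithreading scheme works. As a rough illustration only, one common way to parallelize such a pipeline is to run each stage in its own thread and connect the stages with bounded queues, so that a slow stage does not stall the others more than necessary. The sketch below assumes that structure. The names `stage` and `run_pipeline`, the `maxsize` parameter, and the toy stage functions are hypothetical and chosen for this example; they are not the authors' implementation.

```python
import queue
import threading

_STOP = object()  # sentinel marking the end of the frame stream


def stage(work, inbox, outbox):
    """Apply `work` to each item read from `inbox` and pass the result to `outbox`."""
    while True:
        item = inbox.get()
        if item is _STOP:
            outbox.put(_STOP)  # propagate shutdown to the next stage
            return
        outbox.put(work(item))


def run_pipeline(frames, stages, maxsize=4):
    """Run every function in `stages` in its own thread, linked by bounded queues.

    Assumption: each stage is a plain callable taking one item and returning one
    item (for example frame -> face crop -> smile score). Output order matches
    input order because every stage is a single FIFO worker.
    """
    queues = [queue.Queue(maxsize=maxsize) for _ in range(len(stages) + 1)]
    workers = [
        threading.Thread(target=stage, args=(fn, queues[i], queues[i + 1]), daemon=True)
        for i, fn in enumerate(stages)
    ]

    def feed():
        # Runs in its own thread so the caller can drain results concurrently.
        for frame in frames:
            queues[0].put(frame)
        queues[0].put(_STOP)

    feeder = threading.Thread(target=feed, daemon=True)
    for t in workers + [feeder]:
        t.start()

    results = []
    while True:
        out = queues[-1].get()
        if out is _STOP:
            break
        results.append(out)

    for t in workers + [feeder]:
        t.join()
    return results


if __name__ == "__main__":
    # Toy stand-ins for real stages such as face detection and smile classification.
    print(run_pipeline(range(5), [lambda x: x + 1, lambda x: x * 2]))  # prints [2, 4, 6, 8, 10]
```

The bounded queues provide back-pressure: a fast capture stage blocks instead of buffering an unbounded number of frames, which matters on a memory-constrained device. Python threads only overlap well when the stage work releases the interpreter lock (I/O or native inference calls), so this is a structural sketch and not a performance claim.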