NECVIVSep 8, 2019

A Resource-Efficient Embedded Iris Recognition System Using Fully Convolutional Networks

arXiv:1909.03385v111 citations
AI Analysis

This work addresses the challenge of deploying accurate iris recognition on resource-constrained mobile and embedded devices, representing an incremental improvement with specific hardware optimizations.

The paper tackles the problem of computationally demanding Fully Convolutional Networks (FCN) for iris recognition in embedded systems by proposing a resource-efficient, end-to-end flow with a three-step SW/HW co-design methodology, achieving a 50X reduction in FLOPs per inference and up to 8.3X speedup on an FPGA platform.

Applications of Fully Convolutional Networks (FCN) in iris segmentation have shown promising advances. For mobile and embedded systems, a significant challenge is that the proposed FCN architectures are extremely computationally demanding. In this article, we propose a resource-efficient, end-to-end iris recognition flow, which consists of FCN-based segmentation, contour fitting, followed by Daugman normalization and encoding. To attain accurate and efficient FCN models, we propose a three-step SW/HW co-design methodology consisting of FCN architectural exploration, precision quantization, and hardware acceleration. In our exploration, we propose multiple FCN models, and in comparison to previous works, our best-performing model requires 50X less FLOPs per inference while achieving a new state-of-the-art segmentation accuracy. Next, we select the most efficient set of models and further reduce their computational complexity through weights and activations quantization using 8-bit dynamic fixed-point (DFP) format. Each model is then incorporated into an end-to-end flow for true recognition performance evaluation. A few of our end-to-end pipelines outperform the previous state-of-the-art on two datasets evaluated. Finally, we propose a novel DFP accelerator and fully demonstrate the SW/HW co-design realization of our flow on an embedded FPGA platform. In comparison with the embedded CPU, our hardware acceleration achieves up to 8.3X speedup for the overall pipeline while using less than 15% of the available FPGA resources. We also provide comparisons between the FPGA system and an embedded GPU showing different benefits and drawbacks for the two platforms.

Code Implementations1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes