CVFeb 17

LEADER: Lightweight End-to-End Attention-Gated Dual Autoencoder for Robust Minutiae Extraction

arXiv:2602.15493v1h-index: 40
Originality Highly original
AI Analysis

This work addresses the need for efficient and accurate fingerprint recognition by providing a fully end-to-end solution that eliminates separate preprocessing and postprocessing steps, with incremental improvements in performance and computational efficiency.

The paper tackles the problem of end-to-end minutiae extraction from raw fingerprint images by introducing LEADER, a lightweight neural network that achieves state-of-the-art accuracy with a 34% higher F1-score on the NIST SD27 dataset and robust cross-domain generalization.

Minutiae extraction, a fundamental stage in fingerprint recognition, is increasingly shifting toward deep learning. However, truly end-to-end methods that eliminate separate preprocessing and postprocessing steps remain scarce. This paper introduces LEADER (Lightweight End-to-end Attention-gated Dual autoencodER), a neural network that maps raw fingerprint images to minutiae descriptors, including location, direction, and type. The proposed architecture integrates non-maximum suppression and angular decoding to enable complete end-to-end inference using only 0.9M parameters. It employs a novel "Castle-Moat-Rampart" ground-truth encoding and a dual-autoencoder structure, interconnected through an attention-gating mechanism. Experimental evaluations demonstrate state-of-the-art accuracy on plain fingerprints and robust cross-domain generalization to latent impressions. Specifically, LEADER attains a 34% higher F1-score on the NIST SD27 dataset compared to specialized latent minutiae extractors. Sample-level analysis on this challenging benchmark reveals an average rank of 2.07 among all compared methods, with LEADER securing the first-place position in 47% of the samples-more than doubling the frequency of the second-best extractor. The internal representations learned by the model align with established fingerprint domain features, such as segmentation masks, orientation fields, frequency maps, and skeletons. Inference requires 15ms on GPU and 322ms on CPU, outperforming leading commercial software in computational efficiency. The source code and pre-trained weights are publicly released to facilitate reproducibility.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes