CV | Jun 4, 2025

FingerVeinSyn-5M: A Million-Scale Dataset and Benchmark for Finger Vein Recognition

arXiv:2506.03635v1 | 1 citation | h-index: 45 | Has Code | MM
Originality: Incremental advance
AI Analysis

This work addresses a data scarcity problem for researchers and practitioners in biometrics and security, enabling more effective deep learning-based finger vein recognition. The advance is incremental, since it builds on existing synthetic generation methods.

The paper tackles the lack of large-scale public datasets for finger vein recognition by introducing FingerVeinSyn-5M, a synthetic dataset with 5 million samples from 50,000 unique fingers, which enables models pretrained on it and fine-tuned with minimal real data to achieve an average 53.91% performance gain across benchmarks.

A major challenge in finger vein recognition is the lack of large-scale public datasets. Existing datasets contain few identities and limited samples per finger, restricting the advancement of deep learning-based methods. To address this, we introduce FVeinSyn, a synthetic generator capable of producing diverse finger vein patterns with rich intra-class variations. Using FVeinSyn, we created FingerVeinSyn-5M -- the largest available finger vein dataset -- containing 5 million samples from 50,000 unique fingers, each with 100 variations including shift, rotation, scale, roll, varying exposure levels, skin scattering blur, optical blur, and motion blur. FingerVeinSyn-5M is also the first to offer fully annotated finger vein images, supporting deep learning applications in this field. Models pretrained on FingerVeinSyn-5M and fine-tuned with minimal real data achieve an average 53.91% performance gain across multiple benchmarks. The dataset is publicly available at: https://github.com/EvanWang98/FingerVeinSyn-5M.
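
To make the dataset layout in the abstract concrete, the following is a minimal sketch of how 100 intra-class variations per finger could be parameterized. It is not the FVeinSyn generator: the `Variation` fields mirror the variation types the abstract lists, but every parameter range, the `sample_variation` and `variations_for_finger` helpers, and the seeding scheme are illustrative assumptions.

```python
import random
from dataclasses import dataclass
from typing import List

NUM_FINGERS = 50_000        # unique fingers, as stated in the abstract
SAMPLES_PER_FINGER = 100    # variations per finger, as stated in the abstract


@dataclass(frozen=True)
class Variation:
    """One set of intra-class variation parameters (all ranges are assumed)."""
    shift_x_px: int
    shift_y_px: int
    rotation_deg: float
    scale: float
    roll_deg: float
    exposure_gain: float
    scatter_blur_sigma: float
    optical_blur_sigma: float
    motion_blur_len_px: int


def sample_variation(rng: random.Random) -> Variation:
    # Hypothetical ranges, chosen only for illustration.
    return Variation(
        shift_x_px=rng.randint(-10, 10),
        shift_y_px=rng.randint(-10, 10),
        rotation_deg=rng.uniform(-5.0, 5.0),
        scale=rng.uniform(0.9, 1.1),
        roll_deg=rng.uniform(-15.0, 15.0),
        exposure_gain=rng.uniform(0.7, 1.3),
        scatter_blur_sigma=rng.uniform(0.5, 2.0),
        optical_blur_sigma=rng.uniform(0.0, 1.5),
        motion_blur_len_px=rng.randint(0, 7),
    )


def variations_for_finger(finger_id: int,
                          n: int = SAMPLES_PER_FINGER,
                          seed: int = 0) -> List[Variation]:
    # Seed per finger so the same finger always yields the same variations.
    rng = random.Random(seed * 1_000_003 + finger_id)
    return [sample_variation(rng) for _ in range(n)]


# 50,000 fingers x 100 variations each gives the 5 million samples in the name.
total_samples = NUM_FINGERS * SAMPLES_PER_FINGER
```

In a full pipeline, each `Variation` would be applied to a rendered vein pattern for the given finger; that rendering step is outside the scope of this sketch.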

Code Implementations: 1 repo