Anil K. Jain

CV
h-index15
97papers
7,641citations
Novelty48%
AI Score59

97 Papers

CVApr 3, 2022Code
AdaFace: Quality Adaptive Margin for Face Recognition

Minchul Kim, Anil K. Jain, Xiaoming Liu

Recognition in low quality face datasets is challenging because facial attributes are obscured and degraded. Advances in margin-based loss functions have resulted in enhanced discriminability of faces in the embedding space. Further, previous studies have studied the effect of adaptive losses to assign more importance to misclassified (hard) examples. In this work, we introduce another aspect of adaptiveness in the loss function, namely the image quality. We argue that the strategy to emphasize misclassified samples should be adjusted according to their image quality. Specifically, the relative importance of easy or hard samples should be based on the sample's image quality. We propose a new loss function that emphasizes samples of different difficulties based on their image quality. Our method achieves this in the form of an adaptive margin function by approximating the image quality with feature norms. Extensive experiments show that our method, AdaFace, improves the face recognition performance over the state-of-the-art (SoTA) on four datasets (IJB-B, IJB-C, IJB-S and TinyFace). Code and models are released in https://github.com/mk-minchul/AdaFace.

CVOct 25, 2022
Minutiae-Guided Fingerprint Embeddings via Vision Transformers

Steven A. Grosz, Joshua J. Engelsma, Rajeev Ranjan et al.

Minutiae matching has long dominated the field of fingerprint recognition. However, deep networks can be used to extract fixed-length embeddings from fingerprints. To date, the few studies that have explored the use of CNN architectures to extract such embeddings have shown extreme promise. Inspired by these early works, we propose the first use of a Vision Transformer (ViT) to learn a discriminative fixed-length fingerprint embedding. We further demonstrate that by guiding the ViT to focus in on local, minutiae related features, we can boost the recognition performance. Finally, we show that by fusing embeddings learned by CNNs and ViTs we can reach near parity with a commercial state-of-the-art (SOTA) matcher. In particular, we obtain a TAR=94.23% @ FAR=0.1% on the NIST SD 302 public-domain dataset, compared to a SOTA commercial matcher which obtains TAR=96.71% @ FAR=0.1%. Additionally, our fixed-length embeddings can be matched orders of magnitude faster than the commercial system (2.5 million matches/second compared to 50K matches/second). We make our code and models publicly available to encourage further research on this topic: https://github.com/tba.

CVMay 19, 2022
On Demographic Bias in Fingerprint Recognition

Akash Godbole, Steven A. Grosz, Karthik Nandakumar et al.

Fingerprint recognition systems have been deployed globally in numerous applications including personal devices, forensics, law enforcement, banking, and national identity systems. For these systems to be socially acceptable and trustworthy, it is critical that they perform equally well across different demographic groups. In this work, we propose a formal statistical framework to test for the existence of bias (demographic differentials) in fingerprint recognition across four major demographic groups (white male, white female, black male, and black female) for two state-of-the-art (SOTA) fingerprint matchers operating in verification and identification modes. Experiments on two different fingerprint databases (with 15,468 and 1,014 subjects) show that demographic differentials in SOTA fingerprint recognition systems decrease as the matcher accuracy increases and any small bias that may be evident is likely due to certain outlier, low-quality fingerprint images.

CVMay 8, 2022
Fingerprint Template Invertibility: Minutiae vs. Deep Templates

Kanishka P. Wijewardena, Steven A. Grosz, Kai Cao et al.

Much of the success of fingerprint recognition is attributed to minutiae-based fingerprint representation. It was believed that minutiae templates could not be inverted to obtain a high fidelity fingerprint image, but this assumption has been shown to be false. The success of deep learning has resulted in alternative fingerprint representations (embeddings), in the hope that they might offer better recognition accuracy as well as non-invertibility of deep network-based templates. We evaluate whether deep fingerprint templates suffer from the same reconstruction attacks as the minutiae templates. We show that while a deep template can be inverted to produce a fingerprint image that could be matched to its source image, deep templates are more resistant to reconstruction attacks than minutiae templates. In particular, reconstructed fingerprint images from minutiae templates yield a TAR of about 100.0% (98.3%) @ FAR of 0.01% for type-I (type-II) attacks using a state-of-the-art commercial fingerprint matcher, when tested on NIST SD4. The corresponding attack performance for reconstructed fingerprint images from deep templates using the same commercial matcher yields a TAR of less than 1% for both type-I and type-II attacks; however, when the reconstructed images are matched using the same deep network, they achieve a TAR of 85.95% (68.10%) for type-I (type-II) attacks. Furthermore, what is missing from previous fingerprint template inversion studies is an evaluation of the black-box attack performance, which we perform using 3 different state-of-the-art fingerprint matchers. We conclude that fingerprint images generated by inverting minutiae templates are highly susceptible to both white-box and black-box attack evaluations, while fingerprint images generated by deep templates are resistant to black-box evaluations and comparatively less susceptible to white-box evaluations.

CVSep 2, 2022
Learning an Ensemble of Deep Fingerprint Representations

Akash Godbole, Karthik Nandakumar, Anil K. Jain

Deep neural networks (DNNs) have shown incredible promise in learning fixed-length representations from fingerprints. Since the representation learning is often focused on capturing specific prior knowledge (e.g., minutiae), there is no universal representation that comprehensively encapsulates all the discriminatory information available in a fingerprint. While learning an ensemble of representations can mitigate this problem, two critical challenges need to be addressed: (i) How to extract multiple diverse representations from the same fingerprint image? and (ii) How to optimally exploit these representations during the matching process? In this work, we train multiple instances of DeepPrint (a state-of-the-art DNN-based fingerprint encoder) on different transformations of the input image to generate an ensemble of fingerprint embeddings. We also propose a feature fusion technique that distills these multiple representations into a single embedding, which faithfully captures the diversity present in the ensemble without increasing the computational complexity. The proposed approach has been comprehensively evaluated on five databases containing rolled, plain, and latent fingerprints (NIST SD4, NIST SD14, NIST SD27, NIST SD302, and FVC2004 DB2A) and statistically significant improvements in accuracy have been consistently demonstrated across a range of verification as well as closed- and open-set identification settings. The proposed approach serves as a wrapper capable of improving the accuracy of any DNN-based recognition system.

CVNov 25, 2022
AFR-Net: Attention-Driven Fingerprint Recognition Network

Steven A. Grosz, Anil K. Jain

The use of vision transformers (ViT) in computer vision is increasing due to limited inductive biases (e.g., locality, weight sharing, etc.) and increased scalability compared to other deep learning methods. This has led to some initial studies on the use of ViT for biometric recognition, including fingerprint recognition. In this work, we improve on these initial studies for transformers in fingerprint recognition by i.) evaluating additional attention-based architectures, ii.) scaling to larger and more diverse training and evaluation datasets, and iii.) combining the complimentary representations of attention-based and CNN-based embeddings for improved state-of-the-art (SOTA) fingerprint recognition (both authentication and identification). Our combined architecture, AFR-Net (Attention-Driven Fingerprint Recognition Network), outperforms several baseline transformer and CNN-based models, including a SOTA commercial fingerprint system, Verifinger v12.3, across intra-sensor, cross-sensor, and latent to rolled fingerprint matching datasets. Additionally, we propose a realignment strategy using local embeddings extracted from intermediate feature maps within the networks to refine the global embeddings in low certainty situations, which boosts the overall recognition accuracy significantly across each of the models. This realignment strategy requires no additional training and can be applied as a wrapper to any existing deep learning network (including attention-based, CNN-based, or both) to boost its performance.

CVApr 11
Intra-finger Variability of Diffusion-based Latent Fingerprint Generation

Noor Hussein, Anil K. Jain, Karthik Nandakumar

The primary goal of this work is to systematically evaluate the intra-finger variability of synthetic fingerprints (particularly latent prints) generated using a state-of-the-art diffusion model. Specifically, we focus on enhancing the latent style diversity of the generative model by constructing a comprehensive \textit{latent style bank} curated from seven diverse datasets, which enables the precise synthesis of latent prints with over 40 distinct styles encapsulating different surfaces and processing techniques. We also implement a semi-automated framework to understand the integrity of fingerprint ridges and minutiae in the generated impressions. Our analysis indicates that though the generation process largely preserves the identity, a small number of local inconsistencies (addition and removal of minutiae) are introduced, especially when there are poor quality regions in the reference image. Furthermore, mismatch between the reference image and the chosen style embedding that guides the generation process introduces global inconsistencies in the form of hallucinated ridge patterns. These insights highlight the limitations of existing synthetic fingerprint generators and the need to further improve these models to simultaneously enhance both diversity and identity consistency.

CVAug 2, 2024
CLIP4Sketch: Enhancing Sketch to Mugshot Matching through Dataset Augmentation using Diffusion Models

Kushal Kumar Jain, Steve Grosz, Anoop M. Namboodiri et al.

Forensic sketch-to-mugshot matching is a challenging task in face recognition, primarily hindered by the scarcity of annotated forensic sketches and the modality gap between sketches and photographs. To address this, we propose CLIP4Sketch, a novel approach that leverages diffusion models to generate a large and diverse set of sketch images, which helps in enhancing the performance of face recognition systems in sketch-to-mugshot matching. Our method utilizes Denoising Diffusion Probabilistic Models (DDPMs) to generate sketches with explicit control over identity and style. We combine CLIP and Adaface embeddings of a reference mugshot, along with textual descriptions of style, as the conditions to the diffusion model. We demonstrate the efficacy of our approach by generating a comprehensive dataset of sketches corresponding to mugshots and training a face recognition model on our synthetic data. Our results show significant improvements in sketch-to-mugshot matching accuracy over training on an existing, limited amount of real face sketch data, validating the potential of diffusion models in enhancing the performance of face recognition systems across modalities. We also compare our dataset with datasets generated using GAN-based methods to show its superiority.

CVApr 26, 2023
Latent Fingerprint Recognition: Fusion of Local and Global Embeddings

Steven A. Grosz, Anil K. Jain

One of the most challenging problems in fingerprint recognition continues to be establishing the identity of a suspect associated with partial and smudgy fingerprints left at a crime scene (i.e., latent prints or fingermarks). Despite the success of fixed-length embeddings for rolled and slap fingerprint recognition, the features learned for latent fingerprint matching have mostly been limited to local minutiae-based embeddings and have not directly leveraged global representations for matching. In this paper, we combine global embeddings with local embeddings for state-of-the-art latent to rolled matching accuracy with high throughput. The combination of both local and global representations leads to improved recognition accuracy across NIST SD 27, NIST SD 302, MSP, MOLF DB1/DB4, and MOLF DB2/DB4 latent fingerprint datasets for both closed-set (84.11%, 54.36%, 84.35%, 70.43%, 62.86% rank-1 retrieval rate, respectively) and open-set (0.50, 0.74, 0.44, 0.60, 0.68 FNIR at FPIR=0.02, respectively) identification scenarios on a gallery of 100K rolled fingerprints. Not only do we fuse the complimentary representations, we also use the local features to guide the global representations to focus on discriminatory regions in two fingerprint images to be compared. This leads to a multi-stage matching paradigm in which subsets of the retrieved candidate lists for each probe image are passed to subsequent stages for further processing, resulting in a considerable reduction in latency (requiring just 0.068 ms per latent to rolled comparison on a AMD EPYC 7543 32-Core Processor, roughly 15K comparisons per second). Finally, we show the generalizability of the fused representations for improving authentication accuracy across several rolled, plain, and contactless fingerprint datasets.

CVApr 13, 2022
SpoofGAN: Synthetic Fingerprint Spoof Images

Steven A. Grosz, Anil K. Jain

A major limitation to advances in fingerprint spoof detection is the lack of publicly available, large-scale fingerprint spoof datasets, a problem which has been compounded by increased concerns surrounding privacy and security of biometric data. Furthermore, most state-of-the-art spoof detection algorithms rely on deep networks which perform best in the presence of a large amount of training data. This work aims to demonstrate the utility of synthetic (both live and spoof) fingerprints in supplying these algorithms with sufficient data to improve the performance of fingerprint spoof detection algorithms beyond the capabilities when training on a limited amount of publicly available real datasets. First, we provide details of our approach in modifying a state-of-the-art generative architecture to synthesize high quality live and spoof fingerprints. Then, we provide quantitative and qualitative analysis to verify the quality of our synthetic fingerprints in mimicking the distribution of real data samples. We showcase the utility of our synthetic live and spoof fingerprints in training a deep network for fingerprint spoof detection, which dramatically boosts the performance across three different evaluation datasets compared to an identical model trained on real data alone. Finally, we demonstrate that only 25% of the original (real) dataset is required to obtain similar detection performance when augmenting the training dataset with synthetic data.

CVMay 23
CoDA: Color Distribution Probing for Efficient and Generalizable AI-Generated Image Detection

Zexi Jia, Zhiqiang Yuan, Xiaoyue Duan et al.

AI-generated image detection faces a persistent trade-off between generalization and efficiency: lightweight artifact-based methods often degrade on unseen generators or domains, whereas more robust large-scale models are computationally expensive. Meanwhile, existing benchmarks mainly focus on cross-model evaluation in photorealistic settings, leaving cross-domain robustness underexplored. To address this gap, we introduce FakeForm, a large-scale benchmark with approximately 370,000 images across 62 diverse domains for both cross-model and cross-domain evaluation. Motivated by this broader setting, we revisit color-distribution probing as an efficient complementary cue for AI-generated image detection. We observe that, especially for photographic content, real photographs tend to exhibit smoother and more stable color patterns, whereas synthetic images often show characteristic color imbalances introduced by neural generation. Based on this observation, we propose CoDA, a compact 1.48M-parameter detector built on a Noise-Quantization Probe, together with a theoretical analysis linking probe responses to color non-uniformity. Experiments show that CoDA achieves state-of-the-art performance on standard benchmarks and the best results on the challenging cross-domain evaluation of FakeForm, while remaining highly competitive in cross-model photorealistic settings. These results suggest that persistent generative artifacts can provide a practical foundation for efficient and robust AI-generated image detection. The models and FakeForm benchmark will be made publicly available.

CVMay 8Code
Towards Billion-scale Multi-modal Biometric Search

Arka Koner, Chetan S. Naik, Lokesh Kurre et al.

Searching a multi-biometric database of a billion records for a country-level identity system requires pushing the limits of all aspects of a biometric system, including acquisition, preprocessing, feature extraction, accuracy, matching speed, presentation attack detection, and handling of special cases (e.g., missing finger digits). This is the first paper that gives insights into such a large-scale multimodal biometric search system, called Bharat ABIS, based on open-source architectures. The end-to-end pipeline of Bharat ABIS processes fingerprint, face and iris modalities through modality-specific stages of preprocessing (segmentation), quality assessment, presentation attack detection, and learning an embedding (feature extraction), producing a concatenated template of 13.5KB per person. We present a detailed analysis of the modalities and how they are integrated to create an efficient and effective solution for 1:N search (de-duplication). Evaluations on a demographically stratified gallery of 220 million identities, randomly sampled from 1.55 billion records in India's Aadhaar database, yield an FNIR of 0.3% at an FPIR of 0.5%, for adult probes (over 18 years). We also compare the performance of Bharat ABIS against three state-of-the-art COTS systems on a 20M gallery. Our system achieves a throughput of 100 searches per second on a gallery of 40M on a single server (8xNvidia H100 GPUs, 2TB RAM).

CVAug 29, 2022
Synthetic Latent Fingerprint Generator

Andre Brasil Vieira Wyzykowski, Anil K. Jain

Given a full fingerprint image (rolled or slap), we present CycleGAN models to generate multiple latent impressions of the same identity as the full print. Our models can control the degree of distortion, noise, blurriness and occlusion in the generated latent print images to obtain Good, Bad and Ugly latent image categories as introduced in the NIST SD27 latent database. The contributions of our work are twofold: (i) demonstrate the similarity of synthetically generated latent fingerprint images to crime scene latents in NIST SD27 and MSP databases as evaluated by the NIST NFIQ 2 quality measure and ROC curves obtained by a SOTA fingerprint matcher, and (ii) use of synthetic latents to augment small-size latent training databases in the public domain to improve the performance of DeepPrint, a SOTA fingerprint matcher designed for rolled to rolled fingerprint matching on three latent databases (NIST SD27, NIST SD302, and IIITD-SLF). As an example, with synthetic latent data augmentation, the Rank-1 retrieval performance of DeepPrint is improved from 15.50% to 29.07% on challenging NIST SD27 latent database. Our approach for generating synthetic latent fingerprints can be used to improve the recognition performance of any latent matcher and its individual components (e.g., enhancement, segmentation and feature extraction).

CVNov 20, 2023
AdvGen: Physical Adversarial Attack on Face Presentation Attack Detection Systems

Sai Amrit Patnaik, Shivali Chansoriya, Anil K. Jain et al.

Evaluating the risk level of adversarial images is essential for safely deploying face authentication models in the real world. Popular approaches for physical-world attacks, such as print or replay attacks, suffer from some limitations, like including physical and geometrical artifacts. Recently, adversarial attacks have gained attraction, which try to digitally deceive the learning strategy of a recognition system using slight modifications to the captured image. While most previous research assumes that the adversarial image could be digitally fed into the authentication systems, this is not always the case for systems deployed in the real world. This paper demonstrates the vulnerability of face authentication systems to adversarial images in physical world scenarios. We propose AdvGen, an automated Generative Adversarial Network, to simulate print and replay attacks and generate adversarial images that can fool state-of-the-art PADs in a physical domain attack setting. Using this attack strategy, the attack success rate goes up to 82.01%. We test AdvGen extensively on four datasets and ten state-of-the-art PADs. We also demonstrate the effectiveness of our attack by conducting experiments in a realistic, physical environment.

CVJun 1, 2023
Accelerated Fingerprint Enhancement: A GPU-Optimized Mixed Architecture Approach

André Brasil Vieira Wyzykowski, Anil K. Jain

This document presents a preliminary approach to latent fingerprint enhancement, fundamentally designed around a mixed Unet architecture. It combines the capabilities of the Resnet-101 network and Unet encoder, aiming to form a potentially powerful composite. This combination, enhanced with attention mechanisms and forward skip connections, is intended to optimize the enhancement of ridge and minutiae features in fingerprints. One innovative element of this approach includes a novel Fingerprint Enhancement Gabor layer, specifically designed for GPU computations. This illustrates how modern computational resources might be harnessed to expedite enhancement. Given its potential functionality as either a CNN or Transformer layer, this Gabor layer could offer improved agility and processing speed to the system. However, it is important to note that this approach is still in the early stages of development and has not yet been fully validated through rigorous experiments. As such, it may require additional time and testing to establish its robustness and usability in the field of latent fingerprint enhancement. This includes improvements in processing speed, enhancement adaptability with distinct latent fingerprint types, and full validation in experimental approaches such as open-set (identification 1:N) and open-set validation, fingerprint quality evaluation, among others.

CVMar 15
Continual Few-shot Adaptation for Synthetic Fingerprint Detection

Joseph Geo Benjamin, Anil K. Jain, Karthik Nandakumar

The quality and realism of synthetically generated fingerprint images have increased significantly over the past decade fueled by advancements in generative artificial intelligence (GenAI). This has exacerbated the vulnerability of fingerprint recognition systems to data injection attacks, where synthetic fingerprints are maliciously inserted during enrollment or authentication. Hence, there is an urgent need for methods to detect if a fingerprint image is real or synthetic. While it is straightforward to train deep neural network (DNN) models to classify images as real or synthetic, often such DNN models overfit the training data and fail to generalize well when applied to synthetic fingerprints generated using unseen GenAI models. In this work, we formulate synthetic fingerprint detection as a continual few-shot adaptation problem, where the objective is to rapidly evolve a base detector to identify new types of synthetic data. To enable continual few-shot adaptation, we employ a combination of binary cross-entropy and supervised contrastive (applied to the feature representation) losses and replay a few samples from previously known styles during fine-tuning to mitigate catastrophic forgetting. Experiments based on several DNN backbones (as feature extractors) and a variety of real and synthetic fingerprint datasets indicate that the proposed approach achieves a good trade-off between fast adaptation for detecting unseen synthetic styles and forgetting of known styles.

CVDec 14, 2022
Child PalmID: Contactless Palmprint Recognition

Anil K. Jain, Akash Godbole, Anjoo Bhatnagar et al.

Developing and least developed countries face the dire challenge of ensuring that each child in their country receives required doses of vaccination, adequate nutrition and proper medication. International agencies such as UNICEF, WHO and WFP, among other organizations, strive to find innovative solutions to determine which child has received the benefits and which have not. Biometric recognition systems have been sought out to help solve this problem. To that end, this report establishes a baseline accuracy of a commercial contactless palmprint recognition system that may be deployed for recognizing children in the age group of one to five years old. On a database of contactless palmprint images of one thousand unique palms from 500 children, we establish SOTA authentication accuracy of 90.85% @ FAR of 0.01%, rank-1 identification accuracy of 99.0% (closed set), and FPIR=0.01 @ FNIR=0.3 for open-set identification using PalmMobile SDK from Armatura.

CVMar 27, 2020Code
HERS: Homomorphically Encrypted Representation Search

Joshua J. Engelsma, Anil K. Jain, Vishnu Naresh Boddeti

We present a method to search for a probe (or query) image representation against a large gallery in the encrypted domain. We require that the probe and gallery images be represented in terms of a fixed-length representation, which is typical for representations obtained from learned networks. Our encryption scheme is agnostic to how the fixed-length representation is obtained and can therefore be applied to any fixed-length representation in any application domain. Our method, dubbed HERS (Homomorphically Encrypted Representation Search), operates by (i) compressing the representation towards its estimated intrinsic dimensionality with minimal loss of accuracy (ii) encrypting the compressed representation using the proposed fully homomorphic encryption scheme, and (iii) efficiently searching against a gallery of encrypted representations directly in the encrypted domain, without decrypting them. Numerical results on large galleries of face, fingerprint, and object datasets such as ImageNet show that, for the first time, accurate and fast image search within the encrypted domain is feasible at scale (500 seconds; $275\times$ speed up over state-of-the-art for encrypted search against a gallery of 100 million). Code is available at https://github.com/human-analysis/hers-encrypted-image-search

CVSep 2, 2019Code
White-Box Evaluation of Fingerprint Matchers: Robustness to Minutiae Perturbations

Steven A. Grosz, Joshua J. Engelsma, Nicholas G. Paulter et al.

Prevailing evaluations of fingerprint recognition systems have been performed as end-to-end black-box tests of fingerprint identification or authentication accuracy. However, performance of the end-to-end system is subject to errors arising in any of its constituent modules, including: fingerprint scanning, preprocessing, feature extraction, and matching. Conversely, white-box evaluations provide a more granular evaluation by studying the individual sub-components of a system. While a few studies have conducted stand-alone evaluations of the fingerprint reader and feature extraction modules of fingerprint recognition systems, little work has been devoted towards white-box evaluations of the fingerprint matching module. We report results of a controlled, white-box evaluation of one open-source and two commercial-off-the-shelf (COTS) minutiae-based matchers in terms of their robustness against controlled perturbations (random noise and non-linear distortions) introduced into the input minutiae feature sets. Our white-box evaluations reveal that the performance of fingerprint minutiae matchers are more susceptible to non-linear distortion and missing minutiae than spurious minutiae and small positional displacements of the minutiae locations.

CVJan 13, 2019Code
Generalizing Fingerprint Spoof Detector: Learning a One-Class Classifier

Joshua J. Engelsma, Anil K. Jain

Prevailing fingerprint recognition systems are vulnerable to spoof attacks. To mitigate these attacks, automated spoof detectors are trained to distinguish a set of live or bona fide fingerprints from a set of known spoof fingerprints. Despite their success, spoof detectors remain vulnerable when exposed to attacks from spoofs made with materials not seen during training of the detector. To alleviate this shortcoming, we approach spoof detection as a one-class classification problem. The goal is to train a spoof detector on only the live fingerprints such that once the concept of "live" has been learned, spoofs of any material can be rejected. We accomplish this through training multiple generative adversarial networks (GANS) on live fingerprint images acquired with the open source, dual-camera, 1900 ppi RaspiReader fingerprint reader. Our experimental results, conducted on 5.5K spoof images (from 12 materials) and 11.8K live images show that the proposed approach improves the cross-material spoof detection performance over state-of-the-art one-class and binary class spoof detectors on 11 of 12 testing materials and 7 of 12 testing materials, respectively.

CVMay 2, 2018Code
Altered Fingerprints: Detection and Localization

Elham Tabassi, Tarang Chugh, Debayan Deb et al.

Fingerprint alteration, also referred to as obfuscation presentation attack, is to intentionally tamper or damage the real friction ridge patterns to avoid identification by an AFIS. This paper proposes a method for detection and localization of fingerprint alterations. Our main contributions are: (i) design and train CNN models on fingerprint images and minutiae-centered local patches in the image to detect and localize regions of fingerprint alterations, and (ii) train a Generative Adversarial Network (GAN) to synthesize altered fingerprints whose characteristics are similar to true altered fingerprints. A successfully trained GAN can alleviate the limited availability of altered fingerprint images for research. A database of 4,815 altered fingerprints from 270 subjects, and an equal number of rolled fingerprint images are used to train and test our models. The proposed approach achieves a True Detection Rate (TDR) of 99.24% at a False Detection Rate (FDR) of 2%, outperforming published results. The synthetically generated altered fingerprint dataset will be open-sourced.

CVApr 23, 2018Code
Fingerprint Match in Box

Joshua J. Engelsma, Kai Cao, Anil K. Jain

We open source fingerprint Match in Box, a complete end-to-end fingerprint recognition system embedded within a 4 inch cube. Match in Box stands in contrast to a typical bulky and expensive proprietary fingerprint recognition system which requires sending a fingerprint image to an external host for processing and subsequent spoof detection and matching. In particular, Match in Box is a first of a kind, portable, low-cost, and easy-to-assemble fingerprint reader with an enrollment database embedded within the reader's memory and open source fingerprint spoof detector, feature extractor, and matcher all running on the reader's internal vision processing unit (VPU). An onboard touch screen and rechargeable battery pack make this device extremely portable and ideal for applying both fingerprint authentication (1:1 comparison) and fingerprint identification (1:N search) to applications (vaccination tracking, food and benefit distribution programs, human trafficking prevention) in rural communities, especially in developing countries. We also show that Match in Box is suited for capturing neonate fingerprints due to its high resolution (1900 ppi) cameras.

CVDec 26, 2017Code
Robust Minutiae Extractor: Integrating Deep Networks and Fingerprint Domain Knowledge

Dinh-Luan Nguyen, Kai Cao, Anil K. Jain

We propose a fully automatic minutiae extractor, called MinutiaeNet, based on deep neural networks with compact feature representation for fast comparison of minutiae sets. Specifically, first a network, called CoarseNet, estimates the minutiae score map and minutiae orientation based on convolutional neural network and fingerprint domain knowledge (enhanced image, orientation field, and segmentation map). Subsequently, another network, called FineNet, refines the candidate minutiae locations based on score map. We demonstrate the effectiveness of using the fingerprint domain knowledge together with the deep networks. Experimental results on both latent (NIST SD27) and plain (FVC 2004) public domain fingerprint datasets provide comprehensive empirical support for the merits of our method. Further, our method finds minutiae sets that are better in terms of precision and recall in comparison with state-of-the-art on these two datasets. Given the lack of annotated fingerprint datasets with minutiae ground truth, the proposed approach to robust minutiae detection will be useful to train network-based fingerprint matching algorithms as well as for evaluating fingerprint individuality at scale. MinutiaeNet is implemented in Tensorflow: https://github.com/luannd/MinutiaeNet

CVDec 26, 2017Code
RaspiReader: Open Source Fingerprint Reader

Joshua J. Engelsma, Kai Cao, Anil K. Jain

We open source an easy to assemble, spoof resistant, high resolution, optical fingerprint reader, called RaspiReader, using ubiquitous components. By using our open source STL files and software, RaspiReader can be built in under one hour for only US $175. As such, RaspiReader provides the fingerprint research community a seamless and simple method for quickly prototyping new ideas involving fingerprint reader hardware. In particular, we posit that this open source fingerprint reader will facilitate the exploration of novel fingerprint spoof detection techniques involving both hardware and software. We demonstrate one such spoof detection technique by specially customizing RaspiReader with two cameras for fingerprint image acquisition. One camera provides high contrast, frustrated total internal reflection (FTIR) fingerprint images, and the other outputs direct images of the finger in contact with the platen. Using both of these image streams, we extract complementary information which, when fused together and used for spoof detection, results in marked performance improvement over previous methods relying only on grayscale FTIR images provided by COTS optical readers. Finally, fingerprint matching experiments between images acquired from the FTIR output of RaspiReader and images acquired from a COTS reader verify the interoperability of the RaspiReader with existing COTS optical readers.

CVNov 10, 2017Code
Longitudinal Study of Child Face Recognition

Debayan Deb, Neeta Nain, Anil K. Jain

We present a longitudinal study of face recognition performance on Children Longitudinal Face (CLF) dataset containing 3,682 face images of 919 subjects, in the age group [2, 18] years. Each subject has at least four face images acquired over a time span of up to six years. Face comparison scores are obtained from (i) a state-of-the-art COTS matcher (COTS-A), (ii) an open-source matcher (FaceNet), and (iii) a simple sum fusion of scores obtained from COTS-A and FaceNet matchers. To improve the performance of the open-source FaceNet matcher for child face recognition, we were able to fine-tune it on an independent training set of 3,294 face images of 1,119 children in the age group [3, 18] years. Multilevel statistical models are fit to genuine comparison scores from the CLF dataset to determine the decrease in face recognition accuracy over time. Additionally, we analyze both the verification and open-set identification accuracies in order to evaluate state-of-the-art face recognition technology for tracing and identifying children lost at a young age as victims of child trafficking or abduction.

CVAug 25, 2017Code
RaspiReader: An Open Source Fingerprint Reader Facilitating Spoof Detection

Joshua J. Engelsma, Kai Cao, Anil K. Jain

We present the design and prototype of an open source, optical fingerprint reader, called RaspiReader, using ubiquitous components. RaspiReader, a low-cost and easy to assemble reader, provides the fingerprint research community a seamless and simple method for gaining more control over the sensing component of fingerprint recognition systems. In particular, we posit that this versatile fingerprint reader will encourage researchers to explore novel spoof detection methods that integrate both hardware and software. RaspiReader's hardware is customized with two cameras for fingerprint acquisition with one camera providing high contrast, frustrated total internal reflection (FTIR) images, and the other camera outputting direct images. Using both of these image streams, we extract complementary information which, when fused together, results in highly discriminative features for fingerprint spoof (presentation attack) detection. Our experimental results demonstrate a marked improvement over previous spoof detection methods which rely only on FTIR images provided by COTS optical readers. Finally, fingerprint matching experiments between images acquired from the FTIR output of the RaspiReader and images acquired from a COTS fingerprint reader verify the interoperability of the RaspiReader with existing COTS optical readers.

CVApr 21, 2024
Universal Fingerprint Generation: Controllable Diffusion Model with Multimodal Conditions

Steven A. Grosz, Anil K. Jain

The utilization of synthetic data for fingerprint recognition has garnered increased attention due to its potential to alleviate privacy concerns surrounding sensitive biometric data. However, current methods for generating fingerprints have limitations in creating impressions of the same finger with useful intra-class variations. To tackle this challenge, we present GenPrint, a framework to produce fingerprint images of various types while maintaining identity and offering humanly understandable control over different appearance factors such as fingerprint class, acquisition type, sensor device, and quality level. Unlike previous fingerprint generation approaches, GenPrint is not confined to replicating style characteristics from the training dataset alone: it enables the generation of novel styles from unseen devices without requiring additional fine-tuning. To accomplish these objectives, we developed GenPrint using latent diffusion models with multimodal conditions (text and image) for consistent generation of style and identity. Our experiments leverage a variety of publicly available datasets for training and evaluation. Results demonstrate the benefits of GenPrint in terms of identity preservation, explainable control, and universality of generated images. Importantly, the GenPrint-generated images yield comparable or even superior accuracy to models trained solely on real data and further enhances performance when augmenting the diversity of existing real fingerprint datasets.

CVJul 4, 2025
Foundation versus Domain-specific Models: Performance Comparison, Fusion, and Explainability in Face Recognition

Redwan Sony, Parisa Farmanifard, Arun Ross et al.

In this paper, we address the following question: How do generic foundation models (e.g., CLIP, BLIP, GPT-4o, Grok-4) compare against a domain-specific face recognition model (viz., AdaFace or ArcFace) on the face recognition task? Through a series of experiments involving several foundation models and benchmark datasets, we report the following findings: (a) In all face benchmark datasets considered, domain-specific models outperformed zero-shot foundation models. (b) The performance of zero-shot generic foundation models improved on over-segmented face images compared to tightly cropped faces, thereby suggesting the importance of contextual clues. (c) A simple score-level fusion of a foundation model with a domain-specific face recognition model improved the accuracy at low false match rates. (d) Foundation models, such as GPT-4o and Grok-4, are able to provide explainability to the face recognition pipeline. In some instances, foundation models are even able to resolve low-confidence decisions made by AdaFace, thereby reiterating the importance of combining domain-specific face recognition models with generic foundation models in a judicious manner.

CVMar 29, 2025
Optimal Transport-Guided Source-Free Adaptation for Face Anti-Spoofing

Zhuowei Li, Tianchen Zhao, Xiang Xu et al.

Developing a face anti-spoofing model that meets the security requirements of clients worldwide is challenging due to the domain gap between training datasets and diverse end-user test data. Moreover, for security and privacy reasons, it is undesirable for clients to share a large amount of their face data with service providers. In this work, we introduce a novel method in which the face anti-spoofing model can be adapted by the client itself to a target domain at test time using only a small sample of data while keeping model parameters and training data inaccessible to the client. Specifically, we develop a prototype-based base model and an optimal transport-guided adaptor that enables adaptation in either a lightweight training or training-free fashion, without updating base model's parameters. Furthermore, we propose geodesic mixup, an optimal transport-based synthesis method that generates augmented training data along the geodesic path between source prototypes and target data distribution. This allows training a lightweight classifier to effectively adapt to target-specific characteristics while retaining essential knowledge learned from the source domain. In cross-domain and cross-attack settings, compared with recent methods, our method achieves average relative improvements of 19.17% in HTER and 8.58% in AUC, respectively.

CVJun 1, 2024
GenPalm: Contactless Palmprint Generation with Diffusion Models

Steven A. Grosz, Anil K. Jain

The scarcity of large-scale palmprint databases poses a significant bottleneck to advancements in contactless palmprint recognition. To address this, researchers have turned to synthetic data generation. While Generative Adversarial Networks (GANs) have been widely used, they suffer from instability and mode collapse. Recently, diffusion probabilistic models have emerged as a promising alternative, offering stable training and better distribution coverage. This paper introduces a novel palmprint generation method using diffusion probabilistic models, develops an end-to-end framework for synthesizing multiple palm identities, and validates the realism and utility of the generated palmprints. Experimental results demonstrate the effectiveness of our approach in generating palmprint images which enhance contactless palmprint recognition performance across several test databases utilizing challenging cross-database and time-separated evaluation protocols.

CVJan 16, 2024
Mobile Contactless Palmprint Recognition: Use of Multiscale, Multimodel Embeddings

Steven A. Grosz, Akash Godbole, Anil K. Jain

Contactless palmprints are comprised of both global and local discriminative features. Most prior work focuses on extracting global features or local features alone for palmprint matching, whereas this research introduces a novel framework that combines global and local features for enhanced palmprint matching accuracy. Leveraging recent advancements in deep learning, this study integrates a vision transformer (ViT) and a convolutional neural network (CNN) to extract complementary local and global features. Next, a mobile-based, end-to-end palmprint recognition system is developed, referred to as Palm-ID. On top of the ViT and CNN features, Palm-ID incorporates a palmprint enhancement module and efficient dimensionality reduction (for faster matching). Palm-ID balances the trade-off between accuracy and latency, requiring just 18ms to extract a template of size 516 bytes, which can be efficiently searched against a 10,000 palmprint gallery in 0.33ms on an AMD EPYC 7543 32-Core CPU utilizing 128-threads. Cross-database matching protocols and evaluations on large-scale operational datasets demonstrate the robustness of the proposed method, achieving a TAR of 98.06% at FAR=0.01% on a newly collected, time-separated dataset. To show a practical deployment of the end-to-end system, the entire recognition pipeline is embedded within a mobile device for enhanced user privacy and security.

CVMay 31, 2023
A Universal Latent Fingerprint Enhancer Using Transformers

Andre Brasil Vieira Wyzykowski, Anil K. Jain

Forensic science heavily relies on analyzing latent fingerprints, which are crucial for criminal investigations. However, various challenges, such as background noise, overlapping prints, and contamination, make the identification process difficult. Moreover, limited access to real crime scene and laboratory-generated databases hinders the development of efficient recognition algorithms. This study aims to develop a fast method, which we call ULPrint, to enhance various latent fingerprint types, including those obtained from real crime scenes and laboratory-created samples, to boost fingerprint recognition system performance. In closed-set identification accuracy experiments, the enhanced image was able to improve the performance of the MSU-AFIS from 61.56\% to 75.19\% in the NIST SD27 database, from 67.63\% to 77.02\% in the MSP Latent database, and from 46.90\% to 52.12\% in the NIST SD302 database. Our contributions include (1) the development of a two-step latent fingerprint enhancement method that combines Ridge Segmentation with UNet and Mix Visual Transformer (MiT) SegFormer-B5 encoder architecture, (2) the implementation of multiple dilated convolutions in the UNet architecture to capture intricate, non-local patterns better and enhance ridge segmentation, and (3) the guided blending of the predicted ridge mask with the latent fingerprint. This novel approach, ULPrint, streamlines the enhancement process, addressing challenges across diverse latent fingerprint types to improve forensic investigations and criminal justice outcomes.

CVMay 12, 2023
ViT Unified: Joint Fingerprint Recognition and Presentation Attack Detection

Steven A. Grosz, Kanishka P. Wijewardena, Anil K. Jain

A secure fingerprint recognition system must contain both a presentation attack (i.e., spoof) detection and recognition module in order to protect users against unwanted access by malicious users. Traditionally, these tasks would be carried out by two independent systems; however, recent studies have demonstrated the potential to have one unified system architecture in order to reduce the computational burdens on the system, while maintaining high accuracy. In this work, we leverage a vision transformer architecture for joint spoof detection and matching and report competitive results with state-of-the-art (SOTA) models for both a sequential system (two ViT models operating independently) and a unified architecture (a single ViT model for both tasks). ViT models are particularly well suited for this task as the ViT's global embedding encodes features useful for recognition, whereas the individual, local embeddings are useful for spoof detection. We demonstrate the capability of our unified model to achieve an average integrated matching (IM) accuracy of 98.87% across LivDet 2013 and 2015 CrossMatch sensors. This is comparable to IM accuracy of 98.95% of our sequential dual-ViT system, but with ~50% of the parameters and ~58% of the latency.

CVMay 9, 2023
Child Palm-ID: Contactless Palmprint Recognition for Children

Akash Godbole, Steven A. Grosz, Anil K. Jain

Effective distribution of nutritional and healthcare aid for children, particularly infants and toddlers, in some of the least developed and most impoverished countries of the world, is a major problem due to the lack of reliable identification documents. Biometric authentication technology has been investigated to address child recognition in the absence of reliable ID documents. We present a mobile-based contactless palmprint recognition system, called Child Palm-ID, which meets the requirements of usability, hygiene, cost, and accuracy for child recognition. Using a contactless child palmprint database, Child-PalmDB1, consisting of 19,158 images from 1,020 unique palms (in the age range of 6 mos. to 48 mos.), we report a TAR=94.11% @ FAR=0.1%. The proposed Child Palm-ID system is also able to recognize adults, achieving a TAR=99.4% on the CASIA contactless palmprint database and a TAR=100% on the COEP contactless adult palmprint database, both @ FAR=0.1%. These accuracies are competitive with the SOTA provided by COTS systems. Despite these high accuracies, we show that the TAR for time-separated child-palmprints is only 78.1% @ FAR=0.1%.

CVJan 13, 2022
RealGait: Gait Recognition for Person Re-Identification

Shaoxiong Zhang, Yunhong Wang, Tianrui Chai et al.

Human gait is considered a unique biometric identifier which can be acquired in a covert manner at a distance. However, models trained on existing public domain gait datasets which are captured in controlled scenarios lead to drastic performance decline when applied to real-world unconstrained gait data. On the other hand, video person re-identification techniques have achieved promising performance on large-scale publicly available datasets. Given the diversity of clothing characteristics, clothing cue is not reliable for person recognition in general. So, it is actually not clear why the state-of-the-art person re-identification methods work as well as they do. In this paper, we construct a new gait dataset by extracting silhouettes from an existing video person re-identification challenge which consists of 1,404 persons walking in an unconstrained manner. Based on this dataset, a consistent and comparative study between gait recognition and person re-identification can be carried out. Given that our experimental results show that current gait recognition approaches designed under data collected in controlled scenarios are inappropriate for real surveillance scenarios, we propose a novel gait recognition method, called RealGait. Our results suggest that recognizing people by their gait in real surveillance scenarios is feasible and the underlying gait pattern is probably the true reason why video person re-idenfification works in practice.

CVJan 10, 2022
PrintsGAN: Synthetic Fingerprint Generator

Joshua J. Engelsma, Steven A. Grosz, Anil K. Jain

A major impediment to researchers working in the area of fingerprint recognition is the lack of publicly available, large-scale, fingerprint datasets. The publicly available datasets that do exist contain very few identities and impressions per finger. This limits research on a number of topics, including e.g., using deep networks to learn fixed length fingerprint embeddings. Therefore, we propose PrintsGAN, a synthetic fingerprint generator capable of generating unique fingerprints along with multiple impressions for a given fingerprint. Using PrintsGAN, we synthesize a database of 525k fingerprints (35K distinct fingers, each with 15 impressions). Next, we show the utility of the PrintsGAN generated dataset by training a deep network to extract a fixed-length embedding from a fingerprint. In particular, an embedding model trained on our synthetic fingerprints and fine-tuned on a small number of publicly available real fingerprints (25K prints from NIST SD302) obtains a TAR of 87.03% @ FAR=0.01% on the NIST SD4 database (a boost from TAR=73.37% when only trained on NIST SD302). Prevailing synthetic fingerprint generation methods do not enable such performance gains due to i) lack of realism or ii) inability to generate multiple impressions per finger. We plan to release our database of synthetic fingerprints to the public.

AIJul 12, 2021
Trustworthy AI: A Computational Perspective

Haochen Liu, Yiqi Wang, Wenqi Fan et al.

In the past few decades, artificial intelligence (AI) technology has experienced swift developments, changing everyone's daily life and profoundly altering the course of human society. The intention of developing AI is to benefit humans, by reducing human labor, bringing everyday convenience to human lives, and promoting social good. However, recent research and AI applications show that AI can cause unintentional harm to humans, such as making unreliable decisions in safety-critical scenarios or undermining fairness by inadvertently discriminating against one group. Thus, trustworthy AI has attracted immense attention recently, which requires careful consideration to avoid the adverse effects that AI may bring to humans, so that humans can fully trust and live in harmony with AI technologies. Recent years have witnessed a tremendous amount of research on trustworthy AI. In this survey, we present a comprehensive survey of trustworthy AI from a computational perspective, to help readers understand the latest technologies for achieving trustworthy AI. Trustworthy AI is a large and complex area, involving various dimensions. In this work, we focus on six of the most crucial dimensions in achieving trustworthy AI: (i) Safety & Robustness, (ii) Non-discrimination & Fairness, (iii) Explainability, (iv) Privacy, (v) Accountability & Auditability, and (vi) Environmental Well-Being. For each dimension, we review the recent related technologies according to a taxonomy and summarize their applications in real-world systems. We also discuss the accordant and conflicting interactions among different dimensions and discuss potential aspects for trustworthy AI to investigate in the future.

CVMay 14, 2021
Biometrics: Trust, but Verify

Anil K. Jain, Debayan Deb, Joshua J. Engelsma

Over the past two decades, biometric recognition has exploded into a plethora of different applications around the globe. This proliferation can be attributed to the high levels of authentication accuracy and user convenience that biometric recognition systems afford end-users. However, in-spite of the success of biometric recognition systems, there are a number of outstanding problems and concerns pertaining to the various sub-modules of biometric recognition systems that create an element of mistrust in their use - both by the scientific community and also the public at large. Some of these problems include: i) questions related to system recognition performance, ii) security (spoof attacks, adversarial attacks, template reconstruction attacks and demographic information leakage), iii) uncertainty over the bias and fairness of the systems to all users, iv) explainability of the seemingly black-box decisions made by most recognition systems, and v) concerns over data centralization and user privacy. In this paper, we provide an overview of each of the aforementioned open-ended challenges. We survey work that has been conducted to address each of these concerns and highlight the issues requiring further attention. Finally, we provide insights into how the biometric community can address core biometric recognition systems design issues to better instill trust, fairness, and security for all.

CVApr 7, 2021
FedFace: Collaborative Learning of Face Recognition Model

Divyansh Aggarwal, Jiayu Zhou, Anil K. Jain

DNN-based face recognition models require large centrally aggregated face datasets for training. However, due to the growing data privacy concerns and legal restrictions, accessing and sharing face datasets has become exceedingly difficult. We propose FedFace, a federated learning (FL) framework for collaborative learning of face recognition models in a privacy-aware manner. FedFace utilizes the face images available on multiple clients to learn an accurate and generalizable face recognition model where the face images stored at each client are neither shared with other clients nor the central host and each client is a mobile device containing face images pertaining to only the owner of the device (one identity per client). Our experiments show the effectiveness of FedFace in enhancing the verification performance of pre-trained face recognition system on standard face verification benchmarks namely LFW, IJB-A, and IJB-C.

CVApr 6, 2021
C2CL: Contact to Contactless Fingerprint Matching

Steven A. Grosz, Joshua J. Engelsma, Eryun Liu et al.

Matching contactless fingerprints or finger photos to contact-based fingerprint impressions has received increased attention in the wake of COVID-19 due to the superior hygiene of the contactless acquisition and the widespread availability of low cost mobile phones capable of capturing photos of fingerprints with sufficient resolution for verification purposes. This paper presents an end-to-end automated system, called C2CL, comprised of a mobile finger photo capture app, preprocessing, and matching algorithms to handle the challenges inhibiting previous cross-matching methods; namely i) low ridge-valley contrast of contactless fingerprints, ii) varying roll, pitch, yaw, and distance of the finger to the camera, iii) non-linear distortion of contact-based fingerprints, and vi) different image qualities of smartphone cameras. Our preprocessing algorithm segments, enhances, scales, and unwarps contactless fingerprints, while our matching algorithm extracts both minutiae and texture representations. A sequestered dataset of 9,888 contactless 2D fingerprints and corresponding contact-based fingerprints from 206 subjects (2 thumbs and 2 index fingers for each subject) acquired using our mobile capture app is used to evaluate the cross-database performance of our proposed algorithm. Furthermore, additional experimental results on 3 publicly available datasets show substantial improvement in the state-of-the-art for contact to contactless fingerprint matching (TAR in the range of 96.67% to 98.30% at FAR=0.01%).

CVApr 5, 2021
Unified Detection of Digital and Physical Face Attacks

Debayan Deb, Xiaoming Liu, Anil K. Jain

State-of-the-art defense mechanisms against face attacks achieve near perfect accuracies within one of three attack categories, namely adversarial, digital manipulation, or physical spoofs, however, they fail to generalize well when tested across all three categories. Poor generalization can be attributed to learning incoherent attacks jointly. To overcome this shortcoming, we propose a unified attack detection framework, namely UniFAD, that can automatically cluster 25 coherent attack types belonging to the three categories. Using a multi-task learning framework along with k-means clustering, UniFAD learns joint representations for coherent attacks, while uncorrelated attack types are learned separately. Proposed UniFAD outperforms prevailing defense methods and their fusion with an overall TDR = 94.73% @ 0.2% FDR on a large fake face dataset consisting of 341K bona fide images and 448K attack images of 25 types across all 3 categories. Proposed method can detect an attack within 3 milliseconds on a Nvidia 2080Ti. UniFAD can also identify the attack types and categories with 75.81% and 97.37% accuracies, respectively.

CVNov 28, 2020
FaceGuard: A Self-Supervised Defense Against Adversarial Face Images

Debayan Deb, Xiaoming Liu, Anil K. Jain

Prevailing defense mechanisms against adversarial face images tend to overfit to the adversarial perturbations in the training set and fail to generalize to unseen adversarial attacks. We propose a new self-supervised adversarial defense framework, namely FaceGuard, that can automatically detect, localize, and purify a wide variety of adversarial faces without utilizing pre-computed adversarial training samples. During training, FaceGuard automatically synthesizes challenging and diverse adversarial attacks, enabling a classifier to learn to distinguish them from real faces and a purifier attempts to remove the adversarial perturbations in the image space. Experimental results on LFW dataset show that FaceGuard can achieve 99.81% detection accuracy on six unseen adversarial attack types. In addition, the proposed method can enhance the face recognition performance of ArcFace from 34.27% TAR @ 0.1% FAR under no defense to 77.46% TAR @ 0.1% FAR.

CVNov 26, 2020
Lifting 2D StyleGAN for 3D-Aware Face Generation

Yichun Shi, Divyansh Aggarwal, Anil K. Jain

We propose a framework, called LiftedGAN, that disentangles and lifts a pre-trained StyleGAN2 for 3D-aware face generation. Our model is "3D-aware" in the sense that it is able to (1) disentangle the latent space of StyleGAN2 into texture, shape, viewpoint, lighting and (2) generate 3D components for rendering synthetic images. Unlike most previous methods, our method is completely self-supervised, i.e. it neither requires any manual annotation nor 3DMM model for training. Instead, it learns to generate images as well as their 3D components by distilling the prior knowledge in StyleGAN2 with a differentiable renderer. The proposed model is able to output both the 3D shape and texture, allowing explicit pose and lighting control over generated images. Qualitative and quantitative results show the superiority of our approach over existing methods on 3D-controllable GANs in content controllability while generating realistic high quality images.

LGOct 13, 2020
To be Robust or to be Fair: Towards Fairness in Adversarial Training

Han Xu, Xiaorui Liu, Yaxin Li et al.

Adversarial training algorithms have been proved to be reliable to improve machine learning models' robustness against adversarial examples. However, we find that adversarial training algorithms tend to introduce severe disparity of accuracy and robustness between different groups of data. For instance, a PGD adversarially trained ResNet18 model on CIFAR-10 has 93% clean accuracy and 67% PGD l-infty-8 robust accuracy on the class "automobile" but only 65% and 17% on the class "cat". This phenomenon happens in balanced datasets and does not exist in naturally trained models when only using clean samples. In this work, we empirically and theoretically show that this phenomenon can happen under general adversarial training algorithms which minimize DNN models' robust errors. Motivated by these findings, we propose a Fair-Robust-Learning (FRL) framework to mitigate this unfairness problem when doing adversarial defenses. Experimental results validate the effectiveness of FRL.

CVOct 7, 2020
Infant-ID: Fingerprints for Global Good

Joshua J. Engelsma, Debayan Deb, Kai Cao et al.

In many of the least developed and developing countries, a multitude of infants continue to suffer and die from vaccine-preventable diseases and malnutrition. Lamentably, the lack of official identification documentation makes it exceedingly difficult to track which infants have been vaccinated and which infants have received nutritional supplements. Answering these questions could prevent this infant suffering and premature death around the world. To that end, we propose Infant-Prints, an end-to-end, low-cost, infant fingerprint recognition system. Infant-Prints is comprised of our (i) custom built, compact, low-cost (85 USD), high-resolution (1,900 ppi), ergonomic fingerprint reader, and (ii) high-resolution infant fingerprint matcher. To evaluate the efficacy of Infant-Prints, we collected a longitudinal infant fingerprint database captured in 4 different sessions over a 12-month time span (December 2018 to January 2020), from 315 infants at the Saran Ashram Hospital, a charitable hospital in Dayalbagh, Agra, India. Our experimental results demonstrate, for the first time, that Infant-Prints can deliver accurate and reliable recognition (over time) of infants enrolled between the ages of 2-3 months, in time for effective delivery of vaccinations, healthcare, and nutritional supplements (TAR=95.2% @ FAR = 1.0% for infants aged 8-16 weeks at enrollment and authenticated 3 months later).

LGSep 16, 2020
Transfer Learning in Deep Reinforcement Learning: A Survey

Zhuangdi Zhu, Kaixiang Lin, Anil K. Jain et al.

Reinforcement learning is a learning paradigm for solving sequential decision-making problems. Recent years have witnessed remarkable progress in reinforcement learning upon the fast development of deep neural networks. Along with the promising prospects of reinforcement learning in numerous domains such as robotics and game-playing, transfer learning has arisen to tackle various challenges faced by reinforcement learning, by transferring knowledge from external expertise to facilitate the efficiency and effectiveness of the learning process. In this survey, we systematically investigate the recent progress of transfer learning approaches in the context of deep reinforcement learning. Specifically, we provide a framework for categorizing the state-of-the-art transfer learning approaches, under which we analyze their goals, methodologies, compatible reinforcement learning backbones, and practical applications. We also draw connections between transfer learning and other relevant topics from the reinforcement learning perspective and explore their potential challenges that await future research progress.

CVAug 1, 2020
White-Box Evaluation of Fingerprint Recognition Systems

Steven A. Grosz, Joshua J. Engelsma, Anil K. Jain

Typical evaluations of fingerprint recognition systems consist of end-to-end black-box evaluations, which assess performance in terms of overall identification or authentication accuracy. However, these black-box tests of system performance do not reveal insights into the performance of the individual modules, including image acquisition, feature extraction, and matching. On the other hand, white-box evaluations, the topic of this paper, measure the individual performance of each constituent module in isolation. While a few studies have conducted white-box evaluations of the fingerprint reader, feature extractor, and matching components, no existing study has provided a full system, white-box analysis of the uncertainty introduced at each stage of a fingerprint recognition system. In this work, we extend previous white-box evaluations of fingerprint recognition system components and provide a unified, in-depth analysis of fingerprint recognition system performance based on the aggregated white-box evaluation results. In particular, we analyze the uncertainty introduced at each stage of the fingerprint recognition system due to adverse capture conditions (i.e., varying illumination, moisture, and pressure) at the time of acquisition. Our experiments show that a system that performs better overall, in terms of black-box recognition performance, does not necessarily perform best at each module in the fingerprint recognition system pipeline, which can only be seen with white-box analysis of each sub-module. Findings such as these enable researchers to better focus their efforts in improving fingerprint recognition systems.

CVJun 13, 2020
Mitigating Face Recognition Bias via Group Adaptive Classifier

Sixue Gong, Xiaoming Liu, Anil K. Jain

Face recognition is known to exhibit bias - subjects in a certain demographic group can be better recognized than other groups. This work aims to learn a fair face representation, where faces of every group could be more equally represented. Our proposed group adaptive classifier mitigates bias by using adaptive convolution kernels and attention mechanisms on faces based on their demographic attributes. The adaptive module comprises kernel masks and channel-wise attention maps for each demographic group so as to activate different facial regions for identification, leading to more discriminative features pertinent to their demographics. Our introduced automated adaptation strategy determines whether to apply adaptation to a certain layer by iteratively computing the dissimilarity among demographic-adaptive parameters. A new de-biasing loss function is proposed to mitigate the gap of average intra-class distance between demographic groups. Experiments on face benchmarks (RFW, LFW, IJB-A, and IJB-C) show that our work is able to mitigate face recognition bias across demographic groups while maintaining the competitive accuracy.

CVJun 4, 2020
Look Locally Infer Globally: A Generalizable Face Anti-Spoofing Approach

Debayan Deb, Anil K. Jain

State-of-the-art spoof detection methods tend to overfit to the spoof types seen during training and fail to generalize to unknown spoof types. Given that face anti-spoofing is inherently a local task, we propose a face anti-spoofing framework, namely Self-Supervised Regional Fully Convolutional Network (SSR-FCN), that is trained to learn local discriminative cues from a face image in a self-supervised manner. The proposed framework improves generalizability while maintaining the computational efficiency of holistic face anti-spoofing approaches (< 4 ms on a Nvidia GTX 1080Ti GPU). The proposed method is interpretable since it localizes which parts of the face are labeled as spoofs. Experimental results show that SSR-FCN can achieve TDR = 65% @ 2.0% FDR when evaluated on a dataset comprising of 13 different spoof types under unknown attacks while achieving competitive performances under standard benchmark datasets (Oulu-NPU, CASIA-MFSD, and Replay-Attack).

CRApr 8, 2020
DashCam Pay: A System for In-vehicle Payments Using Face and Voice

Cori Tymoszek, Sunpreet S. Arora, Kim Wagner et al.

We present our ongoing work on developing a system, called DashCam Pay, that enables in-vehicle payments in a seamless and secure manner using face and voice biometrics. A plug-and-play device (dashcam) mounted in the vehicle is used to capture face images and voice commands of passengers. Privacy-preserving biometric comparison techniques are used to compare the biometric data captured by the dashcam with the biometric data enrolled on the users' mobile devices over a wireless interface (e.g., Bluetooth or Wi-Fi Direct) to determine the payer. Once the payer is identified, payment is conducted using the enrolled payment credential on the mobile device of the payer. We conduct preliminary analysis on data collected using a commercially available dashcam to show the feasibility of building the proposed system. A prototype of the proposed system is also developed in Android. DashCam Pay can be integrated as a software solution by dashcam or vehicle manufacturers to enable open loop in-vehicle payments.