Josef Bigün

h-index: 40

Papers: 36

Citations: 6,652

Novelty: 34%

AI Score: 27

Ranked #156,695 of 194,257 authors (top 81%); #50,850 in CV (top 86%)

36 Papers

2.4 CV Jul 16

Explicit Over Implicit: Enhancing CNNs Via Complex Structure Tensor Representations for Periocular Recognition

Kevin Hernandez-Diaz, Josef Bigun, Fernando Alonso-Fernandez

Our study provides evidence that CNNs struggle to extract orientation features effectively. We show that using the Complex Structure Tensor, which contains compact orientation features with certainties, as input to CNNs consistently improves identification accuracy compared to grayscale inputs alone. Experiments also demonstrated that our inputs, provided by mini-complex convnets, combined with reduced CNN sizes, outperformed full-fledged, prevailing CNN architectures. This suggests that the upfront use of orientation features in CNNs, a strategy seen in mammalian vision, not only mitigates their limitations but also enhances their explainability and relevance to thin-clients. Experiments were conducted on publicly available datasets comprising periocular images (Cross-Eyed and PolyU) for biometric identification and verification in both Close-World and Open-World Scenarios using six CNN architectures. Our experiments on the Cross-Eyed and PolyU datasets yield a 5-26% reduction in EER, providing strong empirical evidence that explicit orientation priors mitigate CNN representational limits in Open-World and Close-World scenarios.

2.6 CV Mar 11, 2022 Code

LFW-Beautified: A Dataset of Face Images with Beautification and Augmented Reality Filters

Pontus Hedman, Vasilios Skepetzis, Kevin Hernandez-Diaz et al.

Selfie images enjoy huge popularity in social media. The same platforms centered around sharing this type of image offer filters to beautify them or incorporate augmented reality effects. Studies suggest that filtered images attract more views and engagement. Selfie images are also in increasing use in security applications due to mobiles becoming data hubs for many transactions. Also, video conference applications, which boomed during the pandemic, include such filters. Such filters may destroy biometric features that would allow person recognition or even detection of the face itself, even if such commodity applications are not necessarily used to compromise facial systems. This could also affect subsequent investigations like crimes in social media, where automatic analysis is usually necessary given the amount of information posted in social sites or stored in devices or cloud repositories. To help in counteracting such issues, we contribute a database of facial images that includes several manipulations. It includes image enhancement filters (which mostly modify contrast and lighting) and augmented reality filters that incorporate items like animal noses or glasses. Additionally, images with sunglasses are processed with a reconstruction network trained to learn to reverse such modifications. This is because obfuscating the eye region has been observed in the literature to have the highest impact on the accuracy of face detection or recognition. We start from the popular Labeled Faces in the Wild (LFW) database, to which we apply different modifications, generating 8 datasets. Each dataset contains 4,324 images of size 64 x 64, with a total of 34,592 images. The use of a public and widely employed face dataset allows for replication and comparison. The created database is available at https://github.com/HalmstadUniversityBiometrics/LFW-Beautified

6.5 CV Mar 27, 2022

A Survey of Super-Resolution in Iris Biometrics with Evaluation of Dictionary-Learning

F. Alonso-Fernandez, R. A. Farrugia, J. Bigun et al.

The lack of resolution has a negative impact on the performance of image-based biometrics. While many generic super-resolution methods have been proposed to restore low-resolution images, they usually aim to enhance their visual appearance. However, a visual enhancement of biometric images does not necessarily correlate with a better recognition performance. Reconstruction approaches thus need to incorporate specific information from the target biometric modality to effectively improve recognition. This paper presents a comprehensive survey of iris super-resolution approaches proposed in the literature. We have also adapted an Eigen-patches reconstruction method based on PCA Eigen-transformation of local image patches. The structure of the iris is exploited by building a patch-position dependent dictionary. In addition, image patches are restored separately, having their own reconstruction weights. This allows the solution to be locally optimized, helping to preserve local information. To evaluate the algorithm, we degraded high-resolution images from the CASIA Interval V3 database. Different restorations were considered, with 15x15 pixels being the smallest resolution. To the best of our knowledge, this is among the smallest resolutions employed in the literature. The framework is complemented with six public iris comparators, which were used to carry out biometric verification and identification experiments. Experimental results show that the proposed method significantly outperforms both bilinear and bicubic interpolation at very low resolution. The performance of a number of comparators attains an impressive Equal Error Rate as low as 5%, and a Top-1 accuracy of 77-84% when considering iris images of only 15x15 pixels. These results clearly demonstrate the benefit of using trained super-resolution techniques to improve the quality of iris images prior to matching.

9.1 CV Aug 8, 2023 Code

EFaR 2023: Efficient Face Recognition Competition

Jan Niklas Kolf, Fadi Boutros, Jurek Elliesen et al.

This paper presents the summary of the Efficient Face Recognition Competition (EFaR) held at the 2023 International Joint Conference on Biometrics (IJCB 2023). The competition received 17 submissions from 6 different teams. To drive further development of efficient face recognition models, the submitted solutions are ranked based on a weighted score of the achieved verification accuracies on a diverse set of benchmarks, as well as the deployability given by the number of floating-point operations and model size. The evaluation of submissions is extended to bias, cross-quality, and large-scale recognition benchmarks. Overall, the paper gives an overview of the achieved performance values of the submitted solutions as well as a diverse set of baselines. The submitted solutions use small, efficient network architectures to reduce the computational cost, and some solutions apply model quantization. An outlook on possible techniques that are underrepresented in current solutions is given as well.

13.6 CV Nov 10, 2022

Near-infrared and visible-light periocular recognition with Gabor features using frequency-adaptive automatic eye detection

Fernando Alonso-Fernandez, Josef Bigun

Periocular recognition has gained attention recently due to demands for increased robustness of face or iris in less controlled scenarios. We present a new system for eye detection based on complex symmetry filters, which has the advantage of not needing training. Also, separability of the filters allows faster detection via one-dimensional convolutions. This system is used as input to a periocular algorithm based on retinotopic sampling grids and Gabor spectrum decomposition. The evaluation framework is composed of six databases acquired both with near-infrared and visible sensors. The experimental setup is complemented with four iris matchers, used for fusion experiments. The eye detection system presented shows very high accuracy with near-infrared data, and a reasonably good accuracy with one visible database. Regarding the periocular system, it exhibits great robustness to small errors in locating the eye centre, as well as to scale changes of the input image. The density of the sampling grid can also be reduced without sacrificing accuracy. Lastly, despite the poorer performance of the iris matchers with visible data, fusion with the periocular system can provide an improvement of more than 20%. The six databases used have been manually annotated, with the annotation made publicly available.

3.7 CV Dec 4, 2022

Combining multiple matchers for fingerprint verification: A case study in biosecure network of excellence

Fernando Alonso-Fernandez, Julian Fierrez-Aguilar, Hartwig Fronthaler et al.

We report on experiments for the fingerprint modality conducted during the First BioSecure Residential Workshop. Two reference systems for fingerprint verification have been tested together with two additional non-reference systems. These systems follow different approaches of fingerprint processing and are discussed in detail. Fusion experiments involving different combinations of the available systems are presented. The experimental results show that the best recognition strategy involves both minutiae-based and correlation-based measurements. Regarding the fusion experiments, the best relative improvement is obtained when fusing systems that are based on heterogeneous strategies for feature extraction and/or matching. The best combinations of two/three/four systems always include the best individual systems whereas the best verification performance is obtained when combining all the available systems.

8.1 CV Dec 28, 2022

Periocular Biometrics: A Modality for Unconstrained Scenarios

Fernando Alonso-Fernandez, Josef Bigun, Julian Fierrez et al.

Periocular refers to the externally visible region of the face that surrounds the eye socket. This feature-rich area can provide accurate identification in unconstrained or uncooperative scenarios, where the iris or face modalities may not offer sufficient biometric cues due to factors such as partial occlusion or high subject-to-camera distance. The COVID-19 pandemic has further highlighted its importance, as the ocular region remained the only visible facial area even in controlled settings due to the widespread use of masks. This paper discusses the state of the art in periocular biometrics, presenting an overall framework encompassing its most significant research aspects, which include: (a) ocular definition, acquisition, and detection; (b) identity recognition, including combination with other modalities and use of various spectra; and (c) ocular soft-biometric analysis. Finally, we conclude by addressing current challenges and proposing future directions.

5.7 CV Nov 10, 2022

Experimental analysis regarding the influence of iris segmentation on the recognition rate

Heinz Hofbauer, Fernando Alonso-Fernandez, Josef Bigun et al.

In this study the authors will look at the detection and segmentation of the iris and its influence on the overall performance of the iris-biometric tool chain. The authors will examine whether the segmentation accuracy, based on conformance with a ground truth, can serve as a predictor for the overall performance of the iris-biometric tool chain. That is: If the segmentation accuracy is improved will this always improve the overall performance? Furthermore, the authors will systematically evaluate the influence of segmentation parameters, pupillary and limbic boundary and normalisation centre (based on Daugman's rubbersheet model), on the rest of the iris-biometric tool chain. The authors will investigate if accurately finding these parameters is important and how consistency, that is, extracting the same exact region of the iris during segmenting, influences the overall performance.

2.8 CV Nov 2, 2023

Log-Likelihood Score Level Fusion for Improved Cross-Sensor Smartphone Periocular Recognition

Fernando Alonso-Fernandez, Kiran B. Raja, Christoph Busch et al.

The proliferation of cameras and personal devices results in a wide variability of imaging conditions, producing large intra-class variations and a significant performance drop when images from heterogeneous environments are compared. However, many applications require dealing with data from different sources regularly, thus needing to overcome these interoperability problems. Here, we employ fusion of several comparators to improve periocular performance when images from different smartphones are compared. We use a probabilistic fusion framework based on linear logistic regression, in which fused scores tend to be log-likelihood ratios, obtaining a reduction in cross-sensor EER of up to 40% due to the fusion. Our framework also provides an elegant and simple solution to handle signals from different devices, since same-sensor and cross-sensor score distributions are aligned and mapped to a common probabilistic domain. This allows the use of Bayes thresholds for optimal decision-making, eliminating the need for sensor-specific thresholds, which is essential in operational conditions because the threshold setting critically determines the accuracy of the authentication process in many applications.
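
The fusion idea in the abstract above can be illustrated with a minimal sketch. It assumes a prior-weighted logistic-regression objective trained by plain gradient descent; the function names, the toy scores, and the hyperparameters are hypothetical and are not taken from the paper.

```python
import math

def _sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def train_llr_fusion(scores, labels, prior=0.5, lr=0.5, epochs=3000):
    """Fit w, b so that w.s + b behaves like a log-likelihood ratio (LLR).

    scores: one score vector per comparison (one entry per comparator)
    labels: 1 for genuine comparisons, 0 for impostor comparisons
    prior:  assumed prior probability of a genuine comparison (hypothetical)
    """
    n_feat = len(scores[0])
    n_gen = sum(labels)
    n_imp = len(labels) - n_gen
    prior_logit = math.log(prior / (1.0 - prior))
    w, b = [0.0] * n_feat, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_feat, 0.0
        for s, y in zip(scores, labels):
            llr = sum(wi * si for wi, si in zip(w, s)) + b
            p = _sigmoid(llr + prior_logit)  # posterior of "genuine"
            # Each class is weighted by the prior, not by its sample count.
            class_weight = prior / n_gen if y == 1 else (1.0 - prior) / n_imp
            err = class_weight * (p - y)
            for j in range(n_feat):
                grad_w[j] += err * s[j]
            grad_b += err
        w = [wi - lr * g for wi, g in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b

def fuse(w, b, s):
    # Fused score, interpreted as an approximate LLR.
    return sum(wi * si for wi, si in zip(w, s)) + b

def accept(llr, prior=0.5):
    # Bayes threshold for equal error costs: accept if posterior odds exceed 1.
    return llr > -math.log(prior / (1.0 - prior))

# Toy scores from two hypothetical comparators (higher = more similar).
genuine = [[0.8, 0.7], [0.9, 0.6], [0.7, 0.9], [0.7, 0.6]]
impostor = [[0.2, 0.3], [0.3, 0.1], [0.4, 0.5], [0.5, 0.4]]
w, b = train_llr_fusion(genuine + impostor, [1] * 4 + [0] * 4)
```

Because the fused output lives in a common log-odds domain, the decision threshold depends only on the assumed prior (and costs), not on which sensor pair produced the scores.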

3.7 CV Oct 25, 2022

Real-time AdaBoost cascade face tracker based on likelihood map and optical flow

Andreas Ranftl, Fernando Alonso-Fernandez, Stefan Karlsson et al.

The authors present a novel face tracking approach where optical flow information is incorporated into a modified version of the Viola Jones detection algorithm. In the original algorithm, detection is static, as information from previous frames is not considered. In addition, candidate windows have to pass all stages of the classification cascade, otherwise they are discarded as containing no face. In contrast, the proposed tracker preserves information about the number of classification stages passed by each window. Such information is used to build a likelihood map, which represents the probability of having a face located at that position. Tracking capabilities are provided by extrapolating the position of the likelihood map to the next frame by optical flow computation. The proposed algorithm works in real time on a standard laptop. The system is verified on the Boston Head Tracking Database, showing that the proposed algorithm outperforms the standard Viola Jones detector in terms of detection rate and stability of the output bounding box, as well as including the capability to deal with occlusions. The authors also evaluate two recently published face detectors based on convolutional networks and deformable part models with their algorithm showing a comparable accuracy at a fraction of the computation time.
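
A rough sketch of the likelihood-map idea described above follows. It assumes that each candidate window contributes the fraction of cascade stages it passed, that overlapping windows are combined with a maximum, and that the map is moved by a single integer flow vector; all three choices and both function names are illustrative assumptions, not details given in the abstract.

```python
def likelihood_map(width, height, windows, n_stages):
    """Build a per-pixel face likelihood map from cascade results.

    windows: list of (x, y, w, h, stages_passed) candidate windows.
    Each pixel keeps the highest pass ratio of any window covering it.
    """
    lmap = [[0.0] * width for _ in range(height)]
    for x, y, w, h, passed in windows:
        value = passed / n_stages
        for r in range(max(y, 0), min(y + h, height)):
            for c in range(max(x, 0), min(x + w, width)):
                lmap[r][c] = max(lmap[r][c], value)
    return lmap

def shift_map(lmap, dx, dy):
    """Extrapolate the map to the next frame with one integer flow vector."""
    height, width = len(lmap), len(lmap[0])
    out = [[0.0] * width for _ in range(height)]
    for r in range(height):
        for c in range(width):
            nr, nc = r + dy, c + dx
            if 0 <= nr < height and 0 <= nc < width:
                out[nr][nc] = lmap[r][c]
    return out

# Two overlapping windows in a 4x3 frame; the second passed all 10 stages.
lmap = likelihood_map(4, 3, [(0, 0, 2, 2, 5), (1, 1, 2, 2, 10)], n_stages=10)
shifted = shift_map(lmap, dx=1, dy=0)
```

Windows rejected late in the cascade still leave a trace in the map, which is what lets a tracker keep a hypothesis alive when the static detector alone would report no face.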

1.4 CV Dec 9, 2022

Visual Detection of Personal Protective Equipment and Safety Gear on Industry Workers

Jonathan Karlsson, Fredrik Strand, Josef Bigun et al.

Workplace injuries are common in today's society due to a lack of adequately worn safety equipment. A system that only admits appropriately equipped personnel can be created to improve working conditions. The goal is thus to develop a system that will improve workers' safety using a camera that will detect the usage of Personal Protective Equipment (PPE). To this end, we collected and labeled appropriate data from several public sources, which have been used to train and evaluate several models based on the popular YOLOv4 object detector. Our focus, driven by a collaborating industrial partner, is to implement our system into an entry control point where workers must present themselves to obtain access to a restricted area. Combined with facial identity recognition, the system would ensure that only authorized people wearing appropriate equipment are granted access. A novelty of this work is that we increase the number of classes to five objects (hardhat, safety vest, safety gloves, safety glasses, and hearing protection), whereas most existing works only focus on one or two classes, usually hardhats or vests. The AI model developed provides good detection accuracy at a distance of 3 and 5 meters in the collaborative environment where we aim at operating (mAP of 99/89%, respectively). The small size of some objects or the potential occlusion by body parts have been identified as potential factors that are detrimental to accuracy, which we have counteracted via data augmentation and cropping of the body before applying PPE detection.

3.7 CV Oct 18, 2022

Very Low-Resolution Iris Recognition Via Eigen-Patch Super-Resolution and Matcher Fusion

Fernando Alonso-Fernandez, Reuben A. Farrugia, Josef Bigun

Current research in iris recognition is moving towards enabling more relaxed acquisition conditions. This has effects on the quality of acquired images, with low resolution being a predominant issue. Here, we evaluate a super-resolution algorithm used to reconstruct iris images based on Eigen-transformation of local image patches. Each patch is reconstructed separately, allowing better quality of enhanced images by preserving local information. Contrast enhancement is used to improve the reconstruction quality, while matcher fusion has been adopted to improve iris recognition performance. We validate the system using a database of 1,872 near-infrared iris images. The presented approach is superior to bilinear or bicubic interpolation, especially at lower resolutions, and the fusion of the two systems pushes the EER to below 5% for down-sampling factors up to an image size of only 13x13.

1.5 CV Jul 26, 2023

Periocular biometrics: databases, algorithms and directions

Fernando Alonso-Fernandez, Josef Bigun

Periocular biometrics has been established as an independent modality due to concerns about the performance of iris or face systems in uncontrolled conditions. Periocular refers to the facial region in the eye vicinity, including eyelids, lashes and eyebrows. It is available over a wide range of acquisition distances, representing a trade-off between the whole face (which can be occluded at close distances) and the iris texture (which does not have enough resolution at long distances). Since the periocular region appears in face or iris images, it can also be used in conjunction with these modalities. Features extracted from the periocular region have also been used successfully for gender classification and ethnicity classification, and to study the impact of gender transformation or plastic surgery on the recognition performance. This paper presents a review of the state of the art in periocular biometric research, providing insight into the most relevant issues and giving a thorough coverage of the existing literature. Future research trends are also briefly discussed.

3.9 CV Jul 25, 2023

An Explainable Model-Agnostic Algorithm for CNN-based Biometrics Verification

Fernando Alonso-Fernandez, Kevin Hernandez-Diaz, Jose M. Buades et al.

This paper describes an adaptation of the Local Interpretable Model-Agnostic Explanations (LIME) AI method to operate under a biometric verification setting. LIME was initially proposed for networks with the same output classes used for training, and it employs the softmax probability to determine which regions of the image contribute the most to classification. However, in a verification setting, the classes to be recognized have not been seen during training. In addition, instead of using the softmax output, face descriptors are usually obtained from a layer before the classification layer. The model is adapted to achieve explainability via cosine similarity between feature vectors of perturbed versions of the input image. The method is showcased for face biometrics with two CNN models based on MobileNetv2 and ResNet50.
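
The switch from softmax probability to cosine similarity can be sketched as below. This is a simplified stand-in, not LIME itself: instead of fitting LIME's weighted linear surrogate, it scores each segment by the difference in mean similarity between samples where the segment is kept and samples where it is masked. The `embed` callable, the comparison against the unperturbed image's embedding, and the toy data are all assumptions made for illustration.

```python
import math
import random

def cosine_similarity(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu > 0 and nv > 0 else 0.0

def segment_importance(segments, reference, embed, n_samples=200, seed=0):
    """Score segments by how much keeping them raises similarity to `reference`.

    segments:  list of pixel lists, one per image segment (e.g. superpixel)
    reference: feature vector the perturbed images are compared against
    embed:     hypothetical function mapping a list of segments to a vector
    """
    rng = random.Random(seed)
    n = len(segments)
    kept_sum, kept_cnt = [0.0] * n, [0] * n
    masked_sum, masked_cnt = [0.0] * n, [0] * n
    for _ in range(n_samples):
        mask = [rng.random() < 0.5 for _ in range(n)]
        # Masked segments are blanked out with zeros.
        perturbed = [seg if keep else [0.0] * len(seg)
                     for seg, keep in zip(segments, mask)]
        sim = cosine_similarity(embed(perturbed), reference)
        for i, keep in enumerate(mask):
            if keep:
                kept_sum[i] += sim
                kept_cnt[i] += 1
            else:
                masked_sum[i] += sim
                masked_cnt[i] += 1
    return [kept_sum[i] / max(kept_cnt[i], 1)
            - masked_sum[i] / max(masked_cnt[i], 1) for i in range(n)]

def toy_embed(segs):
    # Stand-in for a CNN descriptor: simply flattens the segments.
    return [value for seg in segs for value in seg]

segments = [[5.0, 5.0], [0.1, 0.1], [1.0, 1.0]]
importance = segment_importance(segments, toy_embed(segments), toy_embed)
```

In a real setting `embed` would be the descriptor layer of a face network, and the reference vector could equally be the embedding of the enrolled image being verified against.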

3.9 CV Jul 11, 2023

One-Shot Learning for Periocular Recognition: Exploring the Effect of Domain Adaptation and Data Bias on Deep Representations

Kevin Hernandez-Diaz, Fernando Alonso-Fernandez, Josef Bigun

One weakness of machine-learning algorithms is the need to train the models for a new task. This presents a specific challenge for biometric recognition due to the dynamic nature of databases and, in some instances, the reliance on subject collaboration for data collection. In this paper, we investigate the behavior of deep representations in widely used CNN models under extreme data scarcity for One-Shot periocular recognition, a biometric recognition task. We analyze the outputs of CNN layers as identity-representing feature vectors. We examine the impact of Domain Adaptation on the network layers' output for unseen data and evaluate the method's robustness concerning data normalization and generalization of the best-performing layer. We improved state-of-the-art results that made use of networks trained with biometric datasets with millions of images and fine-tuned for the target periocular dataset by utilizing out-of-the-box CNNs trained for the ImageNet Recognition Challenge and standard computer vision algorithms. For example, for the Cross-Eyed dataset, we could reduce the EER by 67% and 79% (from 1.70% and 3.41% to 0.56% and 0.71%) in the Close-World and Open-World protocols, respectively, for the periocular case. We also demonstrate that traditional algorithms like SIFT can outperform CNNs in situations with limited data or scenarios where the network has not been trained with the test classes like the Open-World mode. SIFT alone was able to reduce the EER by 64% and 71.6% (from 1.7% and 3.41% to 0.6% and 0.97%) for Cross-Eyed in the Close-World and Open-World protocols, respectively, and a reduction of 4.6% (from 3.94% to 3.76%) in the PolyU database for the Open-World and single biometric case.
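
The relative reductions quoted above are consistent with the usual definition (baseline minus new, divided by baseline). A small check, with a hypothetical helper name:

```python
def relative_eer_reduction(baseline_eer, new_eer):
    """Relative EER reduction in percent; both inputs are EERs in percent."""
    return 100.0 * (baseline_eer - new_eer) / baseline_eer

# (baseline EER, new EER) pairs as reported in the abstract above.
reported = {
    "CNN, Cross-Eyed, Close-World": (1.70, 0.56),
    "CNN, Cross-Eyed, Open-World": (3.41, 0.71),
    "SIFT, Cross-Eyed, Close-World": (1.70, 0.60),
    "SIFT, Cross-Eyed, Open-World": (3.41, 0.97),
    "SIFT, PolyU, Open-World": (3.94, 3.76),
}
reductions = {name: relative_eer_reduction(old, new)
              for name, (old, new) in reported.items()}
```

The computed values land at about 67%, 79%, 65%, 72%, and 4.6%, matching the abstract's figures to within rounding.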

3.7 CV Oct 18, 2022

Compact multi-scale periocular recognition using SAFE features

Fernando Alonso-Fernandez, Anna Mikaelyan, Josef Bigun

In this paper, we present a new approach for periocular recognition based on the Symmetry Assessment by Feature Expansion (SAFE) descriptor, which encodes the presence of various symmetric curve families around image key points. We use the sclera center as single key point for feature extraction, highlighting the object-like identity properties that concentrate at this unique point of the eye. As demonstrated, such discriminative properties can be encoded with a reduced set of symmetric curves. Experiments are done with a database of periocular images captured with a digital camera. We test our system against reference periocular features, achieving top performance with a considerably smaller feature vector (given by the use of a single key point). All the systems tested also show a nearly steady correlation between acquisition distance and performance, and they are also able to cope well when enrolment and test images are not captured at the same distance. Fusion experiments among the available systems are also provided.

6.5 CV Jul 28, 2024

Combined CNN and ViT features off-the-shelf: Another astounding baseline for recognition

Fernando Alonso-Fernandez, Kevin Hernandez-Diaz, Prayag Tiwari et al.

We apply pre-trained architectures, originally developed for the ImageNet Large Scale Visual Recognition Challenge, for periocular recognition. These architectures have demonstrated significant success in various computer vision tasks beyond the ones for which they were designed. This work builds on our previous study using off-the-shelf Convolutional Neural Network (CNN) and extends it to include the more recently proposed Vision Transformers (ViT). Despite being trained for generic object classification, middle-layer features from CNNs and ViTs are a suitable way to recognize individuals based on periocular images. We also demonstrate that CNNs and ViTs are highly complementary since their combination results in boosted accuracy. In addition, we show that a small portion of these pre-trained models can achieve good accuracy, resulting in thinner models with fewer parameters, suitable for resource-limited environments such as mobiles. This efficiency improves if traditional handcrafted features are added as well.

3.9 CV Jul 20, 2023

SqueezerFaceNet: Reducing a Small Face Recognition CNN Even More Via Filter Pruning

Fernando Alonso-Fernandez, Kevin Hernandez-Diaz, Jose Maria Buades Rubio et al.

The widespread use of mobile devices for various digital services has created a need for reliable and real-time person authentication. In this context, facial recognition technologies have emerged as a dependable method for verifying users due to the prevalence of cameras in mobile devices and their integration into everyday applications. The rapid advancement of deep Convolutional Neural Networks (CNNs) has led to numerous face verification architectures. However, these models are often large and impractical for mobile applications, reaching sizes of hundreds of megabytes with millions of parameters. We address this issue by developing SqueezerFaceNet, a light face recognition network with fewer than 1M parameters. This is achieved by applying a network pruning method based on Taylor scores, where filters with small importance scores are removed iteratively. Starting from an already small network (of 1.24M) based on SqueezeNet, we show that it can be further reduced (up to 40%) without an appreciable loss in performance. To the best of our knowledge, we are the first to evaluate network pruning methods for the task of face recognition.
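
One pruning step of the kind described above might look like the following sketch. It assumes the first-order Taylor criterion, in which a filter's importance is the absolute value of its mean activation-times-gradient product; the abstract does not say which Taylor variant is used, and the function names and toy values are hypothetical.

```python
def taylor_scores(activations, gradients):
    """First-order Taylor importance per filter (assumed criterion).

    activations[f], gradients[f]: flat lists holding filter f's feature-map
    values and the loss gradients with respect to those values.
    """
    scores = []
    for act, grad in zip(activations, gradients):
        contribution = sum(a * g for a, g in zip(act, grad)) / len(act)
        scores.append(abs(contribution))
    return scores

def filters_to_prune(scores, fraction):
    """Indices of the lowest-scoring filters, removing `fraction` of them."""
    k = int(len(scores) * fraction)
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    return sorted(order[:k])

# Toy feature maps and gradients for four filters.
activations = [[1.0, 2.0], [0.1, 0.1], [3.0, 1.0], [0.5, 0.5]]
gradients = [[0.5, 0.5], [0.1, -0.1], [-1.0, 0.2], [0.2, 0.2]]
scores = taylor_scores(activations, gradients)
pruned = filters_to_prune(scores, fraction=0.5)
```

In an iterative scheme this step would be repeated, typically with some fine-tuning between rounds, until the target reduction is reached.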

1.5 CV Nov 3, 2023

Keypoint Description by Symmetry Assessment -- Applications in Biometrics

Anna Mikaelyan, Fernando Alonso-Fernandez, Josef Bigun

We present a model-based feature extractor to describe neighborhoods around keypoints by finite expansion, estimating the spatially varying orientation by harmonic functions. The iso-curves of such functions are highly symmetric w.r.t. the origin (a keypoint) and the estimated parameters have well defined geometric interpretations. The origin is also a unique singularity of all harmonic functions, helping to determine the location of a keypoint precisely, whereas the functions describe the object shape of the neighborhood. This is novel and complementary to traditional texture features which describe texture-shape properties i.e. they are purposively invariant to translation (within a texture). We report on experiments of verification and identification of keypoints in forensic fingerprints by using publicly available data (NIST SD27) and discuss the results in comparison to other studies. These support our conclusions that the novel features can equip single cores or single minutia with a significant verification power at 19% EER, and an identification power of 24-78% for ranks of 1-20. Additionally, we report verification results of periocular biometrics using near-infrared images, reaching an EER performance of 13%, which is comparable to the state of the art. More importantly, fusion of two systems, ours and texture features (Gabor), results in a measurable performance improvement. We report reduction of the EER to 9%, supporting the view that the novel features capture relevant visual information, which traditional texture features do not.

2.0 CV Apr 24, 2024

Understanding and Improving CNNs with Complex Structure Tensor: A Biometrics Study

Kevin Hernandez-Diaz, Josef Bigun, Fernando Alonso-Fernandez

Our study provides evidence that CNNs struggle to effectively extract orientation features. We show that the use of Complex Structure Tensor, which contains compact orientation features with certainties, as input to CNNs consistently improves identification accuracy compared to using grayscale inputs alone. Experiments also demonstrated that our inputs, which were provided by mini complex conv-nets, combined with reduced CNN sizes, outperformed full-fledged, prevailing CNN architectures. This suggests that the upfront use of orientation features in CNNs, a strategy seen in mammalian vision, not only mitigates their limitations but also enhances their explainability and relevance to thin-clients. Experiments were done on publicly available data sets comprising periocular images for biometric identification and verification (Close and Open World) using 6 State of the Art CNN architectures. We reduced SOA Equal Error Rate (EER) on the PolyU dataset by 5-26% depending on data and scenario.

3.7 CV Nov 24, 2022

Fingerprint Image-Quality Estimation and its Application to Multialgorithm Verification

Hartwig Fronthaler, Klaus Kollreider, Josef Bigun et al.

Signal-quality awareness has been found to increase recognition rates and to support decisions in multisensor environments significantly. Nevertheless, automatic quality assessment is still an open issue. Here, we study the orientation tensor of fingerprint images to quantify signal impairments, such as noise, lack of structure, blur, with the help of symmetry descriptors. A strongly reduced reference is especially favorable in biometrics, but less information is not sufficient for the approach. This is also supported by numerous experiments involving a simpler quality estimator, a trained method (NFIQ), as well as the human perception of fingerprint quality on several public databases. Furthermore, quality measurements are extensively reused to adapt fusion parameters in a monomodal multialgorithm fingerprint recognition environment. In this study, several trained and nontrained score-level fusion schemes are investigated. A Bayes-based strategy for incorporating experts' past performances and current quality conditions, a novel cascaded scheme for computational efficiency, besides simple fusion rules, are presented. The quantitative results favor quality awareness under all aspects, boosting recognition rates and fusing differently skilled experts efficiently as well as effectively (by training).

1.4 CV Jan 26, 2022

Continuous Examination by Automatic Quiz Assessment Using Spiral Codes and Image Processing

Fernando Alonso-Fernandez, Josef Bigun

We describe a technical solution implemented at Halmstad University to automatise assessment and reporting of results of paper-based quiz exams. Paper quizzes are affordable and within reach of campus education in classrooms. Offering and taking them is accepted as they cause fewer issues with reliability and democratic access, e.g. a large number of students can take them without a trusted mobile device, internet, or battery. By contrast, correction of the quiz is a considerable obstacle. We suggest mitigating the issue by a novel image processing technique using harmonic spirals that aligns answer sheets with sub-pixel accuracy to read student identity and answers and to email results within minutes, all fully automatically. Using the described method, we carry out regular weekly examinations in two master courses at the mentioned centre without a significant workload increase. The employed solution also enables us to assign a unique identifier to each quiz (e.g. week 1, week 2, ...) while allowing us to have an individualised quiz for each student.

1.4 CV Jan 25, 2022

Writer Recognition Using Off-line Handwritten Single Block Characters

Adrian Leo Hagström, Rustam Stanikzai, Josef Bigun et al.

Block characters are often used when filling paper forms for a variety of purposes. We investigate if there is biometric information contained within individual digits of handwritten text. In particular, we use personal identity numbers consisting of the six digits of the date of birth, DoB. We evaluate two recognition approaches, one based on handcrafted features that compute contour directional measurements, and another based on deep features from a ResNet50 model. We use a self-captured database of 317 individuals and 4920 written DoBs in total. Results show the presence of identity-related information in a piece of handwritten information as small as six digits with the DoB. We also analyze the impact of the number of enrolment samples, varying it between one and ten. Results with such a small amount of data are promising. With ten enrolment samples, the Top-1 accuracy with deep features is around 94%, and reaches nearly 100% by Top-10. The verification accuracy is more modest, with EER > 20% with any given feature and enrolment set size, showing that there is still room for improvement.
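
The Equal Error Rate (EER) used throughout these abstracts is the operating point where the false acceptance rate equals the false rejection rate. A minimal sketch of how it can be estimated from two score lists is given below; it assumes higher scores mean greater similarity and sweeps only thresholds that occur in the data, which is a common approximation rather than any specific paper's protocol.

```python
def equal_error_rate(genuine, impostor):
    """Approximate EER from genuine and impostor similarity scores.

    A comparison is accepted when its score is >= the threshold.
    Returns the mean of FAR and FRR at the threshold where they are closest.
    """
    thresholds = sorted(set(genuine) | set(impostor))
    best_gap, best_eer = None, None
    for t in thresholds:
        frr = sum(s < t for s in genuine) / len(genuine)     # false rejections
        far = sum(s >= t for s in impostor) / len(impostor)  # false acceptances
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap, best_eer = gap, (far + frr) / 2.0
    return best_eer

# Toy scores: one genuine score falls below one impostor score.
eer_overlap = equal_error_rate([0.9, 0.8, 0.7, 0.4], [0.1, 0.2, 0.3, 0.6])
# Toy scores with perfect separation.
eer_clean = equal_error_rate([0.8, 0.9], [0.1, 0.2])
```

With few scores the estimate is coarse; larger evaluations usually interpolate between neighbouring thresholds instead of taking the closest point.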

11.6CVNov 14, 2021

A Comparative Study of Fingerprint Image-Quality Estimation Methods

Fernando Alonso-Fernandez, Julian Fierrez, Javier Ortega-Garcia et al.

One of the open issues in fingerprint verification is the lack of robustness against image-quality degradation. Poor-quality images result in spurious and missing features, thus degrading the performance of the overall system. Therefore, it is important for a fingerprint recognition system to estimate the quality and validity of the captured fingerprint images. In this work, we review existing approaches for fingerprint image-quality estimation, including the rationale behind the published measures and visual examples showing their behavior under different quality conditions. We have also tested a selection of fingerprint image-quality estimation algorithms. For the experiments, we employ the BioSec multimodal baseline corpus, which includes 19200 fingerprint images from 200 individuals acquired in two sessions with three different sensors. The behavior of the selected quality measures is compared, showing high correlation between them in most cases. The effect of low-quality samples in the verification performance is also studied for a widely available minutiae-based fingerprint matching system.

8.0CVOct 17, 2021

On the Effect of Selfie Beautification Filters on Face Detection and Recognition

Pontus Hedman, Vasilios Skepetzis, Kevin Hernandez-Diaz et al.

Beautification and augmented reality filters are very popular in applications that use selfie images captured with smartphones or personal devices. However, they can distort or modify biometric features, severely affecting the capability of recognizing individuals' identity or even detecting the face. Accordingly, we address the effect of such filters on the accuracy of automated face detection and recognition. The social media image filters studied either modify the image contrast or illumination or occlude parts of the face with, for example, artificial glasses or animal noses. We observe that the effect of some of these filters is harmful both to face detection and identity recognition, especially if they obfuscate the eye or (to a lesser extent) the nose. To counteract such effect, we develop a method to reverse the applied manipulation with a modified version of the U-NET segmentation network. This is observed to contribute to a better face detection and recognition accuracy. From a recognition perspective, we employ distance measures and trained machine learning algorithms applied to features extracted using a ResNet-34 network trained to recognize faces. We also evaluate if incorporating filtered images into the training set of machine learning approaches is beneficial for identity recognition. Our results show good recognition when filters do not occlude important landmarks, especially the eyes (identification accuracy >99%, EER <2%). The combined effect of the proposed approaches also allows mitigating the effect produced by filters that occlude parts of the face, achieving an identification accuracy of >92% with the majority of perturbations evaluated, and an EER <8%. Although there is room for improvement, this is a clear gain: when neither U-NET reconstruction nor training with filtered images is applied, filters that severely occlude the eye yield an identification accuracy <72% and an EER >12%.

6.5CVMar 31, 2021

Facial Masks and Soft-Biometrics: Leveraging Face Recognition CNNs for Age and Gender Prediction on Mobile Ocular Images

Fernando Alonso-Fernandez, Kevin Hernandez Diaz, Silvia Ramis et al.

We address the use of selfie ocular images captured with smartphones to estimate age and gender. Partial face occlusion has become an issue due to the mandatory use of face masks. Also, the use of mobile devices has exploded, with the pandemic further accelerating the migration to digital services. However, state-of-the-art solutions in related tasks such as identity or expression recognition employ large Convolutional Neural Networks, whose use in mobile devices is infeasible due to hardware limitations and size restrictions of downloadable applications. To counteract this, we adapt two existing lightweight CNNs proposed in the context of the ImageNet Challenge, and two additional architectures proposed for mobile face recognition. Since datasets for soft-biometrics prediction using selfie images are limited, we counteract over-fitting by using networks pre-trained on ImageNet. Furthermore, some networks are further pre-trained for face recognition, for which very large training databases are available. Since both tasks employ similar input data, we hypothesize that such strategy can be beneficial for soft-biometrics estimation. A comprehensive study of the effects of different pre-training over the employed architectures is carried out, showing that, in most cases, a better accuracy is obtained after the networks have been fine-tuned for face recognition.

4.2CVAug 26, 2020

Cross-Spectral Periocular Recognition with Conditional Adversarial Networks

Kevin Hernandez-Diaz, Fernando Alonso-Fernandez, Josef Bigun

This work addresses the challenge of comparing periocular images captured in different spectra, which is known to produce significant drops in performance in comparison to operating in the same spectrum. We propose the use of Conditional Generative Adversarial Networks, trained to convert periocular images between visible and near-infrared spectra, so that biometric verification is carried out in the same spectrum. The proposed setup allows the use of existing feature methods typically optimized to operate in a single spectrum. Recognition experiments are done using a number of off-the-shelf periocular comparators based both on hand-crafted features and CNN descriptors. Using the Hong Kong Polytechnic University Cross-Spectral Iris Images Database (PolyU) as benchmark dataset, our experiments show that cross-spectral performance is substantially improved if both images are converted to the same spectrum, in comparison to matching features extracted from images in different spectra. In addition to this, we fine-tune a CNN based on the ResNet50 architecture, obtaining a cross-spectral periocular performance of EER=1%, and GAR>99% @ FAR=1%, which is comparable to the state-of-the-art with the PolyU database.

0.5CLJul 31, 2020

Writer Identification Using Microblogging Texts for Social Media Forensics

Fernando Alonso-Fernandez, Nicole Mariah Sharon Belvisi, Kevin Hernandez-Diaz et al.

Establishing authorship of online texts is fundamental to combat cybercrimes. Unfortunately, text length is limited on some platforms, making the challenge harder. We aim at identifying the authorship of Twitter messages limited to 140 characters. We evaluate popular stylometric features, widely used in literary analysis, and specific Twitter features like URLs, hashtags, replies or quotes. We use two databases with 93 and 3957 authors, respectively. We test varying-sized author sets and varying amounts of training/test texts per author. Performance is further improved by feature combination via automatic selection. With a large number of training Tweets (>500), a good accuracy (Rank-5 >80%) is achievable with only a few dozen test Tweets, even with several thousands of authors. With smaller sample sizes (10-20 training Tweets), the search space can be diminished by 9-15% while keeping a high chance that the correct author is retrieved among the candidates. In such cases, automatic attribution can provide significant time savings to experts in suspect search. For completeness, we report verification results. With few training/test Tweets, the EER is above 20-25%, which is reduced to <15% if hundreds of training Tweets are available. We also quantify the computational complexity and time permanence of the employed features.

5.0CVJul 16, 2020

SqueezeFacePoseNet: Lightweight Face Verification Across Different Poses for Mobile Platforms

Fernando Alonso-Fernandez, Javier Barrachina, Kevin Hernandez-Diaz et al.

Virtual applications through mobile platforms are one of the most critical and ever-growing fields in AI, where ubiquitous and real-time person authentication has become critical after the breakthrough of all services provided via mobile devices. In this context, face verification technologies can provide reliable and robust user authentication, given the availability of cameras in these devices, as well as their widespread use in everyday applications. The rapid development of deep Convolutional Neural Networks has resulted in many accurate face verification architectures. However, their typical size (hundreds of megabytes) makes them infeasible to be incorporated in downloadable mobile applications where the entire file typically may not exceed 100 MB. Accordingly, we address the challenge of developing a lightweight face recognition network of just a few megabytes that can operate with sufficient accuracy in comparison to much larger models. The network also should be able to operate under different poses, given the variability naturally observed in uncontrolled environments where mobile devices are typically used. In this paper, we adapt the lightweight SqueezeNet model, of just 4.4 MB, to effectively provide cross-pose face recognition. After being trained on the MS-Celeb-1M and VGGFace2 databases, our model achieves an EER of 1.23% on the difficult frontal vs. profile comparison, and 0.54% on profile vs. profile images. Under less extreme variations involving frontal images in any of the enrolment/query images pair, EER is pushed down to <0.3%, and the FRR at FAR=0.1% to less than 1%. This makes our light model suitable for face recognition where at least acquisition of the enrolment image can be controlled. At the cost of a slight degradation in performance, we also test an even lighter model (of just 2.5 MB) where regular convolutions are replaced with depth-wise separable convolutions.
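
The size reduction from replacing regular convolutions with depth-wise separable ones follows from a simple weight count. The sketch below compares the two for a single layer, ignoring biases; the kernel size and channel counts are illustrative assumptions, not the actual SqueezeFacePoseNet configuration, and the function names are hypothetical.

```python
def conv_params(k, c_in, c_out):
    """Weights in a regular k x k convolution (biases ignored)."""
    return k * k * c_in * c_out


def separable_params(k, c_in, c_out):
    """Weights in a depth-wise separable convolution (biases ignored)."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


# Illustrative layer: 3 x 3 kernel, 64 input channels, 128 output channels.
k, c_in, c_out = 3, 64, 128
regular = conv_params(k, c_in, c_out)          # 73728 weights
separable = separable_params(k, c_in, c_out)   # 8768 weights
print(regular, separable, round(regular / separable, 2))
```

For this layer the separable variant needs roughly eight times fewer weights. The saving over a whole network is smaller, since not every layer is a k x k convolution.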

1.2CVMay 16, 2020

Total Least Square Optimal Analytic Signal by Structure Tensor for N-D images

Josef Bigun, Fernando Alonso-Fernandez

We produce the analytic signal by using the Structure Tensor, which provides Total Least Squares optimal vectors for estimating orientation and scale locally. Together, these vectors represent N-D frequency components that determine adaptive, complex probing filters. The N-D analytic signal is obtained through scalar products of adaptive filters with image neighborhoods. It comprises orientation, scale, phase, and amplitude information of the neighborhood. The ST analytic signal $f_A$ is continuous and isotropic, and its extension to N-D is straightforward. The phase gradient can be represented as a vector (instantaneous frequency) or as a tensor. Both are continuous and isotropic, while the tensor additionally preserves continuity of orientation and retains the same information as the vector representation. The tensor representation can also be used to detect singularities. Detection with known phase portraits has been demonstrated in 2-D with relevance to fringe pattern processing in wave physics, including optics and fingerprint measurements. To construct adaptive filters we have used Gabor filter family members as probing functions, but other function families can also be used to sample the spectrum, e.g., quadrature filters. A comparison to three baseline alternatives, in representation (Monogenic signal), enhancement (Monogenic signal combined with a spline-wavelet pyramid), and singularity detection (mindtct, a fingerprint minutia detector widely used in numerous studies), is also reported using images with precisely known ground truths for location, orientation, singularity type (where applicable), and wave period.
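
The orientation estimate that the Structure Tensor provides can be illustrated in 2-D with its complex form: squaring the complex gradient doubles its angle, so opposite gradient directions reinforce each other instead of cancelling. The sketch below covers only this orientation-estimation building block, not the analytic signal construction of the paper. It treats the whole image as one neighbourhood with a uniform window (a Gaussian window is more usual), and all names and the test pattern are illustrative assumptions.

```python
import cmath
import math


def dominant_orientation(img):
    """Return (angle in [0, pi), certainty in [0, 1]) for a 2-D list of floats."""
    h, w = len(img), len(img[0])
    z = 0j        # sum of squared complex gradients (double-angle form)
    energy = 0.0  # sum of squared gradient magnitudes
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            # Central-difference gradient at (x, y).
            gx = (img[y][x + 1] - img[y][x - 1]) / 2
            gy = (img[y + 1][x] - img[y - 1][x]) / 2
            g = complex(gx, gy)
            z += g * g
            energy += abs(g) ** 2
    # Halve the double angle to recover the gradient direction modulo pi.
    angle = (cmath.phase(z) / 2) % math.pi
    certainty = abs(z) / energy if energy > 0 else 0.0
    return angle, certainty


def plane_wave(size, theta, omega=0.3):
    """Synthetic test pattern: a cosine wave travelling in direction theta."""
    return [[math.cos(omega * (x * math.cos(theta) + y * math.sin(theta)))
             for x in range(size)] for y in range(size)]


angle, certainty = dominant_orientation(plane_wave(32, math.radians(30)))
print(math.degrees(angle), certainty)
```

For a single plane wave all gradients are collinear, so the certainty is close to 1. Isotropic textures or noise, with no dominant direction, drive it towards 0.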

3.3CVFeb 14, 2020

Spectrum Translation for Cross-Spectral Ocular Matching

Kevin Hernandez Diaz, Fernando Alonso-Fernandez, Josef Bigun

Cross-spectral verification remains a big issue in biometrics, especially for the ocular area due to differences in the reflected features in the images depending on the region and spectrum used. In this paper, we investigate the use of Conditional Adversarial Networks for spectrum translation between near infra-red and visual light images for ocular biometrics. We analyze the transformation based on the overall visual quality of the transformed images and the accuracy drop of the identification system when trained with opposing data. We use the PolyU database and propose two different systems for biometric verification, the first one based on Siamese Networks trained with Softmax and Cross-Entropy loss, and the second one a Triplet Loss network. We achieved an EER of 1% when using a Triplet Loss network trained for NIR and finding the Euclidean distance between the real NIR images and the fake ones translated from the visible spectrum. We also outperform previous results using baseline algorithms.

8.5CVFeb 21, 2019

Cross-Sensor Periocular Biometrics in a Global Pandemic: Comparative Benchmark and Novel Multialgorithmic Approach

Fernando Alonso-Fernandez, Kiran B. Raja, R. Raghavendra et al.

The massive availability of cameras results in a wide variability of imaging conditions, producing large intra-class variations and a significant performance drop if heterogeneous images are compared for person recognition. However, as biometrics is deployed, it is common to replace damaged or obsolete hardware, or to exchange information between heterogeneous applications. Variations in spectral bands can also occur. For example, surveillance face images (typically acquired in the visible spectrum, VIS) may need to be compared against a legacy iris database (typically acquired in near-infrared, NIR). Here, we propose a multialgorithmic approach to cope with periocular images from different sensors. With face masks in the front line against COVID-19, periocular recognition is regaining popularity since it is the only face region that remains visible. We integrate different comparators with a fusion scheme based on linear logistic regression, in which scores are represented by log-likelihood ratios. This allows easy interpretation of scores and the use of Bayes thresholds for optimal decision-making since scores from different comparators are in the same probabilistic range. We evaluate our approach in the context of the Cross-Eyed Competition, whose aim was to compare recognition approaches when NIR and VIS periocular images are matched. Our approach achieves EER=0.2% and FRR of just 0.47% at FAR=0.01%, representing the best overall approach of the competition. Experiments are also reported with a database of VIS images from different smartphones. We also discuss the impact of template size and computation times, with the most computationally heavy comparator playing an important role in the results. Lastly, the proposed method is shown to outperform other popular fusion approaches, such as the average of scores, SVMs or Random Forest.
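
Fusion by linear logistic regression can be sketched as follows: the fused score is a weighted sum of the comparator scores plus a bias, trained on labelled genuine/impostor comparisons, and subtracting the log prior odds of the training set from the resulting log-odds gives an approximate log-likelihood ratio. This is a minimal sketch under assumed toy data and plain gradient descent; the function names, hyperparameters, and scores are hypothetical and do not reproduce the training procedure of the paper.

```python
import math


def train_fusion(scores, labels, lr=0.5, epochs=2000):
    """Fit w[0] + sum(w[i + 1] * s[i]) by gradient descent on the logistic loss."""
    n_feat = len(scores[0])
    w = [0.0] * (n_feat + 1)
    for _ in range(epochs):
        grad = [0.0] * (n_feat + 1)
        for s, y in zip(scores, labels):
            logit = w[0] + sum(wi * si for wi, si in zip(w[1:], s))
            p = 1.0 / (1.0 + math.exp(-logit))  # predicted P(genuine)
            err = p - y
            grad[0] += err
            for i, si in enumerate(s):
                grad[i + 1] += err * si
        # Average the gradient over the training set and take one step.
        w = [wi - lr * g / len(scores) for wi, g in zip(w, grad)]
    return w


def fused_llr(w, s, prior_log_odds):
    """Fused log-odds minus the training prior log-odds: an approximate LLR."""
    return w[0] + sum(wi * si for wi, si in zip(w[1:], s)) - prior_log_odds


# Toy scores from two hypothetical comparators; label 1 = genuine, 0 = impostor.
train_scores = [(0.9, 0.7), (0.8, 0.9), (0.7, 0.6), (0.6, 0.8),
                (0.2, 0.3), (0.3, 0.1), (0.4, 0.4), (0.1, 0.2)]
train_labels = [1, 1, 1, 1, 0, 0, 0, 0]
prior = math.log(sum(train_labels) / (len(train_labels) - sum(train_labels)))

weights = train_fusion(train_scores, train_labels)
print(fused_llr(weights, (0.9, 0.9), prior), fused_llr(weights, (0.1, 0.1), prior))
```

Because the output is on a log-likelihood-ratio scale, a Bayes decision with equal priors and equal error costs reduces to accepting when the fused value is above 0.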

1.7CVOct 23, 2018

Improving Automated Latent Fingerprint Identification using Extended Minutia Types

Ram P. Krish, Julian Fierrez, Daniel Ramos et al.

Latent fingerprints are usually processed with Automated Fingerprint Identification Systems (AFIS) by law enforcement agencies to narrow down possible suspects from a criminal database. AFIS do not commonly use all discriminatory features available in fingerprints but typically use only some types of features automatically extracted by a feature extraction algorithm. In this work, we explore ways to improve rank identification accuracies of AFIS when only a partial latent fingerprint is available. Towards solving this challenge, we propose a method that exploits extended fingerprint features (unusual/rare minutiae) not commonly considered in AFIS. This new method can be combined with any existing minutiae-based matcher. We first compute a similarity score based on least squares between latent and tenprint minutiae points, with rare minutiae features as reference points. Then the similarity score of the reference minutiae-based matcher at hand is modified based on a fitting error from the least square similarity stage. We use a realistic forensic fingerprint casework database in our experiments which contains rare minutiae features obtained from Guardia Civil, the Spanish law enforcement agency. Experiments are conducted using three minutiae-based matchers as a reference, namely: NIST-Bozorth3, VeriFinger-SDK and MCC-SDK. We report significant improvements in the rank identification accuracies when these minutiae matchers are augmented with our proposed algorithm based on rare minutiae features.

1.7CVOct 23, 2018

Expression Recognition Using the Periocular Region: A Feasibility Study

Fernando Alonso-Fernandez, Josef Bigun, Cristofer Englund

This paper investigates the feasibility of using the periocular region for expression recognition. Most works have tried to solve this by analyzing the whole face. Periocular is the facial region in the immediate vicinity of the eye. It has the advantage of being available over a wide range of distances and under partial face occlusion, thus making it suitable for unconstrained or uncooperative scenarios. We evaluate five different image descriptors on a dataset of 1,574 images from 118 subjects. The experimental results show an average/overall accuracy of 67.0%/78.0% by fusion of several descriptors. While this accuracy is still behind that attained with full-face methods, it is worth noting that our initial approach employs only one frame to predict the expression, in contrast to the state of the art, which exploits several orders of magnitude more data, including spatio-temporal data that is often not available.

14.1CVOct 8, 2018

A Survey on Periocular Biometrics Research

Fernando Alonso-Fernandez, Josef Bigun

Periocular refers to the facial region in the vicinity of the eye, including eyelids, lashes and eyebrows. While face and irises have been extensively studied, the periocular region has emerged as a promising trait for unconstrained biometrics, following demands for increased robustness of face or iris systems. With a surprisingly high discrimination ability, this region can be easily obtained with existing setups for face and iris, and the requirement of user cooperation can be relaxed, thus facilitating the interaction with biometric systems. It is also available over a wide range of distances even when the iris texture cannot be reliably obtained (low resolution) or under partial face occlusion (close distances). Here, we review the state of the art in periocular biometrics research. A number of aspects are described, including: i) existing databases, ii) algorithms for periocular detection and/or segmentation, iii) features employed for recognition, iv) identification of the most discriminative regions of the periocular area, v) comparison with iris and face modalities, vi) soft-biometrics (gender/ethnicity classification), and vii) impact of gender transformation and plastic surgery on the recognition accuracy. This work is expected to provide an insight of the most relevant issues in periocular biometrics, giving a comprehensive coverage of the existing literature and current state of the art.

8.7CVSep 17, 2018

Periocular Recognition Using CNN Features Off-the-Shelf

Kevin Hernandez-Diaz, Fernando Alonso-Fernandez, Josef Bigun

Periocular refers to the region around the eye, including sclera, eyelids, lashes, brows and skin. With a surprisingly high discrimination ability, it is the ocular modality requiring the least constrained acquisition. Here, we apply existing pre-trained architectures, proposed in the context of the ImageNet Large Scale Visual Recognition Challenge, to the task of periocular recognition. These have proven to be very successful for many other computer vision tasks apart from the detection and classification tasks for which they were designed. Experiments are done with a database of periocular images captured with a digital camera. We demonstrate that these off-the-shelf CNN features can effectively recognize individuals based on periocular images, despite being trained to classify generic objects. Compared against reference periocular features, they show an EER reduction of up to ~40%, with the fusion of CNN and traditional features providing additional improvements.