HC May 7
Enhancing Eye Movement Biometrics for User Authentication via Continuous Gaze Offset Score Fusion
Hashim Aziz, Mehedi Hasan Raju, Oleg V. Komogortsev
Eye movement biometrics (EMB) use subject-specific gaze dynamics for user authentication and identification. Recent deep learning-based EMB systems achieve strong performance by modeling temporal eye movement behavior. However, these systems typically overlook continuous gaze offset, despite prior evidence that it contains user-discriminative information. This work examines whether continuous gaze offset can improve biometric performance when combined with existing biometric features. We evaluate linear and nonlinear fusion methods on two publicly available datasets, collected with a lab-grade eye tracker and a virtual reality headset, across multiple tasks and observation durations. Results indicate that fusion offers performance benefits on both datasets, particularly when using nonlinear fusion. Additionally, fusing biometric information across multiple tasks further improves authentication performance. These findings support the hypothesis that continuous gaze offset may serve as useful auxiliary information under conditions of degraded or noisy eye tracking.
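As an illustration of what linear score-level fusion can look like, the sketch below combines two per-comparison similarity scores with a weighted sum after min-max normalization. This is a generic textbook form, not the method from the paper: the function names, the normalization step, and the `weight` parameter are all assumptions made for the example.

```python
def min_max_normalize(scores):
    """Rescale a list of scores to [0, 1]; constant lists map to all zeros."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def weighted_sum_fusion(emb_scores, offset_scores, weight=0.5):
    """Fuse two score lists (hypothetical names) with a weighted sum.

    emb_scores:    similarity scores from an eye movement biometric model
    offset_scores: similarity scores derived from gaze offset
    weight:        share given to emb_scores (assumed, would be tuned on held-out data)
    """
    if len(emb_scores) != len(offset_scores):
        raise ValueError("score lists must have the same length")
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    a = min_max_normalize(emb_scores)
    b = min_max_normalize(offset_scores)
    return [weight * x + (1.0 - weight) * y for x, y in zip(a, b)]


# Toy scores for three comparisons (made-up values, for illustration only).
fused = weighted_sum_fusion([0.0, 1.0, 2.0], [10.0, 30.0, 20.0], weight=0.5)
print(fused)
```

A nonlinear alternative would replace the weighted sum with a learned classifier over the two scores; the abstract does not say which nonlinear method was used.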
CV May 22, 2025
Ocular Authentication: Fusion of Gaze and Periocular Modalities
Dillon Lohr, Michael J. Proulx, Mehedi Hasan Raju et al.
This paper investigates the feasibility of fusing two eye-centric authentication modalities (eye movements and periocular images) within a calibration-free authentication system. While each modality has independently shown promise for user authentication, their combination within a unified gaze-estimation pipeline has not been thoroughly explored at scale. In this report, we propose a multimodal authentication system and evaluate it using a large-scale in-house dataset comprising 9202 subjects with an eye tracking (ET) signal quality equivalent to a consumer-facing virtual reality (VR) device. Our results show that the multimodal approach consistently outperforms both unimodal systems across all scenarios, surpassing the FIDO benchmark. The integration of a state-of-the-art machine learning architecture contributed significantly to the overall authentication performance at scale, driven by the model's ability to capture authentication representations and the complementary discriminative characteristics of the fused modalities.
HC Nov 14, 2021
An Assessment of the Eye Tracking Signal Quality Captured in the HoloLens 2
Samantha D. Aziz, Oleg V. Komogortsev
We present an analysis of the eye tracking signal quality of the HoloLens 2's integrated eye tracker. Signal quality was measured from eye movement data captured during a random saccades task from a new eye movement dataset collected on 30 healthy adults. We characterize the eye tracking signal quality of the device in terms of spatial accuracy, spatial precision, temporal precision, linearity, and crosstalk. Most notably, our evaluation of spatial accuracy reveals that the eye movement data in our dataset appears to be uncalibrated. Recalibrating the data using a subset of our dataset task produces notably better eye tracking signal quality.
CV May 24, 2020
Benefits of temporal information for appearance-based gaze estimation
Cristina Palmero, Oleg V. Komogortsev, Sachin S. Talathi
State-of-the-art appearance-based gaze estimation methods, usually based on deep learning techniques, mainly rely on static features. However, the temporal trace of eye gaze contains useful information for estimating a given gaze point. For example, approaches leveraging sequential eye gaze information are showing promising results when applied to remote or low-resolution image scenarios with off-the-shelf cameras. The magnitude of the contribution from the temporal gaze trace is still unclear for higher resolution/frame rate imaging systems, in which more detailed information about an eye is captured. In this paper, we investigate whether temporal sequences of eye images, captured using a high-resolution, high-frame rate head-mounted virtual reality system, can be leveraged to enhance the accuracy of an end-to-end appearance-based deep-learning model for gaze estimation. Performance is compared against a static-only version of the model. Results demonstrate statistically significant benefits of temporal information, particularly for the vertical component of gaze.
CV May 8, 2020
OpenEDS2020: Open Eyes Dataset
Cristina Palmero, Abhishek Sharma, Karsten Behrendt et al.
We present the second edition of the OpenEDS dataset, OpenEDS2020, a novel dataset of eye-image sequences captured at a frame rate of 100 Hz under controlled illumination, using a virtual-reality head-mounted display mounted with two synchronized eye-facing cameras. The dataset, which is anonymized to remove any personally identifiable information on participants, consists of 80 participants of varied appearance performing several gaze-elicited tasks, and is divided into two subsets: 1) Gaze Prediction Dataset, with up to 66,560 sequences containing 550,400 eye-images and respective gaze vectors, created to foster research in spatio-temporal gaze estimation and prediction approaches; and 2) Eye Segmentation Dataset, consisting of 200 sequences sampled at 5 Hz, with up to 29,500 images, of which 5% contain a semantic segmentation label, devised to encourage the use of temporal information to propagate labels to contiguous frames. Baseline experiments have been evaluated on OpenEDS2020, one for each task, with average angular error of 5.37 degrees when performing gaze prediction on 1 to 5 frames into the future, and a mean intersection over union score of 84.1% for semantic segmentation. Like its predecessor, the OpenEDS dataset, we anticipate that this new dataset will continue creating opportunities for researchers in the eye tracking, machine learning, and computer vision communities to advance the state of the art for virtual reality applications. The dataset is available for download upon request at http://research.fb.com/programs/openeds-2020-challenge/.
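For readers unfamiliar with the angular error metric quoted above, the sketch below computes the angle in degrees between a predicted and a ground-truth 3D gaze vector. It is a minimal illustration only: the function name is hypothetical, and the abstract does not describe how the baseline averaged errors across frames or sequences.

```python
import math


def angular_error_deg(predicted, target):
    """Angle in degrees between two 3D gaze direction vectors."""
    dot = sum(p * t for p, t in zip(predicted, target))
    norm_p = math.sqrt(sum(p * p for p in predicted))
    norm_t = math.sqrt(sum(t * t for t in target))
    if norm_p == 0.0 or norm_t == 0.0:
        raise ValueError("gaze vectors must be non-zero")
    # Clamp to [-1, 1] so rounding error cannot push acos out of its domain.
    cosine = max(-1.0, min(1.0, dot / (norm_p * norm_t)))
    return math.degrees(math.acos(cosine))


# A prediction tilted halfway between the z and y axes is about 45 degrees off.
print(angular_error_deg((0.0, 0.0, 1.0), (0.0, 1.0, 1.0)))
```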
CR Jan 24, 2020
Why Temporal Persistence of Biometric Features is so Valuable for Classification Performance
Lee Friedman, Hal Stern, Larry R. Price et al.
It is generally accepted that relatively more permanent (i.e., more temporally persistent) traits are more valuable for biometric performance than less permanent traits. Although this finding is intuitive, there is no current work identifying exactly where in the biometric analysis temporal persistence makes a difference. In this paper, we answer this question. In a recent report, we introduced the intraclass correlation coefficient (ICC) as an index of temporal persistence for such features. In that report, we also showed that choosing only the most temporally persistent features yielded superior performance in 12 of 14 datasets. Motivated by those empirical results, we present a novel approach using synthetic features to study which aspects of a biometric identification study are influenced by the temporal persistence of features. What we show is that using more temporally persistent features produces effects on the similarity score distributions that explain why this quality is so key to biometric performance. The results identified with the synthetic data are largely reinforced by an analysis of two datasets, one based on eye-movements and one based on gait. There was one difference between the synthetic and real data: In real data, features are intercorrelated, with the level of intercorrelation increasing with increasing ICC. This increased intercorrelation in real data was associated with an increase in the spread of the impostor similarity score distributions. Removing these intercorrelations for real datasets with a decorrelation step produced results which were very similar to those obtained with synthetic features.
HC Dec 4, 2019
Evaluating the Data Quality of Eye Tracking Signals from a Virtual Reality System: Case Study using SMI's Eye-Tracking HTC Vive
Dillon J. Lohr, Lee Friedman, Oleg V. Komogortsev
We evaluated the data quality of SMI's tethered eye-tracking head-mounted display based on the HTC Vive (ET-HMD) during a random saccade task. We measured spatial accuracy, spatial precision, temporal precision, linearity, and crosstalk. We proposed the use of a non-parametric spatial precision measure based on the median absolute deviation (MAD). Our linearity analysis considered both the slope and adjusted R-squared of a best-fitting line. We were the first to test for a quadratic component to crosstalk. We prepended a calibration task to the random saccade task and evaluated 2 methods to employ this user-supplied calibration. For this, we used a unique binning approach to choose samples to be included in the recalibration analyses. We compared our quality measures between the ET-HMD and our EyeLink 1000 (SR-Research, Ottawa, Ontario, CA). We found that the ET-HMD had significantly better spatial accuracy and linearity fit than our EyeLink, but both devices had similar spatial precision and linearity slope. We also found that, while the EyeLink had no significant crosstalk, the ET-HMD generally exhibited quadratic crosstalk. Fourier analysis revealed that the binocular signal was a low-pass filtered version of the monocular signal. Such filtering resulted in the binocular signal being useless for the study of high-frequency components such as saccade dynamics.
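The median absolute deviation mentioned above can be sketched as follows for one gaze channel during a single fixation. This is an assumed minimal form: the function name is hypothetical, and the abstract does not state whether the authors scale the MAD or how they select the samples it is computed over.

```python
from statistics import median


def mad_precision(positions):
    """Median absolute deviation of gaze positions (e.g. degrees of visual angle).

    Unlike a standard deviation, a single large artifact barely moves the result.
    """
    if not positions:
        raise ValueError("need at least one sample")
    center = median(positions)
    return median([abs(p - center) for p in positions])


# Made-up horizontal gaze samples with one outlier at 14.0.
samples = [10.0, 10.1, 9.9, 10.2, 14.0]
print(mad_precision(samples))
```

In this toy input the outlier leaves the MAD near 0.1, whereas a standard deviation over the same samples would be dominated by it.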
CV Sep 4, 2019
Assessment of Shift-Invariant CNN Gaze Mappings for PS-OG Eye Movement Sensors
Henry K. Griffith, Dmytro Katrychuk, Oleg V. Komogortsev
Photosensor oculography (PS-OG) eye movement sensors offer desirable performance characteristics for integration within wireless head mounted devices (HMDs), including low power consumption and high sampling rates. To address the known performance degradation of these sensors due to HMD shifts, various machine learning techniques have been proposed for mapping sensor outputs to gaze location. This paper advances the understanding of a recently introduced convolutional neural network designed to provide shift invariant gaze mapping within a specified range of sensor translations. Performance is assessed for shift training examples which better reflect the distribution of values that would be generated through manual repositioning of the HMD during a dedicated collection of training data. The network is shown to exhibit comparable accuracy for this realistic shift distribution versus a previously considered rectangular grid, thereby enhancing the feasibility of in-field set-up. In addition, this work further demonstrates the practical viability of the proposed initialization process by demonstrating robust mapping performance versus training data scale. The ability to maintain reasonable accuracy for shifts extending beyond those introduced during training is also demonstrated.
CR Jun 14, 2019
Biometric Performance as a Function of Gallery Size
Lee Friedman, Hal S Stern, Vladyslav Prokopenko et al.
Many developers of biometric systems start with modest samples before general deployment. They are interested in how their systems will work with much larger samples. We evaluated the effect of gallery size on biometric performance. Identification rates describe the performance of biometric identification, whereas ROC-based measures describe the performance of biometric authentication (verification). Therefore, we examined how increases in gallery size affected identification rates (i.e., Rank-1 Identification Rate, or Rank-1 IR) and ROC-based measures such as equal error rate (EER). We studied these phenomena with synthetic data as well as real data from a face recognition study. It is well known that the Rank-1 IR declines with increasing gallery size. We have provided further insight into this decline. We have shown that this relationship is linear in log(Gallery Size). We have also shown that this decline can be counteracted with the inclusion of additional information (features) for larger gallery sizes. We have also described the curves which can be used to predict how much additional information is required to stabilize the Rank-1 IR as a function of gallery size. These equations are also linear in log(gallery size). We have also shown that the entire ROC curve is not systematically affected by gallery size, and so ROC-based scalar performance metrics such as EER are also stable across gallery size.
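To make the two metric families in this abstract concrete, the sketch below computes a Rank-1 identification rate from a probe-by-gallery similarity matrix and an approximate EER from genuine and impostor score lists. Both functions are simplified illustrations with hypothetical names; in particular, the EER here is taken at the observed threshold where the false accept and false reject rates are closest, without interpolation.

```python
def rank1_identification_rate(similarity, probe_ids, gallery_ids):
    """Fraction of probes whose most similar gallery entry has the same identity."""
    hits = 0
    for row, probe_id in zip(similarity, probe_ids):
        best = max(range(len(gallery_ids)), key=lambda j: row[j])
        hits += gallery_ids[best] == probe_id
    return hits / len(probe_ids)


def equal_error_rate(genuine, impostor):
    """Approximate EER: error rate where false accepts and false rejects balance."""
    best_gap, best_eer = None, None
    for threshold in sorted(set(genuine) | set(impostor)):
        far = sum(s >= threshold for s in impostor) / len(impostor)  # false accept rate
        frr = sum(s < threshold for s in genuine) / len(genuine)     # false reject rate
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap, best_eer = gap, (far + frr) / 2.0
    return best_eer


# Toy data (made-up values): three probes against a two-person gallery.
sim = [[0.9, 0.2], [0.3, 0.8], [0.6, 0.4]]
print(rank1_identification_rate(sim, ["a", "b", "b"], ["a", "b"]))
print(equal_error_rate([0.9, 0.8, 0.7, 0.4], [0.6, 0.3, 0.2, 0.1]))
```

The Rank-1 rate depends on how many gallery identities compete for the top match, while the EER depends only on the two score distributions, which is consistent with the abstract's observation that the former declines with gallery size and the latter stays stable.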
CR Jun 14, 2019
The Linear Relationship between Temporal Persistence, Number of Independent Features and Target EER
Lee Friedman, Hal S. Stern, Oleg V. Komogortsev
If you have a target level of biometric performance (e.g. EER = 5% or 0.1%), how many units of unique information (uncorrelated features) are needed to achieve that target? We show, for normally distributed features, that the answer to that question depends on the temporal persistence of the feature set. We address these questions with synthetic features introduced in a prior report. We measure temporal persistence with an intraclass correlation coefficient (ICC). For 5 separate EER targets (5.0%, 2.0%, 1.0%, 0.5% and 0.1%) we provide linear relationships between the temporal persistence of the feature set and the log10(number of features). These linear relationships will help those in the planning stage, prior to setting up a new biometric system, determine the required temporal persistence and number of independent features needed to achieve certain EER targets.
CV Sep 8, 2017
Method to Detect Eye Position Noise from Video-Oculography when Detection of Pupil or Corneal Reflection Position Fails
Evgeny Abdulin, Lee Friedman, Oleg V. Komogortsev
We present software to detect noise in eye position signals from video-based eye-tracking systems that depend on accurate pupil and corneal reflection position estimation. When such systems transiently fail to properly detect the pupil or the corneal reflection due to occlusion from eyelids, eyelashes or various shadows, the estimated gaze position is false. This produces an artifactual signal in the position trace that is rapidly, irregularly oscillating between true and false gaze positions. We refer to this noise as RIONEPS (Rapid Irregularly Oscillating Noise of the Eye Position Signal). Our method for detecting these periods automatically is based on an estimate of the relative inefficiency of the eye position signal. We look for RIONEPS in the horizontal and vertical traces separately, and although we typically use it offline, it is suitable for adaptation to real-time use. This method requires a threshold to be set, and although we provide some guidance, thresholds will have to be estimated empirically.
CV Jul 17, 2017
Photosensor Oculography: Survey and Parametric Analysis of Designs using Model-Based Simulation
Ioannis Rigas, Hayes Raffle, Oleg V. Komogortsev
This paper presents a renewed overview of photosensor oculography (PSOG), an eye-tracking technique based on the principle of using simple photosensors to measure the amount of reflected (usually infrared) light when the eye rotates. Photosensor oculography can provide measurements with high precision, low latency and reduced power consumption, and thus it appears as an attractive option for performing eye-tracking in the emerging head-mounted interaction devices, e.g. augmented and virtual reality (AR/VR) headsets. In our current work we employ an adjustable simulation framework as a common basis for performing an exploratory study of the eye-tracking behavior of different photosensor oculography designs. With the performed experiments we explore the effects from the variation of some basic parameters of the designs on the resulting accuracy and cross-talk, which are crucial characteristics for the seamless operation of human-computer interaction applications based on eye-tracking. Our experimental results reveal the design trade-offs that need to be adopted to tackle the competing conditions that lead to optimum performance of different eye-tracking characteristics. We also present the transformations that arise in the eye-tracking output when sensor shifts occur, and assess the resulting degradation in accuracy for different combinations of eye movements and sensor shifts.
CV Jul 17, 2017
Hybrid PS-V Technique: A Novel Sensor Fusion Approach for Fast Mobile Eye-Tracking with Sensor-Shift Aware Correction
Ioannis Rigas, Hayes Raffle, Oleg V. Komogortsev
This paper introduces and evaluates a hybrid technique that efficiently fuses the eye-tracking principles of photosensor oculography (PSOG) and video oculography (VOG). The main concept of this novel approach is to use a few fast and power-economic photosensors as the core mechanism for performing high speed eye-tracking while, in parallel, using a video sensor operating at a low sampling rate (snapshot mode) to perform dead-reckoning error correction when sensor movements occur. In order to evaluate the proposed method, we simulate the functional components of the technique and present our results in experimental scenarios involving various combinations of horizontal and vertical eye and sensor movements. Our evaluation shows that the developed technique can be used to provide robustness to sensor shifts that otherwise could induce errors larger than 5 deg. Our analysis suggests that the technique can potentially enable high speed eye-tracking at low power profiles, making it suitable to be used in emerging head-mounted devices, e.g. AR/VR headsets.
QM Sep 13, 2016
Method to Assess the Temporal Persistence of Potential Biometric Features: Application to Oculomotor, and Gait-Related Databases
Lee Friedman, Ioannis Rigas, Mark S. Nixon et al.
Although temporal persistence, or permanence, is a well understood requirement for optimal biometric features, there is no general agreement on how to assess temporal persistence. We suggest that the best way to assess temporal persistence is to perform a test-retest study, and assess test-retest reliability. For ratio-scale features that are normally distributed, this is best done using the Intraclass Correlation Coefficient (ICC). For 10 distinct data sets (8 eye-movement related, and 2 gait related), we calculated the test-retest reliability ('Temporal persistence') of each feature, and compared biometric performance of high-ICC features to lower ICC features, and to the set of all features. We demonstrate that using a subset of only high-ICC features produced superior Rank-1-Identification Rate (Rank-1-IR) performance in 9 of 10 databases (p = 0.01, one-tailed). For Equal Error Rate (EER), using a subset of only high-ICC features produced superior performance in 8 of 10 databases (p = 0.055, one-tailed). In general, then, prescreening potential biometric features, and choosing only highly reliable features will yield better performance than lower ICC features or than the set of all features combined. We hypothesize that this would likely be the case for any biometric modality where the features can be expressed as quantitative values on an interval or ratio scale, assuming an adequate number of relatively independent features.
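The test-retest idea can be illustrated with a small ICC calculation for one feature measured on several subjects across repeated sessions. The abstract does not say which ICC variant the authors used, so the one-way random-effects form, ICC(1,1), is an assumption here, as are the function name and the toy data.

```python
def icc_one_way(measurements):
    """One-way random-effects ICC(1,1) for a single feature.

    measurements: one row per subject, one column per session.
    """
    n = len(measurements)
    if n < 2:
        raise ValueError("need at least two subjects")
    k = len(measurements[0])
    if k < 2 or any(len(row) != k for row in measurements):
        raise ValueError("every subject needs the same number (>= 2) of sessions")

    grand_mean = sum(sum(row) for row in measurements) / (n * k)
    subject_means = [sum(row) / k for row in measurements]

    # Between-subject and within-subject mean squares from a one-way ANOVA.
    ms_between = k * sum((m - grand_mean) ** 2 for m in subject_means) / (n - 1)
    ms_within = sum(
        (x - m) ** 2 for row, m in zip(measurements, subject_means) for x in row
    ) / (n * (k - 1))

    denominator = ms_between + (k - 1) * ms_within
    if denominator == 0.0:
        raise ValueError("feature has no variance")
    return (ms_between - ms_within) / denominator


# A feature that repeats exactly across two sessions is perfectly persistent.
print(icc_one_way([[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]]))
```

Prescreening in the spirit of the abstract would compute this value per feature and keep only features above a chosen cutoff; the cutoff itself is not given in the abstract.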