Zhe Jin

CV
h-index13
25papers
405citations
Novelty50%
AI Score56

25 Papers

IVMar 8, 2022
Abandoning the Bayer-Filter to See in the Dark

Xingbo Dong, Wanyan Xu, Zhihui Miao et al.

Low-light image enhancement - a pervasive but challenging problem, plays a central role in enhancing the visibility of an image captured in a poor illumination environment. Due to the fact that not all photons can pass the Bayer-Filter on the sensor of the color camera, in this work, we first present a De-Bayer-Filter simulator based on deep neural networks to generate a monochrome raw image from the colored raw image. Next, a fully convolutional network is proposed to achieve the low-light image enhancement by fusing colored raw data with synthesized monochrome raw data. Channel-wise attention is also introduced to the fusion process to establish a complementary interaction between features from colored and monochrome raw images. To train the convolutional networks, we propose a dataset with monochrome and color raw pairs named Mono-Colored Raw paired dataset (MCR) collected by using a monochrome camera without Bayer-Filter and a color camera with Bayer-Filter. The proposed pipeline take advantages of the fusion of the virtual monochrome and the color raw images and our extensive experiments indicate that significant improvement can be achieved by leveraging raw sensor data and data-driven learning.

CVJun 9, 2022
Reconstruct Face from Features Using GAN Generator as a Distribution Constraint

Xingbo Dong, Zhihui Miao, Lan Ma et al.

Face recognition based on the deep convolutional neural networks (CNN) shows superior accuracy performance attributed to the high discriminative features extracted. Yet, the security and privacy of the extracted features from deep learning models (deep features) have been often overlooked. This paper proposes the reconstruction of face images from deep features without accessing the CNN network configurations as a constrained optimization problem. Such optimization minimizes the distance between the features extracted from the original face image and the reconstructed face image. Instead of directly solving the optimization problem in the image space, we innovatively reformulate the problem by looking for a latent vector of a GAN generator, then use it to generate the face image. The GAN generator serves as a dual role in this novel framework, i.e., face distribution constraint of the optimization goal and a face generator. On top of the novel optimization task, we also propose an attack pipeline to impersonate the target user based on the generated face image. Our results show that the generated face images can achieve a state-of-the-art successful attack rate of 98.0\% on LFW under type-I attack @ FAR of 0.1\%. Our work sheds light on the biometric deployment to meet the privacy-preserving and security policies.

CVNov 11, 2025Code
Auto-US: An Ultrasound Video Diagnosis Agent Using Video Classification Framework and LLMs

Yuezhe Yang, Yiyue Guo, Wenjie Cai et al.

AI-assisted ultrasound video diagnosis presents new opportunities to enhance the efficiency and accuracy of medical imaging analysis. However, existing research remains limited in terms of dataset diversity, diagnostic performance, and clinical applicability. In this study, we propose \textbf{Auto-US}, an intelligent diagnosis agent that integrates ultrasound video data with clinical diagnostic text. To support this, we constructed \textbf{CUV Dataset} of 495 ultrasound videos spanning five categories and three organs, aggregated from multiple open-access sources. We developed \textbf{CTU-Net}, which achieves state-of-the-art performance in ultrasound video classification, reaching an accuracy of 86.73\% Furthermore, by incorporating large language models, Auto-US is capable of generating clinically meaningful diagnostic suggestions. The final diagnostic scores for each case exceeded 3 out of 5 and were validated by professional clinicians. These results demonstrate the effectiveness and clinical potential of Auto-US in real-world ultrasound applications. Code and data are available at: https://github.com/Bean-Young/Auto-US.

CVMar 2, 2022
A Generalized Approach for Cancellable Template and Its Realization for Minutia Cylinder-Code

Xingbo Dong, Zhe Jin, KokSheik Wong

Hashing technology gains much attention in protecting the biometric template lately. For instance, Index-of-Max (IoM), a recent reported hashing technique, is a ranking-based locality sensitive hashing technique, which illustrates the feasibility to protect the ordered and fixed-length biometric template. However, biometric templates are not always in the form of ordered and fixed-length, rather it may be an unordered and variable size point set e.g. fingerprint minutiae, which restricts the usage of the traditional hashing technology. In this paper, we proposed a generalized version of IoM hashing namely gIoM, and therefore the unordered and variable size biometric template can be used. We demonstrate a realization using a well-known variable size feature vector, fingerprint Minutia Cylinder-Code (MCC). The gIoM transforms MCC into index domain to form indexing-based feature representation. Consequently, the inversion of MCC from the transformed representation is computational infeasible, thus to achieve non-invertibility while the performance is preserved. Public fingerprint databases FVC2002 and FVC2004 are employed for experiment as benchmark to demonstrate a fair comparison with other methods. Moreover, the security and privacy analysis suggest that gIoM meets the criteria of template protection: non-invertibility, revocability, and non-linkability.

CVNov 11, 2025Code
UltraGS: Gaussian Splatting for Ultrasound Novel View Synthesis

Yuezhe Yang, Wenjie Cai, Dexin Yang et al.

Ultrasound imaging is a cornerstone of non-invasive clinical diagnostics, yet its limited field of view complicates novel view synthesis. We propose \textbf{UltraGS}, a Gaussian Splatting framework optimized for ultrasound imaging. First, we introduce a depth-aware Gaussian splatting strategy, where each Gaussian is assigned a learnable field of view, enabling accurate depth prediction and precise structural representation. Second, we design SH-DARS, a lightweight rendering function combining low-order spherical harmonics with ultrasound-specific wave physics, including depth attenuation, reflection, and scattering, to model tissue intensity accurately. Third, we contribute the Clinical Ultrasound Examination Dataset, a benchmark capturing diverse anatomical scans under real-world clinical protocols. Extensive experiments on three datasets demonstrate UltraGS's superiority, achieving state-of-the-art results in PSNR (up to 29.55), SSIM (up to 0.89), and MSE (as low as 0.002) while enabling real-time synthesis at 64.69 fps. The code and dataset are open-sourced at: https://github.com/Bean-Young/UltraGS.

LGSep 27, 2023
On the Computational Entanglement of Distant Features in Adversarial Machine Learning

YenLung Lai, Xingbo Dong, Zhe Jin

In this research, we introduce the concept of "computational entanglement," a phenomenon observed in overparameterized feedforward linear networks that enables the network to achieve zero loss by fitting random noise, even on previously unseen test samples. Analyzing this behavior through spacetime diagrams reveals its connection to length contraction, where both training and test samples converge toward a shared normalized point within a flat Riemannian manifold. Moreover, we present a novel application of computational entanglement in transforming a worst-case adversarial examples-inputs that are highly non-robust and uninterpretable to human observers-into outputs that are both recognizable and robust. This provides new insights into the behavior of non-robust features in adversarial example generation, underscoring the critical role of computational entanglement in enhancing model robustness and advancing our understanding of neural networks in adversarial contexts.

CVAug 15, 2024
A Multi-task Adversarial Attack Against Face Authentication

Hanrui Wang, Shuo Wang, Cunjian Chen et al.

Deep-learning-based identity management systems, such as face authentication systems, are vulnerable to adversarial attacks. However, existing attacks are typically designed for single-task purposes, which means they are tailored to exploit vulnerabilities unique to the individual target rather than being adaptable for multiple users or systems. This limitation makes them unsuitable for certain attack scenarios, such as morphing, universal, transferable, and counter attacks. In this paper, we propose a multi-task adversarial attack algorithm called MTADV that are adaptable for multiple users or systems. By interpreting these scenarios as multi-task attacks, MTADV is applicable to both single- and multi-task attacks, and feasible in the white- and gray-box settings. Furthermore, MTADV is effective against various face datasets, including LFW, CelebA, and CelebA-HQ, and can work with different deep learning models, such as FaceNet, InsightFace, and CurricularFace. Importantly, MTADV retains its feasibility as a single-task attack targeting a single user/system. To the best of our knowledge, MTADV is the first adversarial attack method that can target all of the aforementioned scenarios in one algorithm.

CVMar 17, 2025Code
Test-Time Domain Generalization via Universe Learning: A Multi-Graph Matching Approach for Medical Image Segmentation

Xingguo Lv, Xingbo Dong, Liwen Wang et al.

Despite domain generalization (DG) has significantly addressed the performance degradation of pre-trained models caused by domain shifts, it often falls short in real-world deployment. Test-time adaptation (TTA), which adjusts a learned model using unlabeled test data, presents a promising solution. However, most existing TTA methods struggle to deliver strong performance in medical image segmentation, primarily because they overlook the crucial prior knowledge inherent to medical images. To address this challenge, we incorporate morphological information and propose a framework based on multi-graph matching. Specifically, we introduce learnable universe embeddings that integrate morphological priors during multi-source training, along with novel unsupervised test-time paradigms for domain adaptation. This approach guarantees cycle-consistency in multi-matching while enabling the model to more effectively capture the invariant priors of unseen data, significantly mitigating the effects of domain shifts. Extensive experiments demonstrate that our method outperforms other state-of-the-art approaches on two medical image segmentation benchmarks for both multi-source and single-source domain generalization tasks. The source code is available at https://github.com/Yore0/TTDG-MGM.

CVApr 15, 2025Code
Explicit and Implicit Representations in AI-based 3D Reconstruction for Radiology: A Systematic Review

Yuezhe Yang, Boyu Yang, Yaqian Wang et al.

The demand for high-quality medical imaging in clinical practice and assisted diagnosis has made 3D reconstruction in radiological imaging a key research focus. Artificial intelligence (AI) has emerged as a promising approach to enhancing reconstruction accuracy while reducing acquisition and processing time, thereby minimizing patient radiation exposure and discomfort and ultimately benefiting clinical diagnosis. This review explores state-of-the-art AI-based 3D reconstruction algorithms in radiological imaging, categorizing them into explicit and implicit approaches based on their underlying principles. Explicit methods include point-based, volume-based, and Gaussian representations, while implicit methods encompass implicit prior embedding and neural radiance fields. Additionally, we examine commonly used evaluation metrics and benchmark datasets. Finally, we discuss the current state of development, key challenges, and future research directions in this evolving field. Our project available on: https://github.com/Bean-Young/AI4Radiology.

CRJun 23, 2020Code
SIAT: A Systematic Inter-Component Communication Analysis Technology for Detecting Threats on Android

Yupeng Hu, Zhe Jin, Wenjia Li et al.

In this paper, we present the design and implementation of a Systematic Inter-Component Communication Analysis Technology (SIAT) consisting of two key modules: \emph{Monitor} and \emph{Analyzer}. As an extension to the Android operating system at framework layer, the \emph{Monitor} makes the first attempt to revise the taint tag approach named TaintDroid both at method-level and file-level, to migrate it to the app-pair ICC paths identification through systemwide tracing and analysis of taint in intent both at the data flow and control flow. By taking over the taint logs offered by the \emph{Monitor}, the \emph{Analyzer} can build the accurate and integrated ICC models adopted to identify the specific threat models with the detection algorithms and predefined rules. Meanwhile, we employ the models' deflation technology to improve the efficiency of the \emph{Analyzer}. We implement the SIAT with Android Open Source Project and evaluate its performance through extensive experiments on well-known datasets and real-world apps. The experimental results show that, compared to state-of-the-art approaches, the SIAT can achieve about 25\%$\sim$200\% accuracy improvements with 1.0 precision and 0.98 recall at the cost of negligible runtime overhead. Moreover, the SIAT can identify two undisclosed cases of bypassing that prior technologies cannot detect and quite a few malicious ICC threats in real-world apps with lots of downloads on the Google Play market.

CVApr 12, 2024
IFViT: Interpretable Fixed-Length Representation for Fingerprint Matching via Vision Transformer

Yuhang Qiu, Honghui Chen, Xingbo Dong et al.

Determining dense feature points on fingerprints used in constructing deep fixed-length representations for accurate matching, particularly at the pixel level, is of significant interest. To explore the interpretability of fingerprint matching, we propose a multi-stage interpretable fingerprint matching network, namely Interpretable Fixed-length Representation for Fingerprint Matching via Vision Transformer (IFViT), which consists of two primary modules. The first module, an interpretable dense registration module, establishes a Vision Transformer (ViT)-based Siamese Network to capture long-range dependencies and the global context in fingerprint pairs. It provides interpretable dense pixel-wise correspondences of feature points for fingerprint alignment and enhances the interpretability in the subsequent matching stage. The second module takes into account both local and global representations of the aligned fingerprint pair to achieve an interpretable fixed-length representation extraction and matching. It employs the ViTs trained in the first module with the additional fully connected layer and retrains them to simultaneously produce the discriminative fixed-length representation and interpretable dense pixel-wise correspondences of feature points. Extensive experimental results on diverse publicly available fingerprint databases demonstrate that the proposed framework not only exhibits superior performance on dense registration and matching but also significantly promotes the interpretability in deep fixed-length representations-based fingerprint matching.

15.7CVApr 8
EventFace: Event-Based Face Recognition via Structure-Driven Spatiotemporal Modeling

Qingguo Meng, Xingbo Dong, Zhe Jin et al.

Event cameras offer a promising sensing modality for face recognition due to their inherent advantages in illumination robustness and privacy-friendliness. However, because event streams lack the stable photometric appearance relied upon by conventional RGB-based face recognition systems, we argue that event-based face recognition should model structure-driven spatiotemporal identity representations shaped by rigid facial motion and individual facial geometry. Since dedicated datasets for event-based face recognition remain lacking, we construct EFace, a small-scale event-based face dataset captured under rigid facial motion. To learn effectively from this limited event data, we further propose EventFace, a framework for event-based face recognition that integrates spatial structure and temporal dynamics for identity modeling. Specifically, we employ Low-Rank Adaptation (LoRA) to transfer structural facial priors from pretrained RGB face models to the event domain, thereby establishing a reliable spatial basis for identity modeling. Building on this foundation, we further introduce a Motion Prompt Encoder (MPE) to explicitly encode temporal features and a Spatiotemporal Modulator (STM) to fuse them with spatial features, thereby enhancing the representation of identity-relevant event patterns. Extensive experiments demonstrate that EventFace achieves the best performance among the evaluated baselines, with a Rank-1 identification rate of 94.19% and an equal error rate (EER) of 5.35%. Results further indicate that EventFace exhibits stronger robustness under degraded illumination than the competing methods. In addition, the learned representations exhibit reduced template reconstructability.

LGAug 20, 2025
Wormhole Dynamics in Deep Neural Networks

Yen-Lung Lai, Zhe Jin

This work investigates the generalization behavior of deep neural networks (DNNs), focusing on the phenomenon of "fooling examples," where DNNs confidently classify inputs that appear random or unstructured to humans. To explore this phenomenon, we introduce an analytical framework based on maximum likelihood estimation, without adhering to conventional numerical approaches that rely on gradient-based optimization and explicit labels. Our analysis reveals that DNNs operating in an overparameterized regime exhibit a collapse in the output feature space. While this collapse improves network generalization, adding more layers eventually leads to a state of degeneracy, where the model learns trivial solutions by mapping distinct inputs to the same output, resulting in zero loss. Further investigation demonstrates that this degeneracy can be bypassed using our newly derived "wormhole" solution. The wormhole solution, when applied to arbitrary fooling examples, reconciles meaningful labels with random ones and provides a novel perspective on shortcut learning. These findings offer deeper insights into DNN generalization and highlight directions for future research on learning dynamics in unsupervised settings to bridge the gap between theory and practice.

CVMar 9
Structure and Progress Aware Diffusion for Medical Image Segmentation

Siyuan Song, Guyue Hu, Chenglong Li et al.

Medical image segmentation is crucial for computer-aided diagnosis, which necessitates understanding both coarse morphological and semantic structures, as well as carving fine boundaries. The morphological and semantic structures in medical images are beneficial and stable clues for target understanding. While the fine boundaries of medical targets (like tumors and lesions) are usually ambiguous and noisy since lesion overlap, annotation uncertainty, and so on, making it not reliable to serve as early supervision. However, existing methods simultaneously learn coarse structures and fine boundaries throughout the training process. In this paper, we propose a structure and progress-aware diffusion (SPAD) for medical image segmentation, which consists of a semantic-concentrated diffusion (ScD) and a boundary-centralized diffusion (BcD) modulated by a progress-aware scheduler (PaS). Specifically, the semantic-concentrated diffusion introduces anchor-preserved target perturbation, which perturbs pixels within a medical target but preserves unaltered areas as semantic anchors, encouraging the model to infer noisy target areas from the surrounding semantic context. The boundary-centralized diffusion introduces progress-aware boundary noise, which blurs unreliable and ambiguous boundaries, thus compelling the model to focus on coarse but stable anatomical morphology and global semantics. Furthermore, the progress-aware scheduler gradually modulates noise intensity of the ScD and BcD forming a coarse-to-fine diffusion paradigm, which encourage focusing on coarse morphological and semantic structures during early target understanding stages and gradually shifting to fine target boundaries during later contour adjusting stages.

CVSep 2, 2025
Mix-modal Federated Learning for MRI Image Segmentation

Guyue Hu, Siyuan Song, Jingpeng Sun et al.

Magnetic resonance imaging (MRI) image segmentation is crucial in diagnosing and treating many diseases, such as brain tumors. Existing MRI image segmentation methods mainly fall into a centralized multimodal paradigm, which is inapplicable in engineering non-centralized mix-modal medical scenarios. In this situation, each distributed client (hospital) processes multiple mixed MRI modalities, and the modality set and image data for each client are diverse, suffering from extensive client-wise modality heterogeneity and data heterogeneity. In this paper, we first formulate non-centralized mix-modal MRI image segmentation as a new paradigm for federated learning (FL) that involves multiple modalities, called mix-modal federated learning (MixMFL). It distinguishes from existing multimodal federating learning (MulMFL) and cross-modal federating learning (CroMFL) paradigms. Then, we proposed a novel modality decoupling and memorizing mix-modal federated learning framework (MDM-MixMFL) for MRI image segmentation, which is characterized by a modality decoupling strategy and a modality memorizing mechanism. Specifically, the modality decoupling strategy disentangles each modality into modality-tailored and modality-shared information. During mix-modal federated updating, corresponding modality encoders undergo tailored and shared updating, respectively. It facilitates stable and adaptive federating aggregation of heterogeneous data and modalities from distributed clients. Besides, the modality memorizing mechanism stores client-shared modality prototypes dynamically refreshed from every modality-tailored encoder to compensate for incomplete modalities in each local client. It further benefits modality aggregation and fusion processes during mixmodal federated learning. Extensive experiments on two public datasets for MRI image segmentation demonstrate the effectiveness and superiority of our methods.

IVJul 1, 2025
EvRWKV: A Continuous Interactive RWKV Framework for Effective Event-Guided Low-Light Image Enhancement

Wenjie Cai, Qingguo Meng, Zhenyu Wang et al.

Event cameras offer significant potential for Low-light Image Enhancement (LLIE), yet existing fusion approaches are constrained by a fundamental dilemma: early fusion struggles with modality heterogeneity, while late fusion severs crucial feature correlations. To address these limitations, we propose EvRWKV, a novel framework that enables continuous cross-modal interaction through dual-domain processing, which mainly includes a Cross-RWKV Module to capture fine-grained temporal and cross-modal dependencies, and an Event Image Spectral Fusion Enhancer (EISFE) module to perform joint adaptive frequency-domain denoising and spatial-domain alignment. This continuous interaction maintains feature consistency from low-level textures to high-level semantics. Extensive experiments on the real-world SDE and SDSD datasets demonstrate that EvRWKV significantly outperforms only image-based methods by 1.79 dB and 1.85 dB in PSNR, respectively. To further validate the practical utility of our method for downstream applications, we evaluated its impact on semantic segmentation. Experiments demonstrate that images enhanced by EvRWKV lead to a significant 35.44% improvement in mIoU.

CVMar 15, 2025
Compose Your Aesthetics: Empowering Text-to-Image Models with the Principles of Art

Zhe Jin, Tat-Seng Chua

Text-to-Image (T2I) diffusion models (DM) have garnered widespread adoption due to their capability in generating high-fidelity outputs and accessibility to anyone able to put imagination into words. However, DMs are often predisposed to generate unappealing outputs, much like the random images on the internet they were trained on. Existing approaches to address this are founded on the implicit premise that visual aesthetics is universal, which is limiting. Aesthetics in the T2I context should be about personalization and we propose the novel task of aesthetics alignment which seeks to align user-specified aesthetics with the T2I generation output. Inspired by how artworks provide an invaluable perspective to approach aesthetics, we codify visual aesthetics using the compositional framework artists employ, known as the Principles of Art (PoA). To facilitate this study, we introduce CompArt, a large-scale compositional art dataset building on top of WikiArt with PoA analysis annotated by a capable Multimodal LLM. Leveraging the expressive power of LLMs and training a lightweight and transferrable adapter, we demonstrate that T2I DMs can effectively offer 10 compositional controls through user-specified PoA conditions. Additionally, we design an appropriate evaluation framework to assess the efficacy of our approach.

CVJan 11, 2022
Similarity-based Gray-box Adversarial Attack Against Deep Face Recognition

Hanrui Wang, Shuo Wang, Zhe Jin et al.

The majority of adversarial attack techniques perform well against deep face recognition when the full knowledge of the system is revealed (\emph{white-box}). However, such techniques act unsuccessfully in the gray-box setting where the face templates are unknown to the attackers. In this work, we propose a similarity-based gray-box adversarial attack (SGADV) technique with a newly developed objective function. SGADV utilizes the dissimilarity score to produce the optimized adversarial example, i.e., similarity-based adversarial attack. This technique applies to both white-box and gray-box attacks against authentication systems that determine genuine or imposter users using the dissimilarity score. To validate the effectiveness of SGADV, we conduct extensive experiments on face datasets of LFW, CelebA, and CelebA-HQ against deep face recognition models of FaceNet and InsightFace in both white-box and gray-box settings. The results suggest that the proposed method significantly outperforms the existing adversarial attack techniques in the gray-box setting. We hence summarize that the similarity-base approaches to develop the adversarial example could satisfactorily cater to the gray-box attack scenarios for de-authentication.

CRJun 23, 2020
Interpretable security analysis of cancellable biometrics using constrained-optimized similarity-based attack

Hanrui Wang, Xingbo Dong, Zhe Jin et al.

In cancellable biometrics (CB) schemes, template security is achieved by applying, mainly non-linear, transformations to the biometric template. The transformation is designed to preserve the template distance/similarity in the transformed domain. Despite its effectiveness, the security issues attributed to similarity preservation property of CB are underestimated. Dong et al. [BTAS'19], exploited the similarity preservation trait of CB and proposed a similarity-based attack with high successful attack rate. The similarity-based attack utilizes preimage that are generated from the protected biometric template for impersonation and perform cross matching. In this paper, we propose a constrained optimization similarity-based attack (CSA), which is improved upon Dong's genetic algorithm enabled similarity-based attack (GASA). The CSA applies algorithm-specific equality or inequality relations as constraints, to optimize preimage generation. We interpret the effectiveness of CSA from the supervised learning perspective. We identify such constraints then conduct extensive experiments to demonstrate CSA against CB with LFW face dataset. The results suggest that CSA is effective to breach IoM hashing and BioHashing security, and outperforms GASA significantly. Inferring from the above results, we further remark that, other than IoM and BioHashing, CSA is critical to other CB schemes as far as the constraints can be formulated. Furthermore, we reveal the correlation of hash code size and the attack performance of CSA.

CVJun 9, 2020
Multi-spectral Facial Landmark Detection

Jin Keong, Xingbo Dong, Zhe Jin et al.

Thermal face image analysis is favorable for certain circumstances. For example, illumination-sensitive applications, like nighttime surveillance; and privacy-preserving demanded access control. However, the inadequate study on thermal face image analysis calls for attention in responding to the industry requirements. Detecting facial landmark points are important for many face analysis tasks, such as face recognition, 3D face reconstruction, and face expression recognition. In this paper, we propose a robust neural network enabled facial landmark detection, namely Deep Multi-Spectral Learning (DMSL). Briefly, DMSL consists of two sub-models, i.e. face boundary detection, and landmark coordinates detection. Such an architecture demonstrates the capability of detecting the facial landmarks on both visible and thermal images. Particularly, the proposed DMSL model is robust in facial landmark detection where the face is partially occluded, or facing different directions. The experiment conducted on Eurecom's visible and thermal paired database shows the superior performance of DMSL over the state-of-the-art for thermal facial landmark detection. In addition to that, we have annotated a thermal face dataset with their respective facial landmark for the purpose of experimentation.

CVOct 17, 2019
On the Risk of Cancelable Biometrics

Xingbo Dong, Jaewoo Park, Zhe Jin et al.

Cancelable biometrics (CB) employs an irreversible transformation to convert the biometric features into transformed templates while preserving the relative distance between two templates for security and privacy protection. However, distance preservation invites unexpected security issues such as pre-image attacks, which are often neglected.This paper presents a generalized pre-image attack method and its extension version that operates on practical CB systems. We theoretically reveal that distance preservation property is a vulnerability source in the CB schemes. We then propose an empirical information leakage estimation algorithm to access the pre-image attack risk of the CB schemes. The experiments conducted with six CB schemes designed for the face, iris and fingerprint, demonstrate that the risks originating from the distance computed from two transformed templates significantly compromise the security of CB schemes. Our work reveals the potential risk of existing CB systems theoretically and experimentally.

CVMay 8, 2019
A Genetic Algorithm Enabled Similarity-Based Attack on Cancellable Biometrics

Xingbo Dong, Zhe Jin, Andrew Teoh Beng Jin

Cancellable biometrics (CB) as a means for biometric template protection approach refers to an irreversible yet similarity preserving transformation on the original template. With similarity preserving property, the matching between template and query instance can be performed in the transform domain without jeopardizing accuracy performance. Unfortunately, this trait invites a class of attack, namely similarity-based attack (SA). SA produces a preimage, an inverse of transformed template, which can be exploited for impersonation and cross-matching. In this paper, we propose a Genetic Algorithm enabled similarity-based attack framework (GASAF) to demonstrate that CB schemes whose possess similarity preserving property are highly vulnerable to similarity-based attack. Besides that, a set of new metrics is designed to measure the effectiveness of the similarity-based attack. We conduct the experiment on two representative CB schemes, i.e. BioHashing and Bloom-filter. The experimental results attest the vulnerability under this type of attack.

CVSep 28, 2018
A Symmetric Keyring Encryption Scheme for Biometric Cryptosystems

Yen-Lung Lai, Jung-Yeon Hwang, Zhe Jin et al.

In this paper, we propose a novel biometric cryptosystem for vectorial biometrics named symmetric keyring encryption (SKE) inspired by Rivest's keyring model (2016). Unlike conventional biometric secret-binding primitives, such as fuzzy commitment and fuzzy vault, the proposed scheme reframes the biometric secret-binding problem as a fuzzy symmetric encryption problem with a notion called resilient vector pair. In this study, the pair resembles the encryption-decryption key pair in symmetric key cryptosystems. This notion is realized using the index of maximum hashed vectors - a special instance of the ranking-based locality-sensitive hashing function. With a simple filtering mechanism and [m,k] Shamir's secret-sharing scheme, we show that SKE, both in theoretical and empirical evaluation, can retrieve the exact secret with overwhelming probability for a genuine input yet negligible probability for an imposter input. Though SKE can be applied to any vectorial biometrics, we adopt the fingerprint vector as a case of study in this work. The experiments have been performed under several subsets of FVC 2002, 2004, and 2006 datasets. We formalize and analyze the threat model of SKE that encloses several major security attacks.

CVMar 16, 2017
Ranking Based Locality Sensitive Hashing Enabled Cancelable Biometrics: Index-of-Max Hashing

Zhe Jin, Yen-Lung Lai, Jung-Yeon Hwang et al.

In this paper, we propose a ranking based locality sensitive hashing inspired two-factor cancelable biometrics, dubbed "Index-of-Max" (IoM) hashing for biometric template protection. With externally generated random parameters, IoM hashing transforms a real-valued biometric feature vector into discrete index (max ranked) hashed code. We demonstrate two realizations from IoM hashing notion, namely Gaussian Random Projection based and Uniformly Random Permutation based hashing schemes. The discrete indices representation nature of IoM hashed codes enjoy serveral merits. Firstly, IoM hashing empowers strong concealment to the biometric information. This contributes to the solid ground of non-invertibility guarantee. Secondly, IoM hashing is insensitive to the features magnitude, hence is more robust against biometric features variation. Thirdly, the magnitude-independence trait of IoM hashing makes the hash codes being scale-invariant, which is critical for matching and feature alignment. The experimental results demonstrate favorable accuracy performance on benchmark FVC2002 and FVC2004 fingerprint databases. The analyses justify its resilience to the existing and newly introduced security and privacy attacks as well as satisfy the revocability and unlinkability criteria of cancelable biometrics.

CVJul 23, 2016
Rank Correlation Measure: A Representational Transformation for Biometric Template Protection

Zhe Jin, Yen-Lung Lai, Andrew Beng Jin Teoh

Despite a variety of theoretical-sound techniques have been proposed for biometric template protection, there is rarely practical solution that guarantees non-invertibility, cancellability, non-linkability and performance simultaneously. In this paper, a ranking-based representational transformation is proposed for fingerprint templates. The proposed method transforms a real-valued feature vector into index code such that the pairwise-order measure in the resultant codes are closely correlated with rank similarity measure. Such a ranking based technique offers two major merits: 1) Resilient to noises/perturbations in numeric values; and 2) Highly nonlinear embedding based on partial order statistics. The former takes care of the accuracy performance mitigating numeric noises/perturbations while the latter offers strong non-invertible transformation via nonlinear feature embedding from Euclidean to Rank space that leads to toughness in inversion. The experimental results demonstrate reasonable accuracy performance on benchmark FVC2002 and FVC2004 fingerprint databases, thus confirm the proposition of the rank correlation. Moreover, the security and privacy analysis justify the strong capability against the existing major privacy attacks.