Guang Hua

CV
h-index: 10
8 papers
22 citations
Novelty: 64%
AI Score: 49

8 Papers

CR · Sep 5, 2024
A Key-Driven Framework for Identity-Preserving Face Anonymization

Miaomiao Wang, Guang Hua, Sheng Li et al.

Virtual faces are crucial content in the metaverse. Recently, attempts have been made to generate virtual faces for privacy protection. Nevertheless, these virtual faces either permanently remove the identifiable information or map the original identity into a virtual one, which loses the original identity forever. In this study, we make the first attempt to address the conflict between privacy and identifiability in virtual faces, and propose a key-driven face anonymization and authentication recognition (KFAAR) framework. Concretely, the KFAAR framework consists of a head posture-preserving virtual face generation (HPVFG) module and a key-controllable virtual face authentication (KVFA) module. The HPVFG module uses a user key to project the latent vector of the original face into a virtual one. Then it maps the virtual vectors to obtain an extended encoding, based on which the virtual face is generated. By simultaneously adding a head posture and facial expression correction module, the virtual face has the same head posture and facial expression as the original face. During authentication, we propose a KVFA module to directly recognize the virtual faces using the correct user key, which can obtain the original identity without exposing the original face image. We also propose a multi-task learning objective to train HPVFG and KVFA. Extensive experiments demonstrate the advantages of the proposed HPVFG and KVFA modules, which effectively achieve both facial anonymity and identifiability.
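The abstract does not say how the user key projects the latent vector, so the following is only a minimal sketch of one possible key-controlled, invertible projection: a signed permutation seeded from the key. The functions `project` and `recover`, the SHA-256 seeding, and the signed-permutation transform are all assumptions made for illustration; they are not the HPVFG module, which is a learned network.

```python
import hashlib
import random


def _signed_permutation(key: str, dim: int):
    """Derive a permutation and sign pattern from the key (hypothetical scheme)."""
    seed = int.from_bytes(hashlib.sha256(key.encode("utf-8")).digest(), "big")
    rng = random.Random(seed)
    perm = list(range(dim))
    rng.shuffle(perm)
    signs = [rng.choice((-1.0, 1.0)) for _ in range(dim)]
    return perm, signs


def project(latent, key):
    """Map an original latent vector to a 'virtual' one under the given key."""
    perm, signs = _signed_permutation(key, len(latent))
    return [signs[i] * latent[perm[i]] for i in range(len(latent))]


def recover(virtual, key):
    """Invert project(); only the same key reproduces the original vector."""
    perm, signs = _signed_permutation(key, len(virtual))
    latent = [0.0] * len(virtual)
    for i, p in enumerate(perm):
        latent[p] = signs[i] * virtual[i]
    return latent
```

A signed permutation is orthogonal, so the vector norm is preserved, and the mapping is undone exactly by the same key. Note that KVFA, as described above, recognizes the virtual face directly rather than inverting the projection; `recover` is included only to show that the toy mapping is key-controlled.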

CV · Feb 24
RecoverMark: Robust Watermarking for Localization and Recovery of Manipulated Faces

Haonan An, Xiaohui Ye, Guang Hua et al.

The proliferation of AI-generated content has facilitated sophisticated face manipulation, severely undermining visual integrity and posing unprecedented challenges to intellectual property. In response, a common proactive defense leverages fragile watermarks to detect, localize, or even recover manipulated regions. However, these methods always assume an adversary unaware of the embedded watermark, overlooking their inherent vulnerability to watermark removal attacks. Furthermore, this fragility is exacerbated in the commonly used dual-watermark strategy that adds a robust watermark for image ownership verification, where mutual interference and limited embedding capacity reduce the fragile watermark's effectiveness. To address this gap, we propose RecoverMark, a watermarking framework that achieves robust manipulation localization, content recovery, and ownership verification simultaneously. Our key insight is twofold. First, we exploit a critical real-world constraint: an adversary must preserve the background's semantic consistency to avoid visual detection, even if they apply global, imperceptible watermark removal attacks. Second, using the image's own content (face, in this paper) as the watermark enhances extraction robustness. Based on these insights, RecoverMark treats the protected face content itself as the watermark and embeds it into the surrounding background. By designing a robust two-stage training paradigm with carefully crafted distortion layers that simulate comprehensive potential attacks and a progressive training strategy, RecoverMark achieves robust, non-fragile watermark embedding for image manipulation localization, recovery, and image IP protection simultaneously. Extensive experiments demonstrate the proposed RecoverMark's robustness against both seen and unseen attacks and its generalizability to in-distribution and out-of-distribution data.

CV · Apr 8, 2024 · Code
Detecting Every Object from Events

Haitian Zhang, Chang Xu, Xinya Wang et al.

Object detection is critical in autonomous driving, and it is more practical yet challenging to localize objects of unknown categories: an endeavour known as Class-Agnostic Object Detection (CAOD). Existing studies on CAOD predominantly rely on ordinary cameras, but these frame-based sensors usually have high latency and limited dynamic range, leading to safety risks in real-world scenarios. In this study, we turn to a new modality enabled by the so-called event camera, characterized by its sub-millisecond latency and high dynamic range, for robust CAOD. We propose Detecting Every Object in Events (DEOE), an approach tailored for achieving high-speed, class-agnostic open-world object detection in event-based vision. Built upon a fast event-based backbone, the recurrent vision transformer, we jointly consider the spatial and temporal consistencies to identify potential objects. The discovered potential objects are assimilated as soft positive samples to avoid being suppressed as background. Moreover, we introduce a disentangled objectness head to separate the foreground-background classification and novel object discovery tasks, enhancing the model's generalization in localizing novel objects while maintaining a strong ability to filter out the background. Extensive experiments confirm the superiority of our proposed DEOE in comparison with three strong baseline methods that integrate the state-of-the-art event-based object detector with advancements in RGB-based CAOD. Our code is available at https://github.com/Hatins/DEOE.

CV · May 16, 2024
Box-Free Model Watermarks Are Prone to Black-Box Removal Attacks

Haonan An, Guang Hua, Zhiping Lin et al.

Box-free model watermarking is an emerging technique to safeguard the intellectual property of deep learning models, particularly those for low-level image processing tasks. Existing works have verified and improved its effectiveness in several aspects. However, in this paper, we reveal that box-free model watermarking is prone to removal attacks, even under a real-world threat model in which both the protected model and the watermark extractor are black boxes. Under this setting, we carry out three studies. 1) We develop an extractor-gradient-guided (EGG) remover and show its effectiveness when the extractor uses ReLU activation only. 2) More generally, for an unknown extractor, we leverage adversarial attacks and design the EGG remover based on the estimated gradients. 3) Under the most stringent condition that the extractor is inaccessible, we design a transferable remover based on a set of private proxy models. In all cases, the proposed removers can successfully remove embedded watermarks while preserving the quality of the processed images, and we also demonstrate that the EGG remover can even replace the watermarks. Extensive experimental results verify the effectiveness and generalizability of the proposed attacks, revealing the vulnerabilities of the existing box-free methods and calling for further research.
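To make the idea of "estimated gradients" under black-box access concrete, here is a minimal sketch of a generic query-based estimator using central finite differences. It is not the paper's EGG remover, whose estimator the abstract does not specify. The names `estimate_gradient`, `toy_score`, and `descend`, the step sizes, and the quadratic stand-in for an extractor response are all assumptions for illustration.

```python
def estimate_gradient(query, x, eps=1e-4):
    """Estimate d(query)/dx using only black-box calls to query()."""
    grad = []
    for i in range(len(x)):
        x_plus = list(x)
        x_minus = list(x)
        x_plus[i] += eps
        x_minus[i] -= eps
        grad.append((query(x_plus) - query(x_minus)) / (2.0 * eps))
    return grad


def toy_score(x):
    """Hypothetical stand-in for a scalar black-box response to minimize."""
    target = [0.2, -0.1, 0.4]
    return sum((xi - ti) ** 2 for xi, ti in zip(x, target))


def descend(query, x, steps=50, lr=0.1):
    """Plain gradient descent driven only by the estimated gradients."""
    for _ in range(steps):
        g = estimate_gradient(query, x)
        x = [xi - lr * gi for xi, gi in zip(x, g)]
    return x
```

Each estimate costs two queries per input dimension, which is why practical black-box attacks on images typically rely on cheaper randomized estimators; this sketch only shows the principle.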

CR · Jul 24, 2025
NWaaS: Nonintrusive Watermarking as a Service for X-to-Image DNN

Haonan An, Guang Hua, Yu Guo et al.

The intellectual property of deep neural network (DNN) models can be protected with DNN watermarking, which embeds copyright watermarks into model parameters (white-box), model behavior (black-box), or model outputs (box-free), and the watermarks can be subsequently extracted to verify model ownership or detect model theft. Despite recent advances, these existing methods are inherently intrusive, as they either modify the model parameters or alter the structure. This natural intrusiveness raises concerns about watermarking-induced shifts in model behavior and the additional cost of fine-tuning, further exacerbated by the rapidly growing model size. As a result, model owners are often reluctant to adopt DNN watermarking in practice, which limits the development of practical Watermarking as a Service (WaaS) systems. To address this issue, we introduce Nonintrusive Watermarking as a Service (NWaaS), a novel trustless paradigm designed for X-to-Image models, in which we hypothesize that with the model untouched, an owner-defined watermark can still be extracted from model outputs. Building on this concept, we propose ShadowMark, a concrete implementation of NWaaS which addresses critical deployment challenges by establishing a robust and nonintrusive side channel in the protected model's black-box API, leveraging a key encoder and a watermark decoder. It differs significantly from existing solutions by attaining the so-called absolute fidelity and being applicable to different DNN architectures, while also being robust against existing attacks, eliminating the fidelity-robustness trade-off. Extensive experiments on image-to-image, noise-to-image, noise-and-text-to-image, and text-to-image models demonstrate the efficacy and practicality of ShadowMark for real-world deployment of nonintrusive DNN watermarking.

CV · Feb 28, 2025
Decoder Gradient Shield: Provable and High-Fidelity Prevention of Gradient-Based Box-Free Watermark Removal

Haonan An, Guang Hua, Zhengru Fang et al.

The intellectual property of deep image-to-image models can be protected by the so-called box-free watermarking. It uses an encoder and a decoder to embed invisible copyright marks into, and extract them from, the model's output images, respectively. Prior works have improved watermark robustness, focusing on the design of better watermark encoders. In this paper, we reveal an overlooked vulnerability of the unprotected watermark decoder, which is jointly trained with the encoder and can be exploited to train a watermark removal network. To defend against such an attack, we propose the decoder gradient shield (DGS) as a protection layer in the decoder API to prevent gradient-based watermark removal with a closed-form solution. The fundamental idea is inspired by the classical adversarial attack, but is utilized for the first time as a defensive mechanism in box-free model watermarking. We then demonstrate that DGS can reorient and rescale the gradient directions of watermarked queries and stop the watermark remover's training loss from converging to the level without DGS, while retaining decoder output image quality. Experimental results verify the effectiveness of the proposed method. The code will be made available upon acceptance.

SD · Nov 6, 2020
Robust ENF Estimation Based on Harmonic Enhancement and Maximum Weight Clique

Guang Hua, Han Liao, Haijian Zhang et al.

We present a framework for robust electric network frequency (ENF) extraction from real-world audio recordings, featuring multi-tone ENF harmonic enhancement and graph-based optimal harmonic selection. Specifically, we first extend the recently developed single-tone ENF signal enhancement method to the multi-tone scenario and propose a harmonic robust filtering algorithm (HRFA). It enhances each harmonic component separately without cross-component interference, thus further alleviating the effects of unwanted noise and audio content on the much weaker ENF signal. In addition, considering the fact that some harmonic components could be severely corrupted even after enhancement, disturbing rather than facilitating ENF estimation, we propose a graph-based harmonic selection algorithm (GHSA), which finds the optimal combination of harmonic components for more accurate ENF estimation. Notably, the harmonic selection problem is equivalently formulated as a maximum weight clique (MWC) problem in graph theory, and the Bron-Kerbosch algorithm (BKA) is adopted in the GHSA. With the enhanced and optimally selected harmonic components, both the existing maximum likelihood estimator (MLE) and weighted MLE (WMLE) are incorporated to yield the final ENF estimation results. The proposed framework is extensively evaluated using both synthetic signals and our ENF-WHU dataset consisting of 130 real-world audio recordings, demonstrating substantially improved capability of extracting the ENF from realistically noisy observations over the existing single- and multi-tone competitors. This work further improves the applicability of the ENF as a forensic criterion in real-world situations.
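The maximum weight clique formulation can be illustrated with a short sketch of the basic Bron-Kerbosch recursion (without pivoting), which enumerates maximal cliques and keeps the heaviest one. How the paper defines vertex weights and edges between harmonics is not stated in the abstract, so the graph in the usage note is hypothetical. The sketch also assumes non-negative vertex weights, in which case some maximum weight clique is always a maximal clique.

```python
def max_weight_clique(weights, edges):
    """Return (clique, weight) of a maximum weight clique.

    weights: list of non-negative vertex weights, indexed by vertex id.
    edges:   iterable of (u, v) pairs of mutually compatible vertices.
    """
    n = len(weights)
    adj = {v: set() for v in range(n)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)

    best = {"weight": float("-inf"), "clique": []}

    def expand(r, p, x):
        # r: current clique, p: candidates, x: already-processed vertices.
        if not p and not x:
            w = sum(weights[v] for v in r)
            if w > best["weight"]:
                best["weight"] = w
                best["clique"] = sorted(r)
            return
        for v in list(p):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(range(n)), set())
    return best["clique"], best["weight"]
```

For example, with five hypothetical harmonic components weighted `[0.9, 0.8, 0.3, 0.7, 0.2]` and compatibility edges `[(0, 1), (0, 3), (1, 3), (1, 2), (2, 4)]`, the maximal cliques are {0, 1, 3}, {1, 2}, and {2, 4}, and the function selects components 0, 1, and 3.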

CR · Oct 31, 2020
Reliability of Power System Frequency on Times-Stamping Digital Recordings

Guang Hua, Qingyi Wang, Dengpan Ye et al.

Power system frequency can be captured by digital recordings and extracted to compare with a reference database for forensic time-stamp verification. This is known as the electric network frequency (ENF) criterion, enabled by the properties of random fluctuation and intra-grid consistency. In essence, this is a task of matching a short random sequence within a long reference, and the reliability of this criterion is mainly concerned with whether this match could be unique and correct. In this paper, we comprehensively analyze the factors affecting the reliability of ENF matching, including length of test recording, length of reference, temporal resolution, and signal-to-noise ratio (SNR). For synthetic analysis, we incorporate the first-order autoregressive (AR) ENF model and propose an efficient time-frequency domain (TFD) noisy ENF synthesis method. Then, the reliability analysis schemes for both synthetic and real-world data are respectively proposed. Through a comprehensive study, we reveal that while the SNR is an important external factor to determine whether time-stamp verification is viable, the length of test recording is the most important inherent factor, followed by the length of reference. However, the temporal resolution has little impact on the matching process.
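The matching task described above can be sketched in a few lines: synthesize a reference ENF sequence from a first-order AR model, then slide a short test sequence along it and keep the lag with the smallest mean squared error. This is a minimal illustration, not the paper's TFD synthesis method or its reliability analysis. The AR coefficient, noise level, 50 Hz nominal value, the MSE criterion, and the names `synth_enf` and `best_match` are all assumptions.

```python
import random


def synth_enf(n, nominal=50.0, a=0.99, sigma=0.005, seed=0):
    """Generate n ENF samples as nominal + AR(1) deviation (assumed parameters)."""
    rng = random.Random(seed)
    deviation = 0.0
    samples = []
    for _ in range(n):
        deviation = a * deviation + rng.gauss(0.0, sigma)
        samples.append(nominal + deviation)
    return samples


def best_match(test, reference):
    """Return (lag, mse) of the reference window closest to the test sequence."""
    m = len(test)
    best_lag, best_err = -1, float("inf")
    for lag in range(len(reference) - m + 1):
        err = sum((reference[lag + i] - test[i]) ** 2 for i in range(m)) / m
        if err < best_err:
            best_lag, best_err = lag, err
    return best_lag, best_err
```

Calling `best_match(reference[120:180], reference)` on a reference from `synth_enf(500, seed=1)` recovers lag 120 with zero error. Adding noise to the test segment, or shortening it, is a simple way to explore how SNR and test length affect whether the recovered lag stays unique and correct.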