Chun-Shien Lu

CV
h-index17
19papers
84citations
Novelty58%
AI Score48

19 Papers

CVJul 14, 2024Code
Defending Against Repetitive Backdoor Attacks on Semi-supervised Learning through Lens of Rate-Distortion-Perception Trade-off

Cheng-Yi Lee, Ching-Chia Kao, Cheng-Han Yeh et al.

Semi-supervised learning (SSL) has achieved remarkable performance with a small fraction of labeled data by leveraging vast amounts of unlabeled data from the Internet. However, this large pool of untrusted data is extremely vulnerable to data poisoning, leading to potential backdoor attacks. Current backdoor defenses are not yet effective against such a vulnerability in SSL. In this study, we propose a novel method, Unlabeled Data Purification (UPure), to disrupt the association between trigger patterns and target classes by introducing perturbations in the frequency domain. By leveraging the Rate-Distortion-Perception (RDP) trade-off, we further identify the frequency band, where the perturbations are added, and justify this selection. Notably, UPure purifies poisoned unlabeled data without the need of extra clean labeled data. Extensive experiments on four benchmark datasets and five SSL algorithms demonstrate that UPure effectively reduces the attack success rate from 99.78% to 0% while maintaining model accuracy. Code is available here: \url{https://github.com/chengyi-chris/UPure}.

CVJul 26, 2024
LookupForensics: A Large-Scale Multi-Task Dataset for Multi-Phase Image-Based Fact Verification

Shuhan Cui, Huy H. Nguyen, Trung-Nghia Le et al.

Amid the proliferation of forged images, notably the tsunami of deepfake content, extensive research has been conducted on using artificial intelligence (AI) to identify forged content in the face of continuing advancements in counterfeiting technologies. We have investigated the use of AI to provide the original authentic image after deepfake detection, which we believe is a reliable and persuasive solution. We call this "image-based automated fact verification," a name that originated from a text-based fact-checking system used by journalists. We have developed a two-phase open framework that integrates detection and retrieval components. Additionally, inspired by a dataset proposed by Meta Fundamental AI Research, we further constructed a large-scale dataset that is specifically designed for this task. This dataset simulates real-world conditions and includes both content-preserving and content-aware manipulations that present a range of difficulty levels and have potential for ongoing research. This multi-task dataset is fully annotated, enabling it to be utilized for sub-tasks within the forgery identification and fact retrieval domains. This paper makes two main contributions: (1) We introduce a new task, "image-based automated fact verification," and present a novel two-phase open framework combining "forgery identification" and "fact retrieval." (2) We present a large-scale dataset tailored for this new task that features various hand-crafted image edits and machine learning-driven manipulations, with extensive annotations suitable for various sub-tasks. Extensive experimental results validate its practicality for fact verification research and clarify its difficulty levels for various sub-tasks.

CRApr 25, 2025Code
DiffMI: Breaking Face Recognition Privacy via Diffusion-Driven Training-Free Model Inversion

Hanrui Wang, Shuo Wang, Chun-Shien Lu et al.

Face recognition poses serious privacy risks due to its reliance on sensitive and immutable biometric data. While modern systems mitigate privacy risks by mapping facial images to embeddings (commonly regarded as privacy-preserving), model inversion attacks reveal that identity information can still be recovered, exposing critical vulnerabilities. However, existing attacks are often computationally expensive and lack generalization, especially those requiring target-specific training. Even training-free approaches suffer from limited identity controllability, hindering faithful reconstruction of nuanced or unseen identities. In this work, we propose DiffMI, the first diffusion-driven, training-free model inversion attack. DiffMI introduces a novel pipeline combining robust latent code initialization, a ranked adversarial refinement strategy, and a statistically grounded, confidence-aware optimization objective. DiffMI applies directly to unseen target identities and face recognition models, offering greater adaptability than training-dependent approaches while significantly reducing computational overhead. Our method achieves 84.42%--92.87% attack success rates against inversion-resilient systems and outperforms the best prior training-free GAN-based approach by 4.01%--9.82%. The implementation is available at https://github.com/azrealwang/DiffMI.

CVJan 24, 2025Code
GreedyPixel: Fine-Grained Black-Box Adversarial Attack Via Greedy Algorithm

Hanrui Wang, Ching-Chun Chang, Chun-Shien Lu et al.

Deep neural networks are highly vulnerable to adversarial examples, which are inputs with small, carefully crafted perturbations that cause misclassification -- making adversarial attacks a critical tool for evaluating robustness. Existing black-box methods typically entail a trade-off between precision and flexibility: pixel-sparse attacks (e.g., single- or few-pixel attacks) provide fine-grained control but lack adaptability, whereas patch- or frequency-based attacks improve efficiency or transferability, but at the cost of producing larger and less precise perturbations. We present GreedyPixel, a fine-grained black-box attack method that performs brute-force-style, per-pixel greedy optimization guided by a surrogate-derived priority map and refined by means of query feedback. It evaluates each coordinate directly without any gradient information, guaranteeing monotonic loss reduction and convergence to a coordinate-wise optimum, while also yielding near white-box-level precision and pixel-wise sparsity and perceptual quality. On the CIFAR-10 and ImageNet datasets, spanning convolutional neural networks (CNNs) and Transformer models, GreedyPixel achieved state-of-the-art success rates with visually imperceptible perturbations, effectively bridging the gap between black-box practicality and white-box performance. The implementation is available at https://github.com/azrealwang/greedypixel.

CVAug 21, 2024
BadVim: Unveiling Backdoor Threats in Visual State Space Model

Cheng-Yi Lee, Yu-Hsuan Chiang, Zhong-You Wu et al.

Visual State Space Models (VSSM) have shown remarkable performance in various computer vision tasks. However, backdoor attacks pose significant security challenges, causing compromised models to predict target labels when specific triggers are present while maintaining normal behavior on benign samples. In this paper, we investigate the robustness of VSSMs against backdoor attacks. Specifically, we delicately design a novel framework for VSSMs, dubbed BadVim, which utilizes low-rank perturbations on state-wise to uncover their impact on state transitions during training. By poisoning only $0.3\%$ of the training data, our attacks cause any trigger-embedded input to be misclassified to the targeted class with a high attack success rate (over 97%) at inference time. Our findings suggest that the state-space representation property of VSSMs, which enhances model capability, may also contribute to its vulnerability to backdoor attacks. Our attack exhibits effectiveness across three datasets, even bypassing state-of-the-art defenses against such attacks. Extensive experiments show that the backdoor robustness of VSSMs is comparable to that of Transformers (ViTs) and superior to that of Convolutional Neural Networks (CNNs). We believe our findings will prompt the community to reconsider the trade-offs between performance and robustness in model design.

CVFeb 26
Beyond Detection: Multi-Scale Hidden-Code for Natural Image Deepfake Recovery and Factual Retrieval

Yuan-Chih Chen, Chun-Shien Lu

Recent advances in image authenticity have primarily focused on deepfake detection and localization, leaving recovery of tampered contents for factual retrieval relatively underexplored. We propose a unified hidden-code recovery framework that enables both retrieval and restoration from post-hoc and in-generation watermarking paradigms. Our method encodes semantic and perceptual information into a compact hidden-code representation, refined through multi-scale vector quantization, and enhances contextual reasoning via conditional Transformer modules. To enable systematic evaluation for natural images, we construct ImageNet-S, a benchmark that provides paired image-label factual retrieval tasks. Extensive experiments on ImageNet-S demonstrate that our method exhibits promising retrieval and reconstruction performance while remaining fully compatible with diverse watermarking pipelines. This framework establishes a foundation for general-purpose image recovery beyond detection and localization.

CVNov 15, 2025
FedSDA: Federated Stain Distribution Alignment for Non-IID Histopathological Image Classification

Cheng-Chang Tsai, Kai-Wen Cheng, Chun-Shien Lu

Federated learning (FL) has shown success in collaboratively training a model among decentralized data resources without directly sharing privacy-sensitive training data. Despite recent advances, non-IID (non-independent and identically distributed) data poses an inevitable challenge that hinders the use of FL. In this work, we address the issue of non-IID histopathological images with feature distribution shifts from an intuitive perspective that has only received limited attention. Specifically, we address this issue from the perspective of data distribution by solely adjusting the data distributions of all clients. Building on the success of diffusion models in fitting data distributions and leveraging stain separation to extract the pivotal features that are closely related to the non-IID properties of histopathological images, we propose a Federated Stain Distribution Alignment (FedSDA) method. FedSDA aligns the stain distribution of each client with a target distribution in an FL framework to mitigate distribution shifts among clients. Furthermore, considering that training diffusion models on raw data in FL has been shown to be susceptible to privacy leakage risks, we circumvent this problem while still effectively achieving alignment. Extensive experimental results show that FedSDA is not only effective in improving baselines that focus on mitigating disparities across clients' model updates but also outperforms baselines that address the non-IID data issues from the perspective of data distribution. We show that FedSDA provides valuable and practical insights for the computational pathology community.

CVDec 13, 2023
Generalized Deepfakes Detection with Reconstructed-Blended Images and Multi-scale Feature Reconstruction Network

Yuyang Sun, Huy H. Nguyen, Chun-Shien Lu et al.

The growing diversity of digital face manipulation techniques has led to an urgent need for a universal and robust detection technology to mitigate the risks posed by malicious forgeries. We present a blended-based detection approach that has robust applicability to unseen datasets. It combines a method for generating synthetic training samples, i.e., reconstructed blended images, that incorporate potential deepfake generator artifacts and a detection model, a multi-scale feature reconstruction network, for capturing the generic boundary artifacts and noise distribution anomalies brought about by digital face manipulations. Experiments demonstrated that this approach results in better performance in both cross-manipulation detection and cross-dataset detection on unseen data.

LGFeb 2, 2025
Safety Alignment Depth in Large Language Models: A Markov Chain Perspective

Ching-Chia Kao, Chia-Mu Yu, Chun-Shien Lu et al.

Large Language Models (LLMs) are increasingly adopted in high-stakes scenarios, yet their safety mechanisms often remain fragile. Simple jailbreak prompts or even benign fine-tuning can bypass these protocols, underscoring the need to understand where and how they fail. Recent findings suggest that vulnerabilities emerge when alignment is confined to only the initial output tokens. Unfortunately, even with the introduction of deep safety alignment, determining the optimal safety depth remains an unresolved challenge. By leveraging the equivalence between autoregressive language models and Markov chains, this paper offers the first theoretical result on how to identify the ideal depth for safety alignment, and demonstrates how permutation-based data augmentation can tighten these bounds. Crucially, we reveal a fundamental interaction between alignment depth and ensemble width-indicating that broader ensembles can compensate for shallower alignments. These insights provide a theoretical foundation for designing more robust, scalable safety strategies that complement existing alignment approaches, opening new avenues for research into safer, more reliable LLMs.

LGOct 22, 2024
Test-time Adversarial Defense with Opposite Adversarial Path and High Attack Time Cost

Cheng-Han Yeh, Kuanchun Yu, Chun-Shien Lu

Deep learning models are known to be vulnerable to adversarial attacks by injecting sophisticated designed perturbations to input data. Training-time defenses still exhibit a significant performance gap between natural accuracy and robust accuracy. In this paper, we investigate a new test-time adversarial defense method via diffusion-based recovery along opposite adversarial paths (OAPs). We present a purifier that can be plugged into a pre-trained model to resist adversarial attacks. Different from prior arts, the key idea is excessive denoising or purification by integrating the opposite adversarial direction with reverse diffusion to push the input image further toward the opposite adversarial direction. For the first time, we also exemplify the pitfall of conducting AutoAttack (Rand) for diffusion-based defense methods. Through the lens of time complexity, we examine the trade-off between the effectiveness of adaptive attack and its computation complexity against our defense. Experimental evaluation along with time cost analysis verifies the effectiveness of the proposed method.

CRJul 28, 2025
MaXsive: High-Capacity and Robust Training-Free Generative Image Watermarking in Diffusion Models

Po-Yuan Mao, Cheng-Chang Tsai, Chun-Shien Lu

The great success of the diffusion model in image synthesis led to the release of gigantic commercial models, raising the issue of copyright protection and inappropriate content generation. Training-free diffusion watermarking provides a low-cost solution for these issues. However, the prior works remain vulnerable to rotation, scaling, and translation (RST) attacks. Although some methods employ meticulously designed patterns to mitigate this issue, they often reduce watermark capacity, which can result in identity (ID) collusion. To address these problems, we propose MaXsive, a training-free diffusion model generative watermarking technique that has high capacity and robustness. MaXsive best utilizes the initial noise to watermark the diffusion model. Moreover, instead of using a meticulously repetitive ring pattern, we propose injecting the X-shape template to recover the RST distortions. This design significantly increases robustness without losing any capacity, making ID collusion less likely to happen. The effectiveness of MaXsive has been verified on two well-known watermarking benchmarks under the scenarios of verification and identification.

CVJan 28, 2025
Adversarial Masked Autoencoder Purifier with Defense Transferability

Yuan-Chih Chen, Chun-Shien Lu

The study of adversarial defense still struggles to combat with advanced adversarial attacks. In contrast to most prior studies that rely on the diffusion model for test-time defense to remarkably increase the inference time, we propose Masked AutoEncoder Purifier (MAEP), which integrates Masked AutoEncoder (MAE) into an adversarial purifier framework for test-time purification. While MAEP achieves promising adversarial robustness, it particularly features model defense transferability and attack generalization without relying on using additional data that is different from the training dataset. To our knowledge, MAEP is the first study of adversarial purifier based on MAE. Extensive experimental results demonstrate that our method can not only maintain clear accuracy with only a slight drop but also exhibit a close gap between the clean and robust accuracy. Notably, MAEP trained on CIFAR10 achieves state-of-the-art performance even when tested directly on ImageNet, outperforming existing diffusion-based models trained specifically on ImageNet.

CVApr 5, 2021
Perceptual Indistinguishability-Net (PI-Net): Facial Image Obfuscation with Manipulable Semantics

Jia-Wei Chen, Li-Ju Chen, Chia-Mu Yu et al.

With the growing use of camera devices, the industry has many image datasets that provide more opportunities for collaboration between the machine learning community and industry. However, the sensitive information in the datasets discourages data owners from releasing these datasets. Despite recent research devoted to removing sensitive information from images, they provide neither meaningful privacy-utility trade-off nor provable privacy guarantees. In this study, with the consideration of the perceptual similarity, we propose perceptual indistinguishability (PI) as a formal privacy notion particularly for images. We also propose PI-Net, a privacy-preserving mechanism that achieves image obfuscation with PI guarantee. Our study shows that PI-Net achieves significantly better privacy utility trade-off through public image data.

LGNov 28, 2020
Deterministic Certification to Adversarial Attacks via Bernstein Polynomial Approximation

Ching-Chia Kao, Jhe-Bang Ko, Chun-Shien Lu

Randomized smoothing has established state-of-the-art provable robustness against $\ell_2$ norm adversarial attacks with high probability. However, the introduced Gaussian data augmentation causes a severe decrease in natural accuracy. We come up with a question, "Is it possible to construct a smoothed classifier without randomization while maintaining natural accuracy?". We find the answer is definitely yes. We study how to transform any classifier into a certified robust classifier based on a popular and elegant mathematical tool, Bernstein polynomial. Our method provides a deterministic algorithm for decision boundary smoothing. We also introduce a distinctive approach of norm-independent certified robustness via numerical solutions of nonlinear systems of equations. Theoretical analyses and experimental results indicate that our method is promising for classifier smoothing and robustness certification.

CVOct 22, 2020
QISTA-Net: DNN Architecture to Solve $\ell_q$-norm Minimization Problem and Image Compressed Sensing

Gang-Xuan Lin, Shih-Wei Hu, Chun-Shien Lu

In this paper, we reformulate the non-convex $\ell_q$-norm minimization problem with $q\in(0,1)$ into a 2-step problem, which consists of one convex and one non-convex subproblems, and propose a novel iterative algorithm called QISTA ($\ell_q$-ISTA) to solve the $\left(\ell_q\right)$-problem. By taking advantage of deep learning in accelerating optimization algorithms, together with the speedup strategy that using the momentum from all previous layers in the network, we propose a learning-based method, called QISTA-Net-s, to solve the sparse signal reconstruction problem. Extensive experimental comparisons demonstrate that the QISTA-Net-s yield better reconstruction qualities than state-of-the-art $\ell_1$-norm optimization (plus learning) algorithms even if the original sparse signal is noisy. On the other hand, based on the network architecture associated with QISTA, with considering the use of convolution layers, we proposed the QISTA-Net-n for solving the image CS problem, and the performance of the reconstruction still outperforms most of the state-of-the-art natural images reconstruction methods. QISTA-Net-n is designed in unfolding QISTA and adding the convolutional operator as the dictionary. This makes QISTA-Net-s interpretable. We provide complete experimental results that QISTA-Net-s and QISTA-Net-n contribute the better reconstruction performance than the competing.

ITJan 24, 2016
Fast Binary Embedding via Circulant Downsampled Matrix -- A Data-Independent Approach

Sung-Hsien Hsieh, Chun-Shien Lu, Soo-Chang Pei

Binary embedding of high-dimensional data aims to produce low-dimensional binary codes while preserving discriminative power. State-of-the-art methods often suffer from high computation and storage costs. We present a simple and fast embedding scheme by first downsampling N-dimensional data into M-dimensional data and then multiplying the data with an MxM circulant matrix. Our method requires O(N +M log M) computation and O(N) storage costs. We prove if data have sparsity, our scheme can achieve similarity-preserving well. Experiments further demonstrate that though our method is cost-effective and fast, it still achieves comparable performance in image applications.

DSSep 16, 2015
Fast Template Matching by Subsampled Circulant Matrix

Sung-Hsien Hsieh, Chun-Shien Lu, and Soo-Chang Pei

Template matching is widely used for many applications in image and signal processing and usually is time-critical. Traditional methods usually focus on how to reduce the search locations by coarse-to-fine strategy or full search combined with pruning strategy. However, the computation cost of those methods is easily dominated by the size of signal N instead of that of template K. This paper proposes a probabilistic and fast matching scheme, which computation costs requires O(N) additions and O(K \log K) multiplications, based on cross-correlation. The nuclear idea is to first downsample signal, which size becomes O(K), and then subsequent operations only involves downsampled signals. The probability of successful match depends on cross-correlation between signal and the template. We show the sufficient condition for successful match and prove that the probability is high for binary signals with K^2/log K >= O(N). The experiments shows this proposed scheme is fast and efficient and supports the theoretical results.

MMMay 20, 2015
Compressive Sensing of Large-Scale Images: An Assumption-Free Approach

Wei-Jie Liang, Gang-Xuan Lin, Chun-Shien Lu

Cost-efficient compressive sensing of big media data with fast reconstructed high-quality results is very challenging. In this paper, we propose a new large-scale image compressive sensing method, composed of operator-based strategy in the context of fixed point continuation method and weighted LASSO with tree structure sparsity pattern. The main characteristic of our method is free from any assumptions and restrictions. The feasibility of our method is verified via simulations and comparisons with state-of-the-art algorithms.

SYMar 28, 2015
The Necessary And Sufficient Condition for Generalized Demixing

Chun-Yen Kuo, Gang-Xuan Lin, Chun-Shien Lu

Demixing is the problem of identifying multiple structured signals from a superimposed observation. This work analyzes a general framework, based on convex optimization, for solving demixing problems. We present a new solution to determine whether or not a specific convex optimization problem built for generalized demixing is successful. This solution will also bring about the possibility to estimate the probability of success by the approximate kinematic formula.