90.3CVApr 13Code
NTIRE 2026 Challenge on Robust AI-Generated Image Detection in the WildAleksandr Gushchin, Khaled Abud, Ekaterina Shumitskaya et al.
This paper presents an overview of the NTIRE 2026 Challenge on Robust AI-Generated Image Detection in the Wild, held in conjunction with the NTIRE workshop at CVPR 2026. The goal of this challenge was to develop detection models capable of distinguishing real images from generated ones in realistic scenarios: the images are often transformed (cropped, resized, compressed, blurred) for practical usage, and therefore, the detection models should be robust to such transformations. The challenge is based on a novel dataset consisting of 108,750 real and 185,750 AI-generated images from 42 generators comprising a large variety of open-source and closed-source models of various architectures, augmented with 36 image transformations. Methods were evaluated using ROC AUC on the full test set, including both transformed and untransformed images. A total of 511 participants registered, with 20 teams submitting valid final solutions. This report provides a comprehensive overview of the challenge, describes the proposed solutions, and can be used as a valuable reference for researchers and practitioners in increasing the robustness of the detection models to real-world transformations.
CVJun 12, 2022
STD-NET: Search of Image Steganalytic Deep-learning Architecture via Hierarchical Tensor DecompositionShunquan Tan, Qiushi Li, Laiyuan Li et al.
Recent studies shows that the majority of existing deep steganalysis models have a large amount of redundancy, which leads to a huge waste of storage and computing resources. The existing model compression method cannot flexibly compress the convolutional layer in residual shortcut block so that a satisfactory shrinking rate cannot be obtained. In this paper, we propose STD-NET, an unsupervised deep-learning architecture search approach via hierarchical tensor decomposition for image steganalysis. Our proposed strategy will not be restricted by various residual connections, since this strategy does not change the number of input and output channels of the convolution block. We propose a normalized distortion threshold to evaluate the sensitivity of each involved convolutional layer of the base model to guide STD-NET to compress target network in an efficient and unsupervised approach, and obtain two network structures of different shapes with low computation cost and similar performance compared with the original one. Extensive experiments have confirmed that, on one hand, our model can achieve comparable or even better detection performance in various steganalytic scenarios due to the great adaptivity of the obtained network architecture. On the other hand, the experimental results also demonstrate that our proposed strategy is more efficient and can remove more redundancy compared with previous steganalytic network compression methods.
CVOct 16, 2023
Evading Detection Actively: Toward Anti-Forensics against Forgery LocalizationLong Zhuo, Shenghai Luo, Shunquan Tan et al.
Anti-forensics seeks to eliminate or conceal traces of tampering artifacts. Typically, anti-forensic methods are designed to deceive binary detectors and persuade them to misjudge the authenticity of an image. However, to the best of our knowledge, no attempts have been made to deceive forgery detectors at the pixel level and mis-locate forged regions. Traditional adversarial attack methods cannot be directly used against forgery localization due to the following defects: 1) they tend to just naively induce the target forensic models to flip their pixel-level pristine or forged decisions; 2) their anti-forensics performance tends to be severely degraded when faced with the unseen forensic models; 3) they lose validity once the target forensic models are retrained with the anti-forensics images generated by them. To tackle the three defects, we propose SEAR (Self-supErvised Anti-foRensics), a novel self-supervised and adversarial training algorithm that effectively trains deep-learning anti-forensic models against forgery localization. SEAR sets a pretext task to reconstruct perturbation for self-supervised learning. In adversarial training, SEAR employs a forgery localization model as a supervisor to explore tampering features and constructs a deep-learning concealer to erase corresponding traces. We have conducted largescale experiments across diverse datasets. The experimental results demonstrate that, through the combination of self-supervised learning and adversarial learning, SEAR successfully deceives the state-of-the-art forgery localization methods, as well as tackle the three defects regarding traditional adversarial attack methods mentioned above.
CRDec 13, 2023Code
Prompt Engineering-assisted Malware Dynamic Analysis Using GPT-4Pei Yan, Shunquan Tan, Miaohui Wang et al.
Dynamic analysis methods effectively identify shelled, wrapped, or obfuscated malware, thereby preventing them from invading computers. As a significant representation of dynamic malware behavior, the API (Application Programming Interface) sequence, comprised of consecutive API calls, has progressively become the dominant feature of dynamic analysis methods. Though there have been numerous deep learning models for malware detection based on API sequences, the quality of API call representations produced by those models is limited. These models cannot generate representations for unknown API calls, which weakens both the detection performance and the generalization. Further, the concept drift phenomenon of API calls is prominent. To tackle these issues, we introduce a prompt engineering-assisted malware dynamic analysis using GPT-4. In this method, GPT-4 is employed to create explanatory text for each API call within the API sequence. Afterward, the pre-trained language model BERT is used to obtain the representation of the text, from which we derive the representation of the API sequence. Theoretically, this proposed method is capable of generating representations for all API calls, excluding the necessity for dataset training during the generation process. Utilizing the representation, a CNN-based detection model is designed to extract the feature. We adopt five benchmark datasets to validate the performance of the proposed model. The experimental results reveal that the proposed detection algorithm performs better than the state-of-the-art method (TextCNN). Specifically, in cross-database experiments and few-shot learning experiments, the proposed model achieves excellent detection performance and almost a 100% recall rate for malware, verifying its superior generalization performance. The code is available at: github.com/yan-scnu/Prompted_Dynamic_Detection.
CVFeb 6
Universal Anti-forensics Attack against Image Forgery Detection via Multi-modal GuidanceHaipeng Li, Rongxuan Peng, Anwei Luo et al.
The rapid advancement of AI-Generated Content (AIGC) technologies poses significant challenges for authenticity assessment. However, existing evaluation protocols largely overlook anti-forensics attack, failing to ensure the comprehensive robustness of state-of-the-art AIGC detectors in real-world applications. To bridge this gap, we propose ForgeryEraser, a framework designed to execute universal anti-forensics attack without access to the target AIGC detectors. We reveal an adversarial vulnerability stemming from the systemic reliance on Vision-Language Models (VLMs) as shared backbones (e.g., CLIP), where downstream AIGC detectors inherit the feature space of these publicly accessible models. Instead of traditional logit-based optimization, we design a multi-modal guidance loss to drive forged image embeddings within the VLM feature space toward text-derived authentic anchors to erase forgery traces, while repelling them from forgery anchors. Extensive experiments demonstrate that ForgeryEraser causes substantial performance degradation to advanced AIGC detectors on both global synthesis and local editing benchmarks. Moreover, ForgeryEraser induces explainable forensic models to generate explanations consistent with authentic images for forged images. Our code will be made publicly available.
CVAug 10, 2025Code
CLUE: Leveraging Low-Rank Adaptation to Capture Latent Uncovered Evidence for Image Forgery LocalizationYouqi Wang, Shunquan Tan, Rongxuan Peng et al.
The increasing accessibility of image editing tools and generative AI has led to a proliferation of visually convincing forgeries, compromising the authenticity of digital media. In this paper, in addition to leveraging distortions from conventional forgeries, we repurpose the mechanism of a state-of-the-art (SOTA) text-to-image synthesis model by exploiting its internal generative process, turning it into a high-fidelity forgery localization tool. To this end, we propose CLUE (Capture Latent Uncovered Evidence), a framework that employs Low- Rank Adaptation (LoRA) to parameter-efficiently reconfigure Stable Diffusion 3 (SD3) as a forensic feature extractor. Our approach begins with the strategic use of SD3's Rectified Flow (RF) mechanism to inject noise at varying intensities into the latent representation, thereby steering the LoRAtuned denoising process to amplify subtle statistical inconsistencies indicative of a forgery. To complement the latent analysis with high-level semantic context and precise spatial details, our method incorporates contextual features from the image encoder of the Segment Anything Model (SAM), which is parameter-efficiently adapted to better trace the boundaries of forged regions. Extensive evaluations demonstrate CLUE's SOTA generalization performance, significantly outperforming prior methods. Furthermore, CLUE shows superior robustness against common post-processing attacks and Online Social Networks (OSNs). Code is publicly available at https://github.com/SZAISEC/CLUE.
CVAug 10, 2025Code
ForensicsSAM: Toward Robust and Unified Image Forgery Detection and Localization Resisting to Adversarial AttackRongxuan Peng, Shunquan Tan, Chenqi Kong et al.
Parameter-efficient fine-tuning (PEFT) has emerged as a popular strategy for adapting large vision foundation models, such as the Segment Anything Model (SAM) and LLaVA, to downstream tasks like image forgery detection and localization (IFDL). However, existing PEFT-based approaches overlook their vulnerability to adversarial attacks. In this paper, we show that highly transferable adversarial images can be crafted solely via the upstream model, without accessing the downstream model or training data, significantly degrading the IFDL performance. To address this, we propose ForensicsSAM, a unified IFDL framework with built-in adversarial robustness. Our design is guided by three key ideas: (1) To compensate for the lack of forgery-relevant knowledge in the frozen image encoder, we inject forgery experts into each transformer block to enhance its ability to capture forgery artifacts. These forgery experts are always activated and shared across any input images. (2) To detect adversarial images, we design an light-weight adversary detector that learns to capture structured, task-specific artifact in RGB domain, enabling reliable discrimination across various attack methods. (3) To resist adversarial attacks, we inject adversary experts into the global attention layers and MLP modules to progressively correct feature shifts induced by adversarial noise. These adversary experts are adaptively activated by the adversary detector, thereby avoiding unnecessary interference with clean images. Extensive experiments across multiple benchmarks demonstrate that ForensicsSAM achieves superior resistance to various adversarial attack methods, while also delivering state-of-the-art performance in image-level forgery detection and pixel-level forgery localization. The resource is available at https://github.com/siriusPRX/ForensicsSAM.
CVJun 15, 2025
Active Adversarial Noise Suppression for Image Forgery LocalizationRongxuan Peng, Shunquan Tan, Xianbo Mo et al.
Recent advances in deep learning have significantly propelled the development of image forgery localization. However, existing models remain highly vulnerable to adversarial attacks: imperceptible noise added to forged images can severely mislead these models. In this paper, we address this challenge with an Adversarial Noise Suppression Module (ANSM) that generate a defensive perturbation to suppress the attack effect of adversarial noise. We observe that forgery-relevant features extracted from adversarial and original forged images exhibit distinct distributions. To bridge this gap, we introduce Forgery-relevant Features Alignment (FFA) as a first-stage training strategy, which reduces distributional discrepancies by minimizing the channel-wise Kullback-Leibler divergence between these features. To further refine the defensive perturbation, we design a second-stage training strategy, termed Mask-guided Refinement (MgR), which incorporates a dual-mask constraint. MgR ensures that the perturbation remains effective for both adversarial and original forged images, recovering forgery localization accuracy to their original level. Extensive experiments across various attack algorithms demonstrate that our method significantly restores the forgery localization model's performance on adversarial images. Notably, when ANSM is applied to original forged images, the performance remains nearly unaffected. To our best knowledge, this is the first report of adversarial defense in image forgery localization tasks. We have released the source code and anti-forensics dataset.
CVFeb 15
ForgeryVCR: Visual-Centric Reasoning via Efficient Forensic Tools in MLLMs for Image Forgery Detection and LocalizationYouqi Wang, Shen Chen, Haowei Wang et al.
Existing Multimodal Large Language Models (MLLMs) for image forgery detection and localization predominantly operate under a text-centric Chain-of-Thought (CoT) paradigm. However, forcing these models to textually characterize imperceptible low-level tampering traces inevitably leads to hallucinations, as linguistic modalities are insufficient to capture such fine-grained pixel-level inconsistencies. To overcome this, we propose ForgeryVCR, a framework that incorporates a forensic toolbox to materialize imperceptible traces into explicit visual intermediates via Visual-Centric Reasoning. To enable efficient tool utilization, we introduce a Strategic Tool Learning post-training paradigm, encompassing gain-driven trajectory construction for Supervised Fine-Tuning (SFT) and subsequent Reinforcement Learning (RL) optimization guided by a tool utility reward. This paradigm empowers the MLLM to act as a proactive decision-maker, learning to spontaneously invoke multi-view reasoning paths including local zoom-in for fine-grained inspection and the analysis of invisible inconsistencies in compression history, noise residuals, and frequency domains. Extensive experiments reveal that ForgeryVCR achieves state-of-the-art (SOTA) performance in both detection and localization tasks, demonstrating superior generalization and robustness with minimal tool redundancy. The project page is available at https://youqiwong.github.io/projects/ForgeryVCR/.
CVAug 27, 2025
SDiFL: Stable Diffusion-Driven Framework for Image Forgery LocalizationYang Su, Shunquan Tan, Jiwu Huang
Driven by the new generation of multi-modal large models, such as Stable Diffusion (SD), image manipulation technologies have advanced rapidly, posing significant challenges to image forensics. However, existing image forgery localization methods, which heavily rely on labor-intensive and costly annotated data, are struggling to keep pace with these emerging image manipulation technologies. To address these challenges, we are the first to integrate both image generation and powerful perceptual capabilities of SD into an image forensic framework, enabling more efficient and accurate forgery localization. First, we theoretically show that the multi-modal architecture of SD can be conditioned on forgery-related information, enabling the model to inherently output forgery localization results. Then, building on this foundation, we specifically leverage the multimodal framework of Stable DiffusionV3 (SD3) to enhance forgery localization performance.We leverage the multi-modal processing capabilities of SD3 in the latent space by treating image forgery residuals -- high-frequency signals extracted using specific highpass filters -- as an explicit modality. This modality is fused into the latent space during training to enhance forgery localization performance. Notably, our method fully preserves the latent features extracted by SD3, thereby retaining the rich semantic information of the input image. Experimental results show that our framework achieves up to 12% improvements in performance on widely used benchmarking datasets compared to current state-of-the-art image forgery localization models. Encouragingly, the model demonstrates strong performance on forensic tasks involving real-world document forgery images and natural scene forging images, even when such data were entirely unseen during training.
CVNov 24, 2021
Universal Deep Network for Steganalysis of Color Image based on Channel RepresentationKangkang Wei, Weiqi Luo, Shunquan Tan et al.
Up to now, most existing steganalytic methods are designed for grayscale images, and they are not suitable for color images that are widely used in current social networks. In this paper, we design a universal color image steganalysis network (called UCNet) in spatial and JPEG domains. The proposed method includes preprocessing, convolutional, and classification modules. To preserve the steganographic artifacts in each color channel, in preprocessing module, we firstly separate the input image into three channels according to the corresponding embedding spaces (i.e. RGB for spatial steganography and YCbCr for JPEG steganography), and then extract the image residuals with 62 fixed high-pass filters, finally concatenate all truncated residuals for subsequent analysis rather than adding them together with normal convolution like existing CNN-based steganalyzers. To accelerate the network convergence and effectively reduce the number of parameters, in convolutional module, we carefully design three types of layers with different shortcut connections and group convolution structures to further learn high-level steganalytic features. In classification module, we employ a global average pooling and fully connected layer for classification. We conduct extensive experiments on ALASKA II to demonstrate that the proposed method can achieve state-of-the-art results compared with the modern CNN-based steganalyzers (e.g., SRNet and J-YeNet) in both spatial and JPEG domains, while keeping relatively few memory requirements and training time. Furthermore, we also provide necessary descriptions and many ablation experiments to verify the rationality of the network design.
CVJul 6, 2021
Self-Adversarial Training incorporating Forgery Attention for Image Forgery LocalizationLong Zhuo, Shunquan Tan, Bin Li et al.
Image editing techniques enable people to modify the content of an image without leaving visual traces and thus may cause serious security risks. Hence the detection and localization of these forgeries become quite necessary and challenging. Furthermore, unlike other tasks with extensive data, there is usually a lack of annotated forged images for training due to annotation difficulties. In this paper, we propose a self-adversarial training strategy and a reliable coarse-to-fine network that utilizes a self-attention mechanism to localize forged regions in forgery images. The self-attention module is based on a Channel-Wise High Pass Filter block (CW-HPF). CW-HPF leverages inter-channel relationships of features and extracts noise features by high pass filters. Based on the CW-HPF, a self-attention mechanism, called forgery attention, is proposed to capture rich contextual dependencies of intrinsic inconsistency extracted from tampered regions. Specifically, we append two types of attention modules on top of CW-HPF respectively to model internal interdependencies in spatial dimension and external dependencies among channels. We exploit a coarse-to-fine network to enhance the noise inconsistency between original and tampered regions. More importantly, to address the issue of insufficient training data, we design a self-adversarial training strategy that expands training data dynamically to achieve more robust performance. Specifically, in each training iteration, we perform adversarial attacks against our network to generate adversarial examples and train our model on them. Extensive experimental results demonstrate that our proposed algorithm steadily outperforms state-of-the-art methods by a clear margin in different benchmark datasets.
MMMar 25, 2021
MCTSteg: A Monte Carlo Tree Search-based Reinforcement Learning Framework for Universal Non-additive SteganographyXianbo Mo, Shunquan Tan, Bin Li et al.
Recent research has shown that non-additive image steganographic frameworks effectively improve security performance through adjusting distortion distribution. However, as far as we know, all of the existing non-additive proposals are based on handcrafted policies, and can only be applied to a specific image domain, which heavily prevent non-additive steganography from releasing its full potentiality. In this paper, we propose an automatic non-additive steganographic distortion learning framework called MCTSteg to remove the above restrictions. Guided by the reinforcement learning paradigm, we combine Monte Carlo Tree Search (MCTS) and steganalyzer-based environmental model to build MCTSteg. MCTS makes sequential decisions to adjust distortion distribution without human intervention. Our proposed environmental model is used to obtain feedbacks from each decision. Due to its self-learning characteristic and domain-independent reward function, MCTSteg has become the first reported universal non-additive steganographic framework which can work in both spatial and JPEG domains. Extensive experimental results show that MCTSteg can effectively withstand the detection of both hand-crafted feature-based and deep-learning-based steganalyzers. In both spatial and JPEG domains, the security performance of MCTSteg steadily outperforms the state of the art by a clear margin under different scenarios.
CVJan 13, 2021
Image Steganography based on Iteratively Adversarial Samples of A Synchronized-directions Sub-imageXinghong Qin, Shunquan Tan, Bin Li et al.
Nowadays a steganography has to face challenges of both feature based staganalysis and convolutional neural network (CNN) based steganalysis. In this paper, we present a novel steganography scheme denoted as ITE-SYN (based on ITEratively adversarial perturbations onto a SYNchronized-directions sub-image), by which security data is embedded with synchronizing modification directions to enhance security and then iteratively increased perturbations are added onto a sub-image to reduce loss with cover class label of the target CNN classifier. Firstly an exist steganographic function is employed to compute initial costs. Then the cover image is decomposed into some non-overlapped sub-images. After each sub-image is embedded, costs will be adjusted following clustering modification directions profile. And then the next sub-image will be embedded with adjusted costs until all secret data has been embedded. If the target CNN classifier does not discriminate the stego image as a cover image, based on adjusted costs, we change costs with adversarial manners according to signs of gradients back-propagated from the CNN classifier. And then a sub-image is chosen to be re-embedded with changed costs. Adversarial intensity will be iteratively increased until the adversarial stego image can fool the target CNN classifier. Experiments demonstrate that the proposed method effectively enhances security to counter both conventional feature-based classifiers and CNN classifiers, even other non-target CNN classifiers.
MMNov 12, 2019
CALPA-NET: Channel-pruning-assisted Deep Residual Network for Steganalysis of Digital ImagesShunquan Tan, Weilong Wu, Zilong Shao et al.
Over the past few years, detection performance improvements of deep-learning based steganalyzers have been usually achieved through structure expansion. However, excessive expanded structure results in huge computational cost, storage overheads, and consequently difficulty in training and deployment. In this paper we propose CALPA-NET, a ChAnneL-Pruning-Assisted deep residual network architecture search approach to shrink the network structure of existing vast, over-parameterized deep-learning based steganalyzers. We observe that the broad inverted-pyramid structure of existing deep-learning based steganalyzers might contradict the well-established model diversity oriented philosophy, and therefore is not suitable for steganalysis. Then a hybrid criterion combined with two network pruning schemes is introduced to adaptively shrink every involved convolutional layer in a data-driven manner. The resulting network architecture presents a slender bottleneck-like structure. We have conducted extensive experiments on BOSSBase+BOWS2 dataset, more diverse ALASKA dataset and even a large-scale subset extracted from ImageNet CLS-LOC dataset. The experimental results show that the model structure generated by our proposed CALPA-NET can achieve comparative performance with less than two percent of parameters and about one third FLOPs compared to the original steganalytic model. The new model possesses even better adaptivity, transferability, and scalability.
MMAug 22, 2018
Identification of Deep Network Generated Images Using Disparities in Color ComponentsHaodong Li, Bin Li, Shunquan Tan et al.
With the powerful deep network architectures, such as generative adversarial networks, one can easily generate photorealistic images. Although the generated images are not dedicated for fooling human or deceiving biometric authentication systems, research communities and public media have shown great concerns on the security issues caused by these images. This paper addresses the problem of identifying deep network generated (DNG) images. Taking the differences between camera imaging and DNG image generation into considerations, we analyze the disparities between DNG images and real images in different color components. We observe that the DNG images are more distinguishable from real ones in the chrominance components, especially in the residual domain. Based on these observations, we propose a feature set to capture color image statistics for identifying DNG images. Additionally, we evaluate several detection situations, including the training-testing data are matched or mismatched in image sources or generative models and detection with only real images. Extensive experimental results show that the proposed method can accurately identify DNG images and outperforms existing methods when the training and testing data are mismatched. Moreover, when the GAN model is unknown, our methods also achieves good performance with one-class classification by using only real images for training.
MMMar 24, 2018
CNN Based Adversarial Embedding with Minimum Alteration for Image SteganographyWeixuan Tang, Bin Li, Shunquan Tan et al.
Historically, steganographic schemes were designed in a way to preserve image statistics or steganalytic features. Since most of the state-of-the-art steganalytic methods employ a machine learning (ML) based classifier, it is reasonable to consider countering steganalysis by trying to fool the ML classifiers. However, simply applying perturbations on stego images as adversarial examples may lead to the failure of data extraction and introduce unexpected artefacts detectable by other classifiers. In this paper, we present a steganographic scheme with a novel operation called adversarial embedding, which achieves the goal of hiding a stego message while at the same time fooling a convolutional neural network (CNN) based steganalyzer. The proposed method works under the conventional framework of distortion minimization. Adversarial embedding is achieved by adjusting the costs of image element modifications according to the gradients backpropagated from the CNN classifier targeted by the attack. Therefore, modification direction has a higher probability to be the same as the sign of the gradient. In this way, the so called adversarial stego images are generated. Experiments demonstrate that the proposed steganographic scheme is secure against the targeted adversary-unaware steganalyzer. In addition, it deteriorates the performance of other adversary-aware steganalyzers opening the way to a new class of modern steganographic schemes capable to overcome powerful CNN-based steganalysis.
MMMar 13, 2018
WISERNet: Wider Separate-then-reunion Network for Steganalysis of Color ImagesJishen Zeng, Shunquan Tan, Guangqing Liu et al.
Until recently, deep steganalyzers in spatial domain have been all designed for gray-scale images. In this paper, we propose WISERNet (the wider separate-then-reunion network) for steganalysis of color images. We provide theoretical rationale to claim that the summation in normal convolution is one sort of "linear collusion attack" which reserves strong correlated patterns while impairs uncorrelated noises. Therefore in the bottom convolutional layer which aims at suppressing correlated image contents, we adopt separate channel-wise convolution without summation instead. Conversely, in the upper convolutional layers we believe that the summation in normal convolution is beneficial. Therefore we adopt united normal convolution in those layers and make them remarkably wider to reinforce the effect of "linear collusion attack". As a result, our proposed wide-and-shallow, separate-then-reunion network structure is specifically suitable for color image steganalysis. We have conducted extensive experiments on color image datasets generated from BOSSBase raw images and another large-scale dataset which contains 100,000 raw images, with different demosaicking algorithms and down-sampling algorithms. The experimental results show that our proposed network outperforms other state-of-the-art color image steganalytic models either hand-crafted or learned using deep networks in the literature by a clear margin. Specifically, it is noted that the detection performance gain is achieved with less than half the complexity compared to the most advanced deep-learning steganalyzer as far as we know, which is scarce in the literature.
CVOct 16, 2017
A multi-branch convolutional neural network for detecting double JPEG compressionBin Li, Hu Luo, Haoxin Zhang et al.
Detection of double JPEG compression is important to forensics analysis. A few methods were proposed based on convolutional neural networks (CNNs). These methods only accept inputs from pre-processed data, such as histogram features and/or decompressed images. In this paper, we present a CNN solution by using raw DCT (discrete cosine transformation) coefficients from JPEG images as input. Considering the DCT sub-band nature in JPEG, a multiple-branch CNN structure has been designed to reveal whether a JPEG format image has been doubly compressed. Comparing with previous methods, the proposed method provides end-to-end detection capability. Extensive experiments have been carried out to demonstrate the effectiveness of the proposed network.
MMNov 10, 2016
Large-scale JPEG steganalysis using hybrid deep-learning frameworkJishen Zeng, Shunquan Tan, Bin Li et al.
Adoption of deep learning in image steganalysis is still in its initial stage. In this paper we propose a generic hybrid deep-learning framework for JPEG steganalysis incorporating the domain knowledge behind rich steganalytic models. Our proposed framework involves two main stages. The first stage is hand-crafted, corresponding to the convolution phase and the quantization & truncation phase of the rich models. The second stage is a compound deep neural network containing multiple deep subnets in which the model parameters are learned in the training procedure. We provided experimental evidences and theoretical reflections to argue that the introduction of threshold quantizers, though disable the gradient-descent-based learning of the bottom convolution phase, is indeed cost-effective. We have conducted extensive experiments on a large-scale dataset extracted from ImageNet. The primary dataset used in our experiments contains 500,000 cover images, while our largest dataset contains five million cover images. Our experiments show that the integration of quantization and truncation into deep-learning steganalyzers do boost the detection performance by a clear margin. Furthermore, we demonstrate that our framework is insensitive to JPEG blocking artifact alterations, and the learned model can be easily transferred to a different attacking target and even a different dataset. These properties are of critical importance in practical applications.
MMMay 29, 2014
JPEG Noises beyond the First Compression CycleBin Li, Tian-Tsong Ng, Xiaolong Li et al.
This paper focuses on the JPEG noises, which include the quantization noise and the rounding noise, during a JPEG compression cycle. The JPEG noises in the first compression cycle have been well studied; however, so far less attention has been paid on the JPEG noises in higher compression cycles. In this work, we present a statistical analysis on JPEG noises beyond the first compression cycle. To our knowledge, this is the first work on this topic. We find that the noise distributions in higher compression cycles are different from those in the first compression cycle, and they are dependent on the quantization parameters used between two successive cycles. To demonstrate the benefits from the statistical analysis, we provide two applications that can employ the derived noise distributions to uncover JPEG compression history with state-of-the-art performance.