CR Jul 29, 2021
Subsequent embedding in targeted image steganalysis: Theoretical framework and practical applications
David Megías, Daniel Lerch-Hostalot
Steganalysis is a collection of techniques used to detect whether secret information is embedded in a carrier using steganography. Most existing steganalytic methods are based on machine learning, which typically requires training a classifier with "laboratory" data. However, applying machine-learning classification to a new source of data is challenging, since there is typically a mismatch between the training and the testing sets. In addition, other sources of uncertainty affect the steganalytic process, including the mismatch between the targeted and the true steganographic algorithms, unknown parameters (such as the message length), and even a mixture of several algorithms and parameters, which would constitute a realistic scenario. This paper presents subsequent embedding as a valuable strategy that can be incorporated into modern steganalysis. Although this solution has been applied in previous works, a theoretical basis for this strategy was missing. Here, we cover this research gap by introducing the "directionality" property of features with respect to data embedding. With the strategy supported by a consistent theoretical framework, new practical applications are also described and tested against standard steganography, moving steganalysis closer to real-world conditions.
CR Sep 23, 2019
Detection of Classifier Inconsistencies in Image Steganalysis
Daniel Lerch-Hostalot, David Megías
In this paper, a methodology to detect inconsistencies in classification-based image steganalysis is presented. The proposed approach uses two classifiers: the usual one, trained with a set formed by cover and stego images, and a second classifier trained with the set obtained after embedding additional random messages into the original training set. When the decisions of these two classifiers are not consistent, we know that the prediction is not reliable. The number of inconsistencies in the predictions of a testing set may indicate that the classifier is not performing correctly in the testing scenario. This occurs, for example, in the case of cover source mismatch, or when we are trying to detect a steganographic method that the classifier is not capable of modelling accurately. We also show how the number of inconsistencies can be used to predict the reliability of the classifier (classification errors).
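The consistency check described in this abstract can be illustrated with a minimal sketch. It assumes that the decisions of both classifiers have already been mapped to a common 0/1 label per test image; the abstract does not specify that mapping, and the function name `inconsistency_report` is hypothetical. It is not the authors' implementation.

```python
def inconsistency_report(pred_a, pred_b):
    """Flag test images on which two classifiers disagree.

    pred_a, pred_b: sequences of 0/1 decisions, one per test image,
    already mapped to a common label (an assumption of this sketch).
    Returns (flags, rate): per-image disagreement flags and the
    fraction of images on which the classifiers disagree.
    """
    if len(pred_a) != len(pred_b):
        raise ValueError("both classifiers must label the same test set")
    flags = [a != b for a, b in zip(pred_a, pred_b)]
    rate = sum(flags) / len(flags) if flags else 0.0
    return flags, rate
```

A high disagreement rate on a testing set would then be read, as the abstract suggests, as a sign that the classifier may not be reliable in that scenario.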
MM Mar 2, 2017
LSB Matching Steganalysis Based on Patterns of Pixel Differences and Random Embedding
Daniel Lerch-Hostalot, David Megías
This paper presents a novel method for detection of LSB matching steganography in grayscale images. This method is based on the analysis of the differences between neighboring pixels before and after random data embedding. In natural images, there is a strong correlation between adjacent pixels. This correlation is disturbed by LSB matching, generating new types of correlations. The presented method generates patterns from these correlations and analyzes their variation when random data are hidden. The experiments performed for two different image databases show that the method yields better classification accuracy compared to prior art for both LSB matching and HUGO steganography. In addition, although the method is designed for the spatial domain, some experiments show its applicability also for detecting JPEG steganography.
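For readers unfamiliar with LSB matching, the embedding operation this abstract targets can be sketched as follows. This is a generic illustration of the well-known ±1 embedding rule on a flat list of 8-bit pixel values, together with a simple neighbor-difference helper. The function names are hypothetical, and the sketch does not reproduce the paper's pattern-based detector.

```python
import random


def lsb_matching_embed(pixels, bits, rng=None):
    """Embed one message bit per pixel using LSB matching (+/-1 embedding).

    pixels: list of ints in [0, 255]; bits: list of 0/1 values.
    If a pixel's least significant bit already equals the message bit,
    the pixel is left unchanged; otherwise it is randomly incremented
    or decremented by 1, staying inside the valid range.
    """
    rng = rng or random.Random()
    if len(bits) > len(pixels):
        raise ValueError("message is longer than the cover")
    stego = list(pixels)
    for i, bit in enumerate(bits):
        if stego[i] & 1 == bit:
            continue  # LSB already carries the message bit
        if stego[i] == 0:
            stego[i] = 1  # cannot decrement below 0
        elif stego[i] == 255:
            stego[i] = 254  # cannot increment above 255
        else:
            stego[i] += rng.choice((-1, 1))
    return stego


def neighbor_differences(pixels):
    """Differences between each pixel and its right-hand neighbor."""
    return [b - a for a, b in zip(pixels, pixels[1:])]
```

Comparing `neighbor_differences` before and after a call to `lsb_matching_embed` with random bits gives a rough feel for how embedding perturbs the correlation between adjacent pixels.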
MM Mar 2, 2017
Unsupervised Steganalysis Based on Artificial Training Sets
Daniel Lerch-Hostalot, David Megías
In this paper, an unsupervised steganalysis method that combines artificial training sets and supervised classification is proposed. We provide a formal framework for unsupervised classification of stego and cover images in the typical situation of targeted steganalysis (i.e., for a known algorithm and approximate embedding bit rate). We also present a complete set of experiments using 1) eight different image databases, 2) image features based on Rich Models, and 3) three different embedding algorithms: Least Significant Bit (LSB) matching, Highly undetectable steganography (HUGO) and Wavelet Obtained Weights (WOW). We show that the experimental results outperform previous methods based on Rich Models in the majority of the tested cases. At the same time, the proposed approach bypasses the problem of Cover Source Mismatch (when the embedding algorithm and bit rate are known), since it removes the need of a training database when we have a large enough testing set. Furthermore, we provide a generic proof of the proposed framework in the machine learning context. Hence, the results of this paper could be extended to other classification problems similar to steganalysis.