MM · Oct 29, 2018

Feature Bagging for Steganographer Identification

arXiv:1810.11973v12.31 citations

Originality Synthesis-oriented

AI Analysis

This paper addresses the challenge of identifying guilty users in multi-user steganography scenarios, though it appears incremental because it applies existing bagging techniques to a specific domain.

The paper tackles the steganographer identification problem (SIP) by proposing a feature bagging approach to improve detection accuracy in high-dimensional spaces, and reports significantly improved accuracy in most cases on a new dataset of 5108 images.

Traditional steganalysis algorithms focus on detecting the existence of steganography in a single object. In practice, one may face a more complex scenario in which one or more of multiple users (also called actors) are guilty of using steganography; this is defined as the steganographer identification problem (SIP). It requires steganalysis experts to design effective and robust detection algorithms to identify the guilty actor(s). Mainstream works use clustering, ensemble and anomaly detection methods, in which distances between the actors' features in high-dimensional space are computed to find the outlier(s) corresponding to the steganographer(s). However, in high-dimensional space, feature points can be so sparse that the distances between them become nearly equal, which hinders detection. Moreover, it is well known in machine learning that techniques for combining models, such as boosting and bagging, can be effective in improving detection performance. This motivates us to present a feature bagging approach to SIP. The proposed work merges the results of multiple detection sub-models, each of which operates on a feature space randomly sampled from the raw full-dimensional space. We create a new dataset called ImgNetEase, comprising 5108 images downloaded from a social website, to mimic the real-world scenario. We extract PEV-274 features from the images and use nsF5 as the steganographic algorithm for evaluation. Experiments show that our work improves detection accuracy significantly on the created dataset in most cases, demonstrating its superiority and applicability.
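The merging of sub-models over randomly sampled feature subspaces can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each actor is summarized by a single feature vector, swaps in a simple k-nearest-neighbor Euclidean distance as the per-subspace outlier score, and combines rounds by summing standardized scores. The function names, the parameters `n_rounds` and `k`, the subspace-size range, and the synthetic data are all hypothetical choices made for illustration.

```python
import numpy as np


def knn_outlier_scores(points, k):
    """Score each row by its mean Euclidean distance to its k nearest other rows."""
    diffs = points[:, None, :] - points[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=-1))
    np.fill_diagonal(dists, np.inf)  # exclude each actor's distance to itself
    nearest = np.sort(dists, axis=1)[:, :k]
    return nearest.mean(axis=1)


def feature_bagging_scores(actor_features, n_rounds=50, k=3, seed=0):
    """Sum standardized outlier scores over random feature subspaces.

    actor_features: array of shape (n_actors, n_dims), one summary vector
    per actor (an assumption of this sketch, not taken from the paper).
    """
    rng = np.random.default_rng(seed)
    n_actors, n_dims = actor_features.shape
    total = np.zeros(n_actors)
    for _ in range(n_rounds):
        # Assumed choice: subspace size drawn from [n_dims // 2, n_dims - 1].
        size = rng.integers(n_dims // 2, n_dims)
        dims = rng.choice(n_dims, size=size, replace=False)
        scores = knn_outlier_scores(actor_features[:, dims], k)
        # Standardize so rounds with different subspace sizes are comparable.
        total += (scores - scores.mean()) / (scores.std() + 1e-12)
    return total


# Synthetic stand-in for 20 actors with 274-dimensional features;
# actor 7 is shifted to play the role of the steganographer.
rng = np.random.default_rng(1)
actors = rng.normal(size=(20, 274))
actors[7] += 1.0
scores = feature_bagging_scores(actors)
suspect = int(np.argmax(scores))
print(suspect)
```

The actor with the largest combined score is flagged as the most suspicious. In a real setting the per-actor vectors would come from steganalysis features such as PEV-274, and the distance and outlier measures would follow whatever the chosen detector prescribes.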
