TS-RNN: Text Steganalysis Based on Recurrent Neural Networks
This paper addresses text steganalysis for security applications. It presents an incremental improvement by applying recurrent neural networks (RNNs) to a known bottleneck: detecting automatically generated steganographic texts.
The paper tackles the problem of detecting hidden information in automatically generated steganographic texts. It observes that embedding hidden information distorts the conditional probability distribution of words, and it uses Recurrent Neural Networks (RNNs) to classify texts as cover or stego with high detection accuracy and to estimate the amount of hidden information.
With the rapid development of natural language processing technologies, a growing number of text steganographic methods based on automatic text generation have appeared in recent years. These models use the self-learning and feature extraction abilities of neural networks to learn feature representations from massive amounts of normal text. Based on the learned statistical patterns, they can then automatically generate dense steganographic texts that conform to the same statistical distribution. In this paper, we observe that the conditional probability distribution of each word in automatically generated steganographic texts is distorted once hidden information is embedded. We use Recurrent Neural Networks (RNNs) to extract these differences in feature distribution and then use them to classify texts into cover text and stego text categories. Experimental results show that the proposed model achieves high detection accuracy. In addition, the proposed model can use subtle differences in the feature distribution of texts to estimate the amount of hidden information embedded in the generated steganographic text.
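The observation about distorted conditional probabilities can be illustrated with a small sketch. It is not the paper's model. It assumes a hypothetical four-word toy language model (`VOCAB`, `BASE_PROBS`, `next_word_probs`) in place of a trained neural generator, a fixed-length coding scheme in which k secret bits select one of the 2^k most probable candidate words, and a hand-written statistic (the mean log conditional probability) in place of features learned by an RNN. All names and numbers are illustrative assumptions.

```python
import math
import random

# Hypothetical toy language model (an assumption for illustration only).
# A real generator would be a trained neural language model.
VOCAB = ["the", "cat", "sat", "down"]
BASE_PROBS = [0.70, 0.20, 0.07, 0.03]  # skewed distribution, sums to 1.0


def next_word_probs(prev_word):
    """Return [(word, prob), ...] for the next word, most probable first."""
    shift = VOCAB.index(prev_word) + 1
    ranked = VOCAB[shift:] + VOCAB[:shift]  # ranking depends on the context
    return list(zip(ranked, BASE_PROBS))


def generate(n_words, bits_per_word, rng):
    """Generate n_words and return the mean log conditional probability.

    bits_per_word == 0: cover text, sampled from the model distribution.
    bits_per_word == k > 0: stego text, where k random secret bits pick one
    of the 2**k most probable candidates (assumed fixed-length coding).
    """
    if 2 ** bits_per_word > len(VOCAB):
        raise ValueError("candidate pool is larger than the vocabulary")
    prev = VOCAB[0]
    total_log_prob = 0.0
    for _ in range(n_words):
        candidates = next_word_probs(prev)
        if bits_per_word == 0:
            weights = [prob for _, prob in candidates]
            idx = rng.choices(range(len(candidates)), weights=weights)[0]
        else:
            # Random secret bits make the choice uniform over the pool,
            # which is what distorts the conditional distribution.
            idx = rng.getrandbits(bits_per_word)
        word, prob = candidates[idx]
        total_log_prob += math.log(prob)
        prev = word
    return total_log_prob / n_words


rng = random.Random(0)
cover_score = generate(5000, 0, rng)
stego1_score = generate(5000, 1, rng)
stego2_score = generate(5000, 2, rng)
print("cover :", round(cover_score, 3))
print("1 bit :", round(stego1_score, 3))
print("2 bits:", round(stego2_score, 3))
```

For this toy distribution the expected mean log probability is roughly -0.86 nats for cover text, -0.98 for one bit per word, and -2.03 for two bits per word, so the statistic drops as the embedding rate grows. This is the kind of signal that could support both cover/stego classification and an estimate of the embedded amount; the paper itself relies on an RNN to learn such differences rather than on a fixed statistic.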