Frustratingly Easy Edit-based Linguistic Steganography with a Masked Language Model
This work addresses secure and efficient text steganography for covert communication. It is an incremental contribution that adapts an existing model to a known bottleneck.
The paper tackles the challenge of generating genuine-looking texts in linguistic steganography by revisiting edit-based approaches with a masked language model. The resulting method has a high payload capacity for an edit-based model, is more secure against automatic detection than a generation-based method, and offers better control over the security/payload capacity trade-off.
With advances in neural language models, the focus of linguistic steganography has shifted from edit-based approaches to generation-based ones. While the latter's payload capacity is impressive, generating genuine-looking texts remains challenging. In this paper, we revisit edit-based linguistic steganography, with the idea that a masked language model offers an off-the-shelf solution. The proposed method eliminates painstaking rule construction and has a high payload capacity for an edit-based model. It is also shown to be more secure against automatic detection than a generation-based method while offering better control of the security/payload capacity trade-off.
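The general idea of hiding bits by editing masked positions can be illustrated with a minimal sketch. It is not the authors' implementation: `toy_mlm`, `TOY_CANDIDATES`, `encode`, `decode`, the fixed `bits_per_slot`, and the choice of mask positions are all hypothetical, and the toy lookup table stands in for a real masked language model. The sketch assumes sender and receiver share the model and the mask positions, and that the model ranks candidates deterministically from the unmasked context only.

```python
from typing import Callable, List

MASK = "[MASK]"

# Hypothetical stand-in for a masked language model: ranked fillers for a
# masked slot, chosen here by looking only at the word to its left.
TOY_CANDIDATES = {
    "the": ["quick", "small", "young", "old"],
    "dog": ["jumped", "walked", "ran", "looked"],
}

def toy_mlm(context: List[str], pos: int) -> List[str]:
    """Return ranked candidates for context[pos], which must be masked."""
    if pos == 0:
        return []
    return TOY_CANDIDATES.get(context[pos - 1], [])

def mask_all(tokens: List[str], positions: List[int]) -> List[str]:
    """Hide every editable slot so both parties see the same context."""
    hidden = set(positions)
    return [MASK if i in hidden else t for i, t in enumerate(tokens)]

def encode(cover: List[str], bits: str, positions: List[int],
           mlm: Callable[[List[str], int], List[str]],
           bits_per_slot: int = 2) -> List[str]:
    """Replace each masked slot with the candidate indexed by the next bits."""
    context = mask_all(cover, positions)
    stego, cursor, size = list(cover), 0, 2 ** bits_per_slot
    for pos in positions:
        cands = mlm(context, pos)[:size]
        if len(cands) < size:
            continue  # too few candidates: leave this slot unedited
        chunk = bits[cursor:cursor + bits_per_slot].ljust(bits_per_slot, "0")
        stego[pos] = cands[int(chunk, 2)]
        cursor += bits_per_slot
    return stego

def decode(stego: List[str], positions: List[int],
           mlm: Callable[[List[str], int], List[str]],
           bits_per_slot: int = 2) -> str:
    """Recompute the candidate lists and read back each chosen index."""
    context = mask_all(stego, positions)
    out, size = [], 2 ** bits_per_slot
    for pos in positions:
        cands = mlm(context, pos)[:size]
        if len(cands) < size:
            continue  # the encoder skipped this slot as well
        out.append(format(cands.index(stego[pos]), f"0{bits_per_slot}b"))
    return "".join(out)

cover = "the big dog sat down".split()
stego = encode(cover, "1001", [1, 3], toy_mlm)
recovered = decode(stego, [1, 3], toy_mlm)
```

In this sketch, raising `bits_per_slot` or masking more positions increases the payload but forces lower-ranked substitutions into the text, which is one simple way to picture the security/payload capacity trade-off mentioned above.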