Addressing Segmentation Ambiguity in Neural Linguistic Steganography
This work addresses a critical issue for secure communication via steganography, but the contribution is incremental: it builds on prior work to fix a specific oversight.
The paper shows that segmentation ambiguity in neural linguistic steganography causes decoding failures at the receiver's side, and it proposes simple tricks to overcome the problem. The tricks apply to any language, including those without explicit word boundaries.
Previous studies on neural linguistic steganography, except Ueoka et al. (2021), overlook the fact that the sender must detokenize cover texts to avoid arousing the eavesdropper's suspicion. In this paper, we demonstrate that segmentation ambiguity indeed causes occasional decoding failures at the receiver's side. With the near-ubiquity of subwords, this problem now affects any language. We propose simple tricks to overcome this problem, which are even applicable to languages without explicit word boundaries.
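To make the failure mode concrete, the following minimal sketch shows how a token sequence can change after the sender detokenizes it and the receiver tokenizes it again. It is an illustration only, not the paper's method: the three-entry vocabulary, the greedy longest-match tokenizer, and the helper names `tokenize`, `detokenize`, and `survives_round_trip` are all assumptions made for this example.

```python
# Hypothetical toy vocabulary; real systems use learned subword vocabularies.
VOCAB = {"un", "able", "unable"}


def detokenize(tokens):
    """Join subword tokens into the surface text the sender transmits."""
    return "".join(tokens)


def tokenize(text, vocab=VOCAB):
    """Greedy longest-match segmentation (an assumed, simplified tokenizer)."""
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate substring first.
        for j in range(len(text), i, -1):
            if text[i:j] in vocab:
                tokens.append(text[i:j])
                i = j
                break
        else:
            raise ValueError(f"cannot segment text at position {i}")
    return tokens


def survives_round_trip(tokens):
    """True if the receiver recovers exactly the tokens the sender generated."""
    return tokenize(detokenize(tokens)) == list(tokens)


# The sender's language model happened to emit two tokens, each of which
# carries part of the hidden message.
sent = ["un", "able"]
# The receiver only sees the detokenized text and segments it again.
received = tokenize(detokenize(sent))

print(sent, "->", repr(detokenize(sent)), "->", received)
print("round trip ok:", survives_round_trip(sent))
```

In this toy setting the sender emits two tokens but the receiver recovers one, so any secret bits tied to the original token choices can no longer be decoded. A round-trip check of this kind is only a way to observe the problem; the source does not say whether the proposed tricks take this form.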