Towards Robust Image-in-Audio Deep Steganography
This work addresses robust signal concealment in multimodal steganography, with applications in secure data hiding. The contribution is incremental, since it builds upon an existing method.
The paper improves the robustness of image-in-audio deep steganography by enhancing an existing method in four ways: modifications to the loss function, use of the STFT, redundancy in the encoding, and buffering in the pixel subconvolution operation. The enhanced method outperforms the existing one in robustness and perceptual transparency.
The field of steganography has experienced a surge of interest due to recent advancements in AI-powered techniques, particularly in multimodal setups that enable the concealment of signals within signals of a different nature. The primary objectives of all steganographic methods are perceptual transparency, robustness, and large embedding capacity, goals that often conflict and that classical methods have struggled to reconcile. This paper extends and enhances an existing image-in-audio deep steganography method by focusing on improving its robustness. The proposed enhancements include modifications to the loss function, utilization of the Short-Time Fourier Transform (STFT), introduction of redundancy in the encoding process for error correction, and buffering of additional information in the pixel subconvolution operation. The results demonstrate that our approach outperforms the existing method in terms of robustness and perceptual transparency.
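To illustrate why redundancy in the encoding can help with error correction, the following is a minimal sketch of a repetition code with majority-vote decoding. The abstract does not specify the redundancy scheme, and the method hides images rather than bit strings, so this is an assumed, simplified stand-in and not the paper's implementation; the function names `encode_repetition` and `decode_majority` are hypothetical.

```python
def encode_repetition(bits, copies=3):
    """Return `copies` independent copies of the payload (assumed layout)."""
    return [list(bits) for _ in range(copies)]


def decode_majority(received):
    """Recover each bit by majority vote across the received copies."""
    n_copies = len(received)
    # zip(*received) groups the same bit position from every copy.
    return [1 if 2 * sum(column) > n_copies else 0 for column in zip(*received)]


payload = [1, 0, 1, 1, 0, 0, 1, 0]
received = encode_repetition(payload, copies=3)

# Simulate channel distortion: flip one bit in each copy, at different positions.
received[0][2] ^= 1
received[1][5] ^= 1
received[2][7] ^= 1

recovered = decode_majority(received)
```

Each position is corrupted in at most one of the three copies, so the majority vote restores the original payload. The cost is embedding capacity: three times as much data is hidden to carry the same payload, which reflects the trade-off between robustness and capacity noted above.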