RIGA: Covert and Robust White-Box Watermarking of Deep Neural Networks
RIGA addresses the need for data owners to trace released DNNs securely without compromising model performance; it is a domain-specific advance in watermarking techniques.
The paper tackles the problem of tracing deep neural networks after release by proposing RIGA, a white-box watermarking algorithm that uses adversarial training. According to the authors' experiments, RIGA does not impact model accuracy and significantly improves covertness and robustness over state-of-the-art methods.
Watermarking of deep neural networks (DNNs) can enable their tracing once released by a data owner. In this paper, we generalize white-box watermarking algorithms for DNNs, where the data owner needs white-box access to the model to extract the watermark. White-box watermarking algorithms have the advantage that they do not impact the accuracy of the watermarked model. We propose Robust whIte-box GAn watermarking (RIGA), a novel white-box watermarking algorithm that uses adversarial training. Our extensive experiments demonstrate that the proposed watermarking algorithm does not impact accuracy and significantly improves covertness and robustness over the current state of the art.
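To make "white-box access to the model to extract the watermark" concrete, the following is a minimal sketch of a generic white-box scheme, not RIGA itself. It assumes a simple linear extractor: a secret key matrix projects the model weights, and the sign of each projection encodes one message bit. The adversarial training that RIGA adds is omitted, and all names (`make_key`, `embed`, `extract`) and hyperparameters are hypothetical choices for illustration only.

```python
import math
import random


def make_key(n_bits, n_weights, seed):
    """Secret key: one random Gaussian row per message bit (assumed design)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(n_weights)] for _ in range(n_bits)]


def project(key, weights):
    """Linear projection of the weights; requires direct access to the weights."""
    return [sum(k * w for k, w in zip(row, weights)) for row in key]


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def embed(weights, key, message, lr=0.02, steps=300):
    """Gradient descent on the binary cross-entropy between sigmoid(key @ w)
    and the message bits. A real scheme would add this term to the task loss
    during training; here it is minimized on its own for brevity."""
    w = list(weights)
    for _ in range(steps):
        errors = [sigmoid(z) - m for z, m in zip(project(key, w), message)]
        for j in range(len(w)):
            grad = sum(e * row[j] for e, row in zip(errors, key))
            w[j] -= lr * grad
    return w


def extract(weights, key):
    """Read the message back by thresholding the projection at zero."""
    return [1 if z > 0 else 0 for z in project(key, weights)]


# Toy stand-in for one layer's flattened weights (illustrative values only).
rng = random.Random(0)
original = [rng.gauss(0.0, 0.1) for _ in range(64)]
message = [1, 0, 1, 1, 0, 0, 1, 0]
key = make_key(len(message), len(original), seed=42)

marked = embed(original, key, message)
recovered = extract(marked, key)
print(recovered == message)
```

Extraction here needs both the secret key and the raw weights, which is what distinguishes the white-box setting from black-box schemes that only query model outputs.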