LG · CR · Jul 16, 2022

Certified Neural Network Watermarks with Randomized Smoothing

arXiv:2207.07972v1 · 67 citations · h-index: 72 · Has Code
Originality: Incremental advance
AI Analysis

This paper addresses the need for reliable intellectual property protection for AI models by offering a watermark with certified robustness to removal attacks, though it builds incrementally on existing randomized smoothing techniques.

The paper tackles the problem of adversaries removing watermarks from deep learning models. It proposes a certifiable watermarking method that is guaranteed to survive unless the model parameters are changed by more than a specific ℓ2 threshold, and it reports better empirical robustness than previous watermarking methods.

Watermarking is a commonly used strategy to protect creators' rights to digital images, videos and audio. Recently, watermarking methods have been extended to deep learning models -- in principle, the watermark should be preserved when an adversary tries to copy the model. However, in practice, watermarks can often be removed by an intelligent adversary. Several papers have proposed watermarking methods that claim to be empirically resistant to different types of removal attacks, but these new techniques often fail in the face of new or better-tuned adversaries. In this paper, we propose a certifiable watermarking method. Using the randomized smoothing technique proposed in Chiang et al., we show that our watermark is guaranteed to be unremovable unless the model parameters are changed by more than a certain l2 threshold. In addition to being certifiable, our watermark is also empirically more robust compared to previous watermarking methods. Our experiments can be reproduced with code at https://github.com/arpitbansal297/Certified_Watermarks
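
To make the certification idea concrete, here is a minimal sketch and not the authors' implementation. It assumes the smoothing is applied in parameter space: Gaussian noise is added to the weights, the accuracy on a watermark "trigger set" is recorded for each noisy copy, and a low percentile of those accuracies is read off. The percentile level Φ(−ε/σ) follows the percentile-smoothing bound of Chiang et al. as we understand it. The toy linear classifier, the trigger set, and every function name below are hypothetical, and the sketch omits the finite-sample confidence correction that a real certificate would need.

```python
import random
from statistics import NormalDist


def trigger_accuracy(weights, trigger_set):
    """Fraction of trigger inputs a toy linear classifier labels as intended."""
    correct = 0
    for features, label in trigger_set:
        score = sum(w * x for w, x in zip(weights, features))
        correct += int((score > 0) == label)
    return correct / len(trigger_set)


def noisy_accuracies(weights, trigger_set, sigma, n_samples, rng):
    """Sorted trigger-set accuracies under Gaussian noise added to the weights."""
    accs = []
    for _ in range(n_samples):
        noisy = [w + rng.gauss(0.0, sigma) for w in weights]
        accs.append(trigger_accuracy(noisy, trigger_set))
    return sorted(accs)


def lower_percentile_accuracy(sorted_accs, sigma, eps):
    """Empirical Phi(-eps/sigma) percentile of the noisy accuracies.

    Under the percentile-smoothing bound (assumed here), the median-smoothed
    accuracy after any parameter change of l2 norm <= eps is at least this
    percentile measured at the original parameters. This is a plain empirical
    estimate with no finite-sample confidence correction.
    """
    p_lower = NormalDist().cdf(-eps / sigma)
    index = min(int(p_lower * len(sorted_accs)), len(sorted_accs) - 1)
    return sorted_accs[index]


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical trigger set: (features, intended label) pairs.
    triggers = [([1.0, 0.0], True), ([0.0, 1.0], False), ([2.0, -1.0], True)]
    weights = [1.0, -1.0]
    accs = noisy_accuracies(weights, triggers, sigma=0.5, n_samples=1000, rng=rng)
    for eps in (0.0, 0.25, 0.5):
        bound = lower_percentile_accuracy(accs, sigma=0.5, eps=eps)
        print(f"eps={eps}: estimated lower bound on trigger accuracy = {bound}")
```

In this sketch, a watermark would count as certified at radius ε when the estimated bound stays above whatever accuracy threshold the owner uses to claim the model. A larger σ widens the radius that can be certified but tends to lower the noisy accuracies themselves.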

Code Implementations: 1 repo
