CR · CV · LG · Jul 27, 2022

DynaMarks: Defending Against Deep Learning Model Extraction Using Dynamic Watermarking

arXiv:2207.13321v1 · 10 citations · h-index: 15
Originality: Incremental advance
AI Analysis

This paper addresses intellectual property protection for deep learning model owners against model theft in black-box settings. It represents an incremental improvement over existing watermarking methods.

The paper tackles model extraction attacks on deep learning models by proposing DynaMarks, a dynamic watermarking technique that embeds watermarks into surrogate models without altering the original model's training, achieving effective watermarking while preserving model accuracies on datasets like Fashion MNIST, CIFAR-10, and ImageNet.

The functionality of a deep learning (DL) model can be stolen via model extraction, where an attacker obtains a surrogate model by utilizing the responses from a prediction API of the original model. In this work, we propose a novel watermarking technique called DynaMarks to protect the intellectual property (IP) of DL models against such model extraction attacks in a black-box setting. Unlike existing approaches, DynaMarks does not alter the training process of the original model but rather embeds a watermark into a surrogate model by dynamically changing the output responses from the original model's prediction API based on certain secret parameters at inference runtime. The experimental outcomes on the Fashion MNIST, CIFAR-10, and ImageNet datasets demonstrate the efficacy of the DynaMarks scheme in watermarking surrogate models while preserving the accuracies of the original models deployed on edge devices. In addition, we perform experiments to evaluate the robustness of DynaMarks against various watermark removal strategies, thus allowing a DL model owner to reliably prove model ownership.
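
The abstract does not say how the API responses are changed, so the following is only a minimal sketch of the general idea under stated assumptions, not the authors' method. It assumes the perturbation is a temperature applied to the softmax, with the temperature derived per query from a secret key via HMAC. The function names, the key, and the range [1.0, 3.0] are all hypothetical. A positive temperature keeps the order of the class scores, so the predicted label (and hence accuracy) is unchanged while the returned probabilities carry a key-dependent signal.

```python
import hashlib
import hmac
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax with temperature scaling."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def secret_temperature(secret_key, query_bytes, t_min=1.0, t_max=3.0):
    """Derive a per-query temperature in [t_min, t_max) from a secret key.

    Assumption for illustration: HMAC-SHA256 of the query, keyed by the
    owner's secret, mapped to a fraction in [0, 1).
    """
    digest = hmac.new(secret_key, query_bytes, hashlib.sha256).digest()
    fraction = int.from_bytes(digest[:8], "big") / 2**64
    return t_min + fraction * (t_max - t_min)


def watermarked_response(logits, secret_key, query_bytes):
    """Return probabilities whose sharpness depends on the secret key."""
    temperature = secret_temperature(secret_key, query_bytes)
    return softmax(logits, temperature)


# Hypothetical logits for one query to the deployed model.
logits = [2.0, 0.5, -1.0]
clean = softmax(logits)
marked = watermarked_response(logits, b"owner-secret", b"query-0001")
```

In this sketch only the confidence values differ between `clean` and `marked`; the top-ranked class is the same. A surrogate trained on the `marked` responses would fit the perturbed probabilities, which is the kind of trace an owner could later test for.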

