CR AI Jun 7, 2024

A Survey of Fragile Model Watermarking

arXiv:2406.04809v5 3 citations
Originality: Synthesis-oriented
AI Analysis

The paper provides a comprehensive overview for researchers working on detecting model tampering, but as a survey its contribution is incremental.

This paper surveys the field of model fragile watermarking, which detects tampering like backdoors or poisoning in neural networks to mitigate risks such as misidentification in autonomous driving, and categorizes existing work to outline the field's development.

Model fragile watermarking, inspired by both adversarial attacks on neural networks and traditional multimedia fragile watermarking, has emerged as a potent tool for detecting tampering and has developed rapidly in recent years. Unlike robust watermarks, which are widely used to establish model copyright, fragile watermarks are designed to reveal whether a model has been subjected to unexpected alterations such as backdoors, poisoning, or compression. These alterations can pose unknown risks to model users, such as misidentifying stop signs as speed limit signs in the classic autonomous driving scenario. This paper provides an overview of the work on model fragile watermarking since the field's inception, categorizing that work and tracing the field's developmental trajectory, and thus offers a comprehensive survey for future research on model fragile watermarking.

