CR AI | Apr 11, 2024

Fragile Model Watermark for integrity protection: leveraging boundary volatility and sensitive sample-pairing

ZhenZhe Gao, Zhenjun Tang, Zhaoxia Yin, Baoyuan Wu, Yue Lu

arXiv:2404.07572v3 | 5.83 citations | h-index: 5 | ICME

Originality: Highly original

AI Analysis

This work addresses the need for integrity protection of deployed neural networks against malicious modifications such as backdooring, offering a more sensitive and efficient watermarking method than prior work.

The paper tackles the problem of protecting neural networks from tampering by proposing a fragile model watermark. It uses sample-pairing to place the model boundary between the two samples of each pair while maximizing logits, so that the Top-1 labels of the sensitive samples change easily when the model is modified.

Neural networks have increasingly influenced people's lives. Ensuring the faithful deployment of neural networks as designed by their model owners is crucial, as they may be susceptible to various malicious or unintentional modifications, such as backdooring and poisoning attacks. Fragile model watermarks aim to prevent unexpected tampering that could lead DNN models to make incorrect decisions. They ensure the detection of any tampering with the model as sensitively as possible. However, prior watermarking methods suffered from inefficient sample generation and insufficient sensitivity, limiting their practical applicability. Our approach employs a sample-pairing technique, placing the model boundaries between pairs of samples, while simultaneously maximizing logits. This ensures that the model's decisions on the sensitive samples change as much as possible and that the Top-1 labels change easily regardless of the direction in which the model boundary moves.
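
The pairing idea in the abstract can be illustrated on a toy model. The sketch below is not the authors' generation procedure: it assumes a two-class linear classifier, and the names `make_pair`, `verify`, `margin`, and all numeric values are hypothetical choices made for illustration. It also omits the logit-maximization term. It only shows why placing a boundary between two paired samples makes at least one Top-1 label flip whichever way the boundary moves.

```python
import math

def top1(weights, bias, x):
    """Top-1 label of a toy two-class linear model with logits [-s, +s]."""
    score = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 0 if -score >= score else 1

def make_pair(weights, bias, anchor, margin):
    """Hypothetical helper: build two samples straddling the boundary.

    The anchor is projected onto the decision boundary, then moved by
    +/- margin along the boundary normal, so the pair sits on opposite sides.
    """
    norm = math.sqrt(sum(w * w for w in weights))
    score = sum(w * a for w, a in zip(weights, anchor)) + bias
    on_boundary = [a - score * w / (norm * norm) for a, w in zip(anchor, weights)]
    unit = [w / norm for w in weights]
    neg = [p - margin * u for p, u in zip(on_boundary, unit)]
    pos = [p + margin * u for p, u in zip(on_boundary, unit)]
    return neg, pos

def verify(weights, bias, pair, expected):
    """Integrity check: the model passes only if every Top-1 label matches."""
    return [top1(weights, bias, x) for x in pair] == expected

# Assumed toy parameters; a real model and real samples would replace these.
weights, bias = [1.0, -2.0], 0.5
pair = make_pair(weights, bias, anchor=[3.0, 1.0], margin=1e-3)
expected = [top1(weights, bias, x) for x in pair]

intact = verify(weights, bias, pair, expected)
shifted_up = verify(weights, bias + 0.01, pair, expected)    # boundary moves one way
shifted_down = verify(weights, bias - 0.01, pair, expected)  # boundary moves the other way
```

In this toy setting a small bias shift in either direction pushes the boundary past one member of the pair, so the recorded labels no longer match and the check fails. A single sample near the boundary would only catch a shift toward it, which is the motivation for pairing.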
