ObfuNAS: A Neural Architecture Search-based DNN Obfuscation Approach
ObfuNAS addresses a security problem for DNN owners by mitigating an under-explored vulnerability in architecture obfuscation, although the contribution is incremental because it builds on existing obfuscation and NAS techniques.
The paper tackles the vulnerability of DNN architecture obfuscation to adversarial retraining by proposing ObfuNAS, which uses neural architecture search to find obfuscated architectures that degrade attacker accuracy by up to 2.6% with only 0.14x FLOPs overhead.
Malicious architecture extraction has been emerging as a crucial concern for deep neural network (DNN) security. As a defense, architecture obfuscation has been proposed to remap the victim DNN to a different architecture. Nonetheless, we observe that, by extracting only the obfuscated DNN architecture, the adversary can still retrain a substitute model with high performance (e.g., accuracy), rendering the obfuscation techniques ineffective. To mitigate this under-explored vulnerability, we propose ObfuNAS, which converts DNN architecture obfuscation into a neural architecture search (NAS) problem. Using a combination of function-preserving obfuscation strategies, ObfuNAS ensures that the obfuscated DNN architecture can only achieve lower accuracy than the victim. We validate the performance of ObfuNAS with open-source architecture datasets such as NAS-Bench-101 and NAS-Bench-301. The experimental results demonstrate that ObfuNAS can successfully find the optimal mask for a victim model within a given FLOPs constraint, leading to up to 2.6% inference accuracy degradation for attackers with only 0.14x FLOPs overhead. The code is available at: https://github.com/Tongzhou0101/ObfuNAS.
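To make the search formulation concrete, the following is a minimal sketch of obfuscation posed as a constrained search: among candidate obfuscated architectures, pick the one whose retrained accuracy is lowest while staying within a FLOPs budget. It is not the authors' implementation. The names `Arch`, `flops_overhead`, and `search_obfuscation` are hypothetical, the overhead is assumed to be measured relative to the victim's FLOPs, the candidates and their numbers are toy values, and a plain exhaustive scan stands in for whatever search strategy ObfuNAS actually uses. In practice the retrained accuracy would come from a lookup in a benchmark such as NAS-Bench-101.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Arch:
    """Hypothetical record for one architecture (victim or obfuscated candidate)."""
    name: str
    flops: float        # total FLOPs of the architecture (toy units)
    retrain_acc: float  # accuracy an attacker reaches by retraining this architecture


def flops_overhead(victim: Arch, cand: Arch) -> float:
    """Extra FLOPs of the candidate, as a fraction of the victim's FLOPs (assumed definition)."""
    return (cand.flops - victim.flops) / victim.flops


def search_obfuscation(
    victim: Arch, candidates: Sequence[Arch], max_overhead: float
) -> Optional[Arch]:
    """Return the in-budget candidate with the lowest retrained accuracy, or None."""
    best: Optional[Arch] = None
    for cand in candidates:
        # Enforce the FLOPs constraint.
        if flops_overhead(victim, cand) > max_overhead:
            continue
        # Only keep candidates that actually hurt the attacker.
        if cand.retrain_acc >= victim.retrain_acc:
            continue
        if best is None or cand.retrain_acc < best.retrain_acc:
            best = cand
    return best


# Toy data, invented for illustration only.
victim = Arch("victim", flops=100.0, retrain_acc=0.940)
candidates = [
    Arch("A", flops=110.0, retrain_acc=0.930),
    Arch("B", flops=118.0, retrain_acc=0.915),
    Arch("C", flops=150.0, retrain_acc=0.900),  # lowest accuracy, but over budget
    Arch("D", flops=105.0, retrain_acc=0.945),  # cheap, but does not degrade accuracy
]

# Selects candidate B under these toy numbers.
best = search_obfuscation(victim, candidates, max_overhead=0.20)
```

The two filters mirror the two requirements stated in the abstract: the FLOPs constraint bounds the cost paid by the model owner, and the accuracy check keeps only obfuscations that leave the attacker worse off than with the victim architecture.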