On Hiding Neural Networks Inside Neural Networks
This paper introduces a novel steganographic technique that adversaries could exploit, posing a security risk to AI systems.
The paper shows how to embed secret machine learning models within trained neural networks by exploiting their excess capacity. It proves theoretically that detecting the secret network is computationally infeasible and demonstrates empirically that the carrier network does not compromise the secret network's disguise.
Modern neural networks often contain significantly more parameters than the size of their training data. We show that this excess capacity provides an opportunity for embedding secret machine learning models within a trained neural network. Our novel framework hides the existence of a secret neural network with arbitrary desired functionality within a carrier network. We prove theoretically that the secret network's detection is computationally infeasible and demonstrate empirically that the carrier network does not compromise the secret network's disguise. Our paper introduces a previously unknown steganographic technique that can be exploited by adversaries if left unchecked.