An Exact Poly-Time Membership-Queries Algorithm for Extracting a Three-Layer ReLU Network
This work addresses model extraction attacks by giving polynomial-time algorithms that learn ReLU networks from queries, including the first polynomial-time result for depth-three networks.
The authors study the problem of learning ReLU networks from queries. They present polynomial-time algorithms for depth-two networks and for a rich class of depth-three networks, both under mild general position assumptions. Prior polynomial-time results covered only depth-two networks and required stronger assumptions; none were known for depth three.
We consider the natural problem of learning a ReLU network from queries, which was recently remotivated by model extraction attacks. In this work, we present a polynomial-time algorithm that can learn a depth-two ReLU network from queries under mild general position assumptions. We also present a polynomial-time algorithm that, under mild general position assumptions, can learn a rich class of depth-three ReLU networks from queries. For instance, it can learn most networks where the number of first-layer neurons is smaller than both the dimension and the number of second-layer neurons. These two results substantially improve on the state of the art. Until our work, polynomial-time algorithms for learning from queries were known only for depth-two networks, under the assumption that either the underlying distribution is Gaussian (Chen et al., 2021) or the rows of the weight matrix are linearly independent (Milli et al., 2019). For depth three or more, there were no known polynomial-time results.
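To make the query model concrete, the following is a minimal sketch of one common building block in query-based extraction: locating a breakpoint ("kink") of a ReLU network along a line using only black-box evaluations. It is an illustration under stated assumptions and not necessarily the procedure used in this work. The names `make_network`, `restrict_to_line`, `is_linear_on`, and `find_kink` are hypothetical, the network is a toy depth-two example, and the midpoint test for affinity is a heuristic that assumes no coincidental cancellations between kinks.

```python
def relu(z):
    return z if z > 0.0 else 0.0


def make_network(weights, biases, out_weights):
    """Return a black-box oracle for f(x) = sum_j u_j * relu(<w_j, x> + b_j).

    The learner is only allowed to call the returned function.
    """
    def query(x):
        total = 0.0
        for w, b, u in zip(weights, biases, out_weights):
            pre = sum(wi * xi for wi, xi in zip(w, x)) + b
            total += u * relu(pre)
        return total
    return query


def restrict_to_line(query, x0, d):
    """Return g(t) = query(x0 + t * d), a piecewise-linear function of t."""
    def g(t):
        return query([a + t * b for a, b in zip(x0, d)])
    return g


def is_linear_on(g, lo, hi, tol=1e-9):
    """Heuristic: treat g as affine on [lo, hi] if the midpoint value matches
    the average of the endpoint values (three queries)."""
    mid = 0.5 * (lo + hi)
    return abs(g(mid) - 0.5 * (g(lo) + g(hi))) <= tol


def find_kink(g, lo, hi, iters=60):
    """Binary-search for one breakpoint of g in [lo, hi].

    Returns None if g already looks affine on the whole interval.
    """
    if is_linear_on(g, lo, hi):
        return None
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if not is_linear_on(g, lo, mid):
            hi = mid          # a kink lies in the left half
        elif not is_linear_on(g, mid, hi):
            lo = mid          # a kink lies in the right half
        else:
            return mid        # both halves look affine: the kink is at mid (up to tol)
    return 0.5 * (lo + hi)


# Toy usage: one hidden neuron whose activation boundary crosses the line at t = 0.3.
net = make_network(weights=[[1.0, 0.0]], biases=[-0.3], out_weights=[2.0])
g = restrict_to_line(net, x0=[0.0, 0.0], d=[1.0, 0.0])
kink = find_kink(g, -1.0, 1.0)
```

Each affinity test costs three queries, so locating one kink to a fixed precision takes a number of queries logarithmic in the interval length. At a kink some neuron's pre-activation is zero, which is why such points are a natural starting point for recovering weights.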