Formalizing Distribution Inference Risks
This work addresses privacy risks of machine learning in sensitive applications; its contribution is incremental in that it builds on an existing framework.
The paper addresses the difficulty of distinguishing property inference attacks from legitimate statistical learning. It proposes a formal definition that covers more than previous attacks, and illustrates it with a new attack that reveals the average node degree of a training graph, supported by experiments.
Property inference attacks reveal statistical properties about a training set but are difficult to distinguish from the primary purpose of statistical machine learning, which is to produce models that capture statistical properties about a distribution. Motivated by Yeom et al.'s membership inference framework, we propose a formal and generic definition of property inference attacks. The proposed notion describes attacks that can distinguish between possible training distributions, extending beyond previous property inference attacks that infer the ratio of a particular type of data in the training data set. In this paper, we show how our definition captures previous property inference attacks as well as a new attack that reveals the average degree of nodes of a training graph. We also report on experiments that give insight into the potential risks of property inference attacks.
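To make the idea of "distinguishing between possible training distributions" concrete, the following is a minimal sketch of a two-distribution guessing game. It is an illustration under stated assumptions, not the paper's formal definition or experimental setup: the two distributions differ only in the ratio of a binary attribute, the "model" is simply the sample mean of the training data, and the adversary thresholds that value at the midpoint. All function names (`sample_dataset`, `train`, `adversary`, `play_game`) are hypothetical.

```python
import random


def sample_dataset(ratio, n, rng):
    """Draw n binary records; each record is 1 with probability `ratio`."""
    return [1 if rng.random() < ratio else 0 for _ in range(n)]


def train(dataset):
    """Toy stand-in for a training algorithm: the 'model' is the sample mean."""
    return sum(dataset) / len(dataset)


def adversary(model, ratio0, ratio1):
    """Guess which distribution produced the model (assumes ratio0 <= ratio1)."""
    midpoint = (ratio0 + ratio1) / 2
    return 1 if model > midpoint else 0


def play_game(ratio0, ratio1, n, trials, seed=0):
    """Repeat the game: pick b at random, train on distribution b, let the adversary guess b."""
    rng = random.Random(seed)
    correct = 0
    for _ in range(trials):
        b = rng.randrange(2)                      # secret choice of distribution
        data = sample_dataset((ratio0, ratio1)[b], n, rng)
        model = train(data)
        if adversary(model, ratio0, ratio1) == b:
            correct += 1
    return correct / trials


accuracy = play_game(0.3, 0.7, n=200, trials=500)
advantage = 2 * accuracy - 1                      # 0 means no better than random guessing
print(f"accuracy={accuracy:.3f}, advantage={advantage:.3f}")
```

In this sketch an accuracy near 0.5 would mean the released model gives the adversary essentially no information about which distribution was used, while an accuracy near 1 means the two candidate distributions are easy to tell apart. Other properties, such as the average node degree of a training graph, would correspond to a different choice of the two candidate distributions rather than a different game.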