On the (In)Feasibility of Attribute Inference Attacks on Machine Learning Models
This work addresses privacy concerns for users of machine learning APIs by clarifying when attribute inference attacks are feasible, though the contribution is incremental because it builds on the existing attack literature.
The paper investigates attribute inference attacks on machine learning models. It shows that such attacks are generally infeasible because membership inference cannot distinguish a training record from a nearby non-member, but that approximate attribute inference can still recover attributes close to the true ones. The results are verified on three publicly available datasets, five membership inference attacks, and three attribute inference attacks.
With an increase in low-cost machine learning APIs, advanced machine learning models may be trained on private datasets and monetized by providing them as a service. However, privacy researchers have demonstrated that these models may leak information about records in the training dataset via membership inference attacks. In this paper, we take a closer look at another inference attack reported in the literature, called attribute inference, whereby an attacker tries to infer missing attributes of a partially known record used in the training dataset by accessing the machine learning model as an API. We show that even if a classification model succumbs to membership inference attacks, it is unlikely to be susceptible to attribute inference attacks. We demonstrate that this is because membership inference attacks fail to distinguish a member from a nearby non-member. We call the ability of an attacker to distinguish two such (similar) vectors strong membership inference. We show that membership inference attacks cannot infer membership in this strong setting, and hence inferring attributes is infeasible. However, under a relaxed notion of attribute inference, called approximate attribute inference, we show that it is possible to infer attributes close to the true attributes. We verify our results on three publicly available datasets, five membership inference attacks, and three attribute inference attacks reported in the literature.
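The attack setting described above can be illustrated with a minimal sketch. It is a generic confidence-based attribute inference loop, not necessarily any of the three attacks evaluated in the paper. The names `infer_missing_attribute`, `toy_model`, and `TRAINING_RECORD` are hypothetical, and the toy model is an assumption made only so that the example runs: it stands in for a black-box API whose confidence peaks on a memorized training record.

```python
import math
from typing import Callable, List, Optional, Sequence

# Hypothetical stand-in for a private training record (assumption for this sketch).
TRAINING_RECORD = (1.0, 0.0, 3.0)


def toy_model(record: Sequence[float]) -> List[float]:
    """Toy black-box API: confidence for class 1 decays with the squared
    distance from the memorized training record."""
    dist_sq = sum((a - b) ** 2 for a, b in zip(record, TRAINING_RECORD))
    p1 = math.exp(-dist_sq)
    return [1.0 - p1, p1]


def infer_missing_attribute(
    query_model: Callable[[Sequence[float]], Sequence[float]],
    partial_record: Sequence[Optional[float]],
    true_label: int,
    candidate_values: Sequence[float],
) -> float:
    """Fill the single missing attribute (marked None) with each candidate,
    query the model, and return the candidate that yields the highest
    confidence on the known label."""
    missing = [i for i, v in enumerate(partial_record) if v is None]
    if len(missing) != 1:
        raise ValueError("exactly one attribute must be missing")
    if not candidate_values:
        raise ValueError("at least one candidate value is required")
    idx = missing[0]

    best_value = candidate_values[0]
    best_score = -math.inf
    for value in candidate_values:
        candidate = list(partial_record)
        candidate[idx] = value
        score = query_model(candidate)[true_label]
        if score > best_score:
            best_value, best_score = value, score
    return best_value


# The attacker knows two attributes and the label, and guesses the second attribute.
guess = infer_missing_attribute(toy_model, [1.0, None, 3.0], 1, [0.0, 1.0, 2.0])
print(guess)
```

In this toy setting the guess is the memorized value, because the candidates are far apart. With closely spaced candidates such as 0.0 and 0.1 the two confidence scores are nearly identical, which mirrors the abstract's argument: when the model's output barely separates a member from a nearby non-member, exact attribute inference fails and only an approximate value can be recovered.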