An alternative proof of the vulnerability of retrieval in high intrinsic dimensionality neighborhood
The work addresses security concerns relevant to practitioners in data analysis and machine learning, but the contribution is incremental: it provides an alternative proof and an experimental validation of an existing vulnerability model.
The paper tackles the vulnerability of nearest neighbor search to adversarial perturbations by deriving a statistical model for the amount of perturbation needed to alter neighbor ranks, and validates it experimentally on six large-scale datasets with explanations for outliers.
This paper investigates the vulnerability of nearest neighbor search, which is a pivotal tool in data analysis and machine learning. The vulnerability is gauged as the relative amount of perturbation that an attacker needs to add to a dataset point in order to modify its neighbor rank w.r.t. a query. The statistical distribution of this quantity is derived from simple assumptions. Experiments on six large-scale datasets validate this model up to some outliers, which are explained in terms of violations of the assumptions.
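
To make the quantity in the abstract concrete, here is a minimal sketch of one possible reading of it. It is not taken from the paper: it assumes Euclidean distance, an attacker who moves the point straight toward the query, and a perturbation normalized by the point's original distance to the query. The function name `relative_perturbation` and its parameters are hypothetical.

```python
import math


def relative_perturbation(query, dataset, index, target_rank):
    """Relative radial move needed for dataset[index] to reach target_rank.

    Assumptions (not from the paper): Euclidean distance, the point is moved
    straight toward the query, and the move is normalized by the point's
    original distance to the query. Ranks are 1-based.
    """
    d_point = math.dist(query, dataset[index])
    # Distances from the query to every other dataset point, ascending.
    others = sorted(
        math.dist(query, x) for i, x in enumerate(dataset) if i != index
    )
    # To hold rank `target_rank`, the point must come at least as close as
    # the `target_rank`-th nearest of the other points.
    d_target = others[target_rank - 1]
    if d_point <= d_target:
        return 0.0  # the point already has this rank or a better one
    # The shortest move onto the ball of radius d_target is along the radius.
    return (d_point - d_target) / d_point


query = (0.0, 0.0)
dataset = [(1.0, 0.0), (2.0, 0.0), (4.0, 0.0)]
print(relative_perturbation(query, dataset, index=2, target_rank=1))  # 0.75
print(relative_perturbation(query, dataset, index=2, target_rank=2))  # 0.5
```

In this toy example the farthest point must shrink its distance to the query by 75% to become the nearest neighbor, and by 50% to reach rank 2. The value is the boundary case where the two distances tie, so a real attack would need marginally more.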