Understanding Deep Gradient Leakage via Inversion Influence Functions
This work addresses privacy vulnerabilities in distributed learning with clients that hold sensitive data. It offers a tool for analyzing, and potentially defending against, gradient leakage attacks, though its contribution is incremental: it provides insights rather than a direct defense.
The paper tackles Deep Gradient Leakage (DGL), an attack that recovers private training images from gradients in distributed learning. It proposes the Inversion Influence Function (I$^2$F), which establishes a closed-form connection between the recovered images and the gradients, and shows empirically that I$^2$F approximates DGL well across a variety of settings.
Deep Gradient Leakage (DGL) is a highly effective attack that recovers private training images from gradient vectors. This attack poses significant privacy challenges for distributed learning from clients with sensitive data, where clients are required to share gradients. Defending against such attacks requires an understanding of when and how privacy leakage happens, which is currently lacking, mostly because of the black-box nature of deep networks. In this paper, we propose a novel Inversion Influence Function (I$^2$F) that establishes a closed-form connection between the recovered images and the private gradients by implicitly solving the DGL problem. Compared to directly solving DGL, I$^2$F is scalable for analyzing deep networks, requiring only oracle access to gradients and Jacobian-vector products. We empirically demonstrate that I$^2$F effectively approximates DGL across different model architectures, datasets, modalities, attack implementations, and perturbation-based defenses. With this novel tool, we provide insights into effective gradient perturbation directions, the unfairness of privacy protection, and privacy-preferred model initialization. Our code is available at https://github.com/illidanlab/inversion-influence-function.
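To make the closed-form connection concrete, the following is a minimal sketch on a toy problem, not the paper's implementation. It assumes DGL is posed as gradient matching, $\min_x \|\nabla_\theta L(x,\theta) - g\|^2$, and that the shared gradient is $g = g_0 + \delta$, where $g_0$ is the gradient at the private input $x_0$. Linearizing the first-order optimality condition around $x_0$ gives the estimate $x - x_0 \approx (JJ^\top)^{-1} J \delta$ with $J = \partial_x \nabla_\theta L(x_0,\theta)$; the paper's exact form (for example, any damping term) may differ. The linear-regression model, the function names `param_grad`, `mixed_jacobian`, and `influence`, and all numeric values are illustrative assumptions. The perturbation is constructed so that a DGL minimizer is known to be $x_0 + v$, which lets the estimate be checked without running an optimizer.

```python
# Toy model (assumption for illustration): linear regression with bias,
# L(x; w, b) = 0.5 * (w.x + b - y)^2, with a private input x in R^2.

def param_grad(x, w, b, y):
    """Gradient of the loss w.r.t. (w1, w2, b): what a client would share."""
    r = w[0] * x[0] + w[1] * x[1] + b - y  # residual
    return [r * x[0], r * x[1], r]

def mixed_jacobian(x, w, b, y):
    """J[i][j] = d(param_grad[j]) / d(x[i]), a 2x3 matrix."""
    r = w[0] * x[0] + w[1] * x[1] + b - y
    J = []
    for i in range(2):
        row = [w[i] * x[j] + (r if i == j else 0.0) for j in range(2)]
        row.append(w[i])  # derivative of the residual w.r.t. x[i]
        J.append(row)
    return J

def influence(J, delta):
    """First-order estimate (J J^T)^{-1} J delta of the shift in the recovered input."""
    Jd = [sum(J[i][k] * delta[k] for k in range(3)) for i in range(2)]
    # Entries of the symmetric 2x2 matrix J J^T, inverted in closed form.
    a = sum(v * v for v in J[0])
    c = sum(J[0][k] * J[1][k] for k in range(3))
    d = sum(v * v for v in J[1])
    det = a * d - c * c
    return [(d * Jd[0] - c * Jd[1]) / det, (a * Jd[1] - c * Jd[0]) / det]

# Hypothetical parameters, label, and private input.
w, b, y = [0.5, -0.3], 0.1, 1.0
x0 = [1.0, 2.0]

# Build a gradient perturbation whose gradient-matching minimizer is known:
# g0 + delta is exactly the gradient at x0 + v, so DGL can recover x0 + v.
v = [1e-3, -2e-3]
g0 = param_grad(x0, w, b, y)
g1 = param_grad([x0[0] + v[0], x0[1] + v[1]], w, b, y)
delta = [g1[k] - g0[k] for k in range(3)]

J = mixed_jacobian(x0, w, b, y)
est = influence(J, delta)  # predicted shift of the recovered input
err = max(abs(est[i] - v[i]) for i in range(2))

print("true shift     :", v)
print("estimated shift:", est)
print("max abs error  :", err)
```

In this toy case $J$ is formed explicitly; for deep networks the product $J\delta$ would instead be obtained from Jacobian-vector products. A perturbation $\delta$ with $J\delta = 0$ yields a zero estimate, meaning that to first order it does not move the recovered input.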