Yuchun Qian

h-index: 5
Papers: 2

2 Papers

LG · Apr 6, 2023 · Code
GIF: A General Graph Unlearning Strategy via Influence Function

Jiancan Wu, Yi Yang, Yuchun Qian et al.

With growing emphasis on privacy and security, the problem of graph unlearning, that is, revoking the influence of specific data on a trained GNN model, is drawing increasing attention. However, from machine unlearning to the recently emerged graph unlearning methods, existing efforts either resort to the retraining paradigm or perform approximate erasure that fails to consider the inter-dependency between connected neighbors or imposes constraints on the GNN structure, and they therefore struggle to achieve satisfying performance-complexity trade-offs. In this work, we explore an influence function tailored for graph unlearning, so as to improve both unlearning efficacy and efficiency. We first present a unified problem formulation of diverse graph unlearning tasks with respect to node, edge, and feature. Then, we identify why the traditional influence function fails for graph unlearning, and devise Graph Influence Function (GIF), a model-agnostic unlearning method that can efficiently and accurately estimate parameter changes in response to an $\epsilon$-mass perturbation in deleted data. The idea is to supplement the objective of the traditional influence function with an additional loss term of the influenced neighbors due to the structural dependency. Further deductions on the closed-form solution of parameter changes provide a better understanding of the unlearning mechanism. We conduct extensive experiments on four representative GNN models and three benchmark datasets to justify the superiority of GIF for diverse graph unlearning tasks in terms of unlearning efficacy, model utility, and unlearning efficiency. Our implementations are available at https://github.com/wujcan/GIF-torch/.
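
To make the "traditional influence function" mentioned in the abstract concrete, here is a minimal sketch on a toy problem. It is not GIF and contains no graph or neighbor term. It only shows the classical one-step estimate of how a fitted parameter moves when one training sample is deleted, compared against retraining from scratch. The model (a least-squares slope through the origin), the synthetic data, and the function names `fit` and `influence_removal` are all assumptions made for illustration.

```python
import random

def fit(xs, ys):
    """Least-squares slope for y ~ theta * x (no intercept)."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)

def influence_removal(xs, ys, theta, k):
    """One-step influence estimate of the slope after deleting sample k.

    Uses the Hessian of the full summed loss 0.5 * (theta * x - y)^2,
    so it is an approximation, not an exact retrain.
    """
    hessian = sum(x * x for x in xs)            # second derivative of summed loss
    grad_k = xs[k] * (theta * xs[k] - ys[k])    # gradient of sample k's own loss
    return theta + grad_k / hessian

# Hypothetical synthetic data: true slope 2.0 plus Gaussian noise.
random.seed(0)
xs = [random.uniform(-1, 1) for _ in range(200)]
ys = [2.0 * x + random.gauss(0, 0.1) for x in xs]

theta = fit(xs, ys)                  # model trained on all samples
k = 17                               # index of the sample to "unlearn"
theta_if = influence_removal(xs, ys, theta, k)
theta_retrain = fit(xs[:k] + xs[k + 1:], ys[:k] + ys[k + 1:])

print("gap of original model to retrain :", abs(theta - theta_retrain))
print("gap of influence estimate        :", abs(theta_if - theta_retrain))
```

In this toy setting the samples are independent, so deleting one sample removes exactly one loss term. On a graph, deleting a node or edge also changes the predictions of its neighbors, which is the dependency the abstract says the traditional influence function does not account for.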

ML · May 5, 2025
Resolving Memorization in Empirical Diffusion Model for Manifold Data in High-Dimensional Spaces

Yang Lyu, Tan Minh Nguyen, Yuchun Qian et al.

Diffusion models are popular tools for generating new data samples, using a forward process that adds noise to data and a reverse process to denoise and produce samples. However, when the data distribution consists of n points, empirical diffusion models tend to reproduce existing data points, a phenomenon known as the memorization effect. Current literature often addresses this with complex machine learning techniques. This work shows that the memorization issue can be solved simply by applying an inertia update at the end of the empirical diffusion simulation. Our inertial diffusion model requires only the empirical score function and no additional training. We demonstrate that the distribution of samples from this model approximates the true data distribution on a $C^2$ manifold of dimension $d$, within a Wasserstein-1 distance of order $O(n^{-\frac{2}{d+4}})$. This bound significantly shrinks the Wasserstein distance between the population and empirical distributions, confirming that the inertial diffusion model produces new and diverse samples. Remarkably, this estimate is independent of the ambient space dimension, as no further training is needed. Our analysis shows that the inertial diffusion samples resemble Gaussian kernel density estimations on the manifold, revealing a novel connection between diffusion models and manifold learning.
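
As a rough numerical reading of the stated rate, the sketch below evaluates $n^{-\frac{2}{d+4}}$ for a few intrinsic dimensions $d$, ignoring constants. For comparison it also evaluates $n^{-1/d}$, the standard order of the Wasserstein-1 distance between an empirical measure and a smooth $d$-dimensional distribution for $d \ge 3$. That comparison rate is an assumption brought in here and is not stated in the abstract. The function names are hypothetical.

```python
def inertial_rate(n, d):
    """Order of the W1 bound quoted in the abstract, constants ignored."""
    return n ** (-2.0 / (d + 4))

def empirical_rate(n, d):
    """Assumed classical order of W1(empirical, population) for d >= 3,
    constants ignored. Not taken from the abstract."""
    return n ** (-1.0 / d)

n = 10 ** 6
for d in (3, 5, 10, 20):
    print(f"d={d:2d}  inertial={inertial_rate(n, d):.4f}  "
          f"empirical={empirical_rate(n, d):.4f}")
```

Neither expression involves the ambient dimension, only the manifold dimension $d$. Comparing the exponents, $\frac{2}{d+4} > \frac{1}{d}$ exactly when $d > 4$, so under this assumed baseline the quoted bound is the faster rate only for manifolds of intrinsic dimension above four.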