Machine Unlearning: Solutions and Challenges
The paper addresses privacy and security risks in machine learning and is relevant to both researchers and practitioners, but its contribution is incremental: it reviews existing methods rather than introducing new ones.
This paper tackles the problem of machine learning models memorizing sensitive data by providing a comprehensive taxonomy and analysis of machine unlearning solutions, categorizing them into exact and approximate approaches to remove data influence.
Machine learning models may inadvertently memorize sensitive, unauthorized, or malicious data, posing risks of privacy breaches, security vulnerabilities, and performance degradation. To address these issues, machine unlearning has emerged as a critical technique to selectively remove specific training data points' influence on trained models. This paper provides a comprehensive taxonomy and analysis of the solutions in machine unlearning. We categorize existing solutions into exact unlearning approaches that remove data influence thoroughly and approximate unlearning approaches that efficiently minimize data influence. By comprehensively reviewing solutions, we identify and discuss their strengths and limitations. Furthermore, we propose future directions to advance machine unlearning and establish it as an essential capability for trustworthy and adaptive machine learning models. This paper provides researchers with a roadmap of open problems, encouraging impactful contributions to address real-world needs for selective data removal.
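The exact-unlearning category described above can be illustrated with a toy model. The following is a minimal sketch and is not taken from the paper: it assumes a per-class mean (centroid) model stored as sums and counts, so subtracting one point's contribution gives the same parameters as retraining without that point. The class `CentroidModel`, its methods, and the sample data are hypothetical names and values chosen for illustration. `Fraction` is used so the comparison is free of floating-point rounding.

```python
from fractions import Fraction


class CentroidModel:
    """Toy per-class mean model stored as sufficient statistics (hypothetical example)."""

    def __init__(self, dim):
        self.dim = dim
        self.sums = {}    # label -> per-feature running sum
        self.counts = {}  # label -> number of points seen

    def learn(self, x, label):
        s = self.sums.setdefault(label, [Fraction(0)] * self.dim)
        for i, v in enumerate(x):
            s[i] += Fraction(v)
        self.counts[label] = self.counts.get(label, 0) + 1

    def unlearn(self, x, label):
        # Subtract the point's contribution. This is only valid if
        # (x, label) was previously passed to learn().
        s = self.sums[label]
        for i, v in enumerate(x):
            s[i] -= Fraction(v)
        self.counts[label] -= 1
        if self.counts[label] == 0:
            del self.sums[label], self.counts[label]

    def centroids(self):
        return {
            label: [total / self.counts[label] for total in s]
            for label, s in self.sums.items()
        }


def train(data, dim):
    model = CentroidModel(dim)
    for x, label in data:
        model.learn(x, label)
    return model


data = [((1, 2), "a"), ((3, 4), "a"), ((10, 10), "b"), ((5, 0), "a")]
forget = ((3, 4), "a")

# Unlearn in place, then compare against retraining from scratch.
model = train(data, dim=2)
model.unlearn(*forget)
retrained = train([d for d in data if d != forget], dim=2)

print(model.centroids() == retrained.centroids())  # True
```

Most models do not decompose this cleanly, which is where the trade-off in the abstract arises: exact approaches guarantee equivalence to retraining, while approximate approaches accept a small residual influence in exchange for lower cost.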