Pruning as a Defense: Reducing Memorization in Large Language Models
This work addresses privacy and security concerns for users of large language models, though the contribution is incremental because it applies existing pruning methods to a new problem.
The paper tackles memorization in large language models by investigating simple pruning techniques. It finds that pruning reduces memorization and may help mitigate membership inference attacks.
Large language models have been shown to memorize significant portions of their training data, which they can reproduce when appropriately prompted. This work investigates the impact of simple pruning techniques on this behavior. Our findings reveal that pruning effectively reduces the extent of memorization in LLMs, demonstrating its potential as a foundational approach for mitigating membership inference attacks.
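The abstract does not say which pruning techniques were studied. As an illustration only, the sketch below shows unstructured magnitude pruning, one common "simple" technique: the smallest-magnitude weights in a matrix are set to zero. The function name `magnitude_prune`, the `sparsity` parameter, and the toy weight matrix are assumptions made for this example, not details taken from the paper.

```python
def magnitude_prune(weights, sparsity):
    """Zero out the `sparsity` fraction of entries with the smallest magnitude.

    `weights` is a rectangular list of lists of floats (a toy stand-in for one
    layer's weight matrix). This is a hypothetical helper, not the paper's code.
    """
    if not 0.0 <= sparsity <= 1.0:
        raise ValueError("sparsity must be in [0, 1]")

    flat = [abs(w) for row in weights for w in row]
    k = int(len(flat) * sparsity)  # number of weights to remove
    if k == 0:
        return [list(row) for row in weights]

    # Flat indices of the k smallest magnitudes (ties broken by position).
    order = sorted(range(len(flat)), key=lambda i: flat[i])
    drop = set(order[:k])

    n_cols = len(weights[0])
    return [
        [0.0 if (r * n_cols + c) in drop else w for c, w in enumerate(row)]
        for r, row in enumerate(weights)
    ]


# Toy usage: prune half of a 2x3 matrix.
weights = [[0.5, -0.1, 2.0], [0.05, -3.0, 0.3]]
pruned = magnitude_prune(weights, 0.5)
print(pruned)  # [[0.5, 0.0, 2.0], [0.0, -3.0, 0.0]]
```

In a study like the one described, a step of this kind would be applied to a model's layers, after which memorization would be measured again, for example by checking how often the pruned model reproduces training sequences when prompted with their prefixes.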