IR CY LG Sep 21, 2024

The trade-off between data minimization and fairness in collaborative filtering

Nasim Sonboli, Sipei Li, Mehdi Elahi, Asia Biega

arXiv:2410.07182v12.23 citations h-index: 25

Originality: Synthesis-oriented

AI Analysis

This paper addresses GDPR compliance challenges for recommender systems, though its contribution is incremental because it builds on existing active learning methods.

The paper investigates the trade-off between data minimization and fairness in recommender systems, finding that active learning strategies can maintain accuracy but often reduce fairness.

The General Data Protection Regulation (GDPR) aims to safeguard individuals' personal information from harm. While full compliance is mandatory in the European Union and under the California Privacy Rights Act (CPRA), it is not mandatory elsewhere. GDPR requires simultaneous compliance with all of its principles, such as fairness, accuracy, and data minimization. However, it overlooks potential contradictions among these principles. The matter becomes even more complex when compliance is required from decision-making systems. It is therefore essential to investigate whether the goals of GDPR and machine learning can be achieved simultaneously, and what trade-offs this might force. This paper studies the relationship between the principles of data minimization and fairness in recommender systems. We operationalize data minimization via active learning (AL) because, unlike many other methods, it can preserve high accuracy while allowing for strategic data collection, thereby minimizing the amount of data collected. We implement several active learning strategies (personalized and non-personalized) and conduct a comparative analysis of accuracy and fairness on two publicly available datasets. The results show that different AL strategies can affect the accuracy of recommender systems differently, while nearly all strategies negatively impact fairness. There has been little to no prior work on the trade-off between data minimization and fairness, the pros and cons of active learning methods as tools for implementing data minimization, or the potential impacts of AL on fairness. By exploring these aspects, we offer insights for developing GDPR-compliant recommender systems.
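
The setup described in the abstract, where an active learning strategy decides which ratings to collect under a budget and accuracy and fairness are then compared, can be illustrated with a minimal sketch. This is not the authors' implementation. The two non-personalized strategies (`random`, `popularity`), the item-mean predictor standing in for a collaborative filtering model, the per-group RMSE used as a fairness proxy, the synthetic ratings, and all function names are assumptions made for illustration only.

```python
import numpy as np


def select_items(strategy, candidates, item_counts, rng, k):
    """Pick up to k item indices that one user is asked to rate next."""
    candidates = np.asarray(candidates)
    if len(candidates) <= k:
        return candidates
    if strategy == "random":
        return rng.choice(candidates, size=k, replace=False)
    if strategy == "popularity":
        # Non-personalized: ask about the items with the most collected ratings.
        order = np.argsort(-item_counts[candidates], kind="stable")
        return candidates[order[:k]]
    raise ValueError(f"unknown strategy: {strategy}")


def minimize_data(pool_mask, seed_mask, strategy, k, rounds, rng):
    """Collect at most k extra ratings per user per round from the pool.

    pool_mask marks ratings a user could provide; seed_mask marks ratings
    already held. Returns a boolean mask of all collected ratings.
    """
    collected = seed_mask.copy()
    for _ in range(rounds):
        item_counts = collected.sum(axis=0)
        for u in range(pool_mask.shape[0]):
            candidates = np.flatnonzero(pool_mask[u] & ~collected[u])
            chosen = select_items(strategy, candidates, item_counts, rng, k)
            collected[u, chosen] = True
    return collected


def item_mean_predictor(ratings, collected):
    """Stand-in model (assumption): predict each item's mean collected rating."""
    observed = np.where(collected, ratings, np.nan)
    global_mean = np.nanmean(observed)
    counts = collected.sum(axis=0)
    sums = np.nansum(observed, axis=0)
    # Items with no collected ratings fall back to the global mean.
    return np.where(counts > 0, sums / np.maximum(counts, 1), global_mean)


def group_rmse(ratings, test_mask, item_pred, groups):
    """RMSE on held-out ratings per user group (an assumed fairness proxy)."""
    preds = np.broadcast_to(item_pred, ratings.shape)
    result = {}
    for g in np.unique(groups):
        mask = test_mask & (groups == g)[:, None]
        err = ratings[mask] - preds[mask]
        result[int(g)] = float(np.sqrt(np.mean(err ** 2)))
    return result


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    ratings = rng.integers(1, 6, size=(40, 30)).astype(float)  # synthetic data
    groups = np.arange(40) % 2                                 # hypothetical groups
    test_mask = rng.random(ratings.shape) < 0.2                # held-out ratings
    pool_mask = ~test_mask
    seed_mask = pool_mask & (rng.random(ratings.shape) < 0.05)
    for strategy in ("random", "popularity"):
        collected = minimize_data(pool_mask, seed_mask, strategy, k=2, rounds=3, rng=rng)
        pred = item_mean_predictor(ratings, collected)
        print(strategy, int(collected.sum()), group_rmse(ratings, test_mask, pred, groups))
```

Comparing the per-group errors across strategies at the same collection budget is one simple way to see whether a strategy that keeps overall accuracy high also widens the gap between groups. Personalized strategies would additionally use the current model's predictions for each user when choosing which items to ask about.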
