What Data is Really Necessary? A Feasibility Study of Inference Data Minimization for Recommender Systems
This work addresses the problem of implementing data minimization for recommender systems, which is crucial for legal compliance and user privacy. Its contribution is incremental: it builds on existing techniques and highlights practical limitations.
This paper tackles the challenge of minimizing implicit feedback inference data in recommender systems to comply with data minimization principles, demonstrating that substantial data reduction is technically feasible without significant performance loss, though its practicality depends on technical settings and user characteristics.
Data minimization is a legal principle requiring personal data processing to be limited to what is necessary for a specified purpose. Operationalizing this principle for recommender systems, which rely on extensive personal data, remains a significant challenge. This paper conducts a feasibility study on minimizing implicit feedback inference data for such systems. We propose a novel problem formulation, analyze various minimization techniques, and investigate key factors influencing their effectiveness. We demonstrate that substantial inference data reduction is technically feasible without significant performance loss. However, its practicality is critically determined by two factors: the technical setting (e.g., performance targets, choice of model) and user characteristics (e.g., history size, preference complexity). Thus, while we establish its technical feasibility, we conclude that data minimization remains practically challenging, and its dependence on the technical and user context makes a universal standard for data "necessity" difficult to implement.
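To make the idea of reducing inference data under a performance target concrete, here is a minimal sketch. It is not the paper's formulation or any of the techniques it evaluates. The toy dot-product recommender, the top-k overlap used as a stand-in for performance, the greedy backward-elimination strategy, and all names (`recommend`, `overlap_at_k`, `greedy_minimize`, `target`) are assumptions made purely for illustration.

```python
import numpy as np


def recommend(item_emb, history, k):
    """Toy recommender (assumed): rank all items by dot product with the
    mean embedding of the user's history and return the top-k item ids."""
    profile = item_emb[history].mean(axis=0)
    scores = item_emb @ profile
    return np.argsort(-scores)[:k].tolist()


def overlap_at_k(reference, candidate):
    """Fraction of the reference top-k that also appears in the candidate top-k."""
    return len(set(reference) & set(candidate)) / len(reference)


def greedy_minimize(item_emb, history, k=5, target=0.8):
    """Greedy backward elimination (illustrative, not the paper's method).

    Try dropping each history item in turn and keep the drop only if the
    recommendations from the reduced history still overlap with the
    full-history recommendations by at least `target`.
    """
    reference = recommend(item_emb, history, k)
    kept = list(history)
    for item in list(history):
        if len(kept) == 1:  # always keep at least one interaction
            break
        trial = [i for i in kept if i != item]
        if overlap_at_k(reference, recommend(item_emb, trial, k)) >= target:
            kept = trial
    return kept


# Synthetic data: 50 items with 8-dimensional embeddings, 12 interactions.
rng = np.random.default_rng(0)
item_emb = rng.normal(size=(50, 8))
history = list(range(12))

minimized = greedy_minimize(item_emb, history, k=5, target=0.8)
print(f"kept {len(minimized)} of {len(history)} interactions: {minimized}")
```

Even in this toy setting, how many interactions can be dropped changes with `target`, with `k`, and with the scoring model inside `recommend`, which mirrors the abstract's point that the amount of "necessary" data depends on the technical setting and on the user's history.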