Flexible Generation of Preference Data for Recommendation Analysis
This work addresses the need for flexible synthetic data generation to test recommendation systems in controlled environments, though it appears incremental as it builds on existing simulation approaches.
The paper tackles the problem of generating synthetic preference data for recommendation system analysis. It proposes HYDRA, a generation model driven by three factors (user-item interaction level, item popularity, and user engagement level), and reports experiments on benchmark datasets showing that the generated data replicates real-world patterns.
Simulating a recommendation system in a controlled environment, in order to identify specific behaviors and user preferences, requires highly flexible synthetic data generation models capable of mimicking the patterns and trends of real datasets. In this context, we propose HYDRA, a novel preference data generation model driven by three main factors: user-item interaction level, item popularity, and user engagement level. A key innovation of the proposed process is the ability to generate user communities characterized by similar item adoptions, reflecting real-world social influences and trends. Additionally, HYDRA models item popularity and user engagement as mixtures of different probability distributions, which allows it to simulate a wide range of real-world scenarios and to capture the complexity and variability found in actual user behavior. We demonstrate the effectiveness of HYDRA through extensive experiments on well-known benchmark datasets. The results highlight its capability to replicate real-world data patterns, offering valuable insights for developing and testing recommendation systems in a controlled and realistic manner. The code used to perform the experiments is publicly available at https://github.com/SimoneMungari/HYDRA.
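To make the three driving factors concrete, the following is a minimal illustrative sketch of a generator in the same spirit. It is not the HYDRA implementation (see the repository above for that). All function names, parameters, and distribution choices here are assumptions made for illustration: item popularity is drawn from a mixture of a Pareto and a uniform component, user engagement from a mixture of an exponential and a log-normal component, and community structure is imitated by boosting the sampling weight of a block of items favoured by each community.

```python
import random


def sample_mixture(rng, components, weights):
    """Draw one value from a mixture: pick a component by weight, then sample it."""
    component = rng.choices(components, weights=weights, k=1)[0]
    return component(rng)


def weighted_sample_without_replacement(rng, weights, k):
    """Efraimidis-Spirakis sampling: rank items by u ** (1 / w), keep the top k."""
    keys = [(rng.random() ** (1.0 / w), i) for i, w in enumerate(weights)]
    keys.sort(reverse=True)
    return [i for _, i in keys[:k]]


def generate_preferences(n_users=200, n_items=100, n_communities=4,
                         affinity_boost=5.0, seed=0):
    """Return (interactions, community_of, popularity) for a toy dataset.

    All defaults are hypothetical values chosen for illustration only.
    """
    rng = random.Random(seed)

    # Item popularity: mixture of a heavy-tailed and a flat component (assumed).
    popularity_components = [
        lambda r: r.paretovariate(1.5),
        lambda r: r.uniform(0.5, 1.5),
    ]
    popularity = [
        sample_mixture(rng, popularity_components, [0.3, 0.7])
        for _ in range(n_items)
    ]

    # Each community favours its own disjoint block of items (assumed structure).
    shuffled = list(range(n_items))
    rng.shuffle(shuffled)
    block = n_items // n_communities
    favoured = [set(shuffled[c * block:(c + 1) * block])
                for c in range(n_communities)]

    # User engagement: mixture of casual and heavy users (assumed).
    engagement_components = [
        lambda r: r.expovariate(1 / 5.0),
        lambda r: r.lognormvariate(3.0, 0.4),
    ]

    interactions, community_of = {}, {}
    for user in range(n_users):
        community = rng.randrange(n_communities)
        community_of[user] = community

        # Number of items this user adopts, clamped to [1, n_items].
        raw = sample_mixture(rng, engagement_components, [0.8, 0.2])
        k = max(1, min(n_items, int(round(raw))))

        # Interaction level: popularity, boosted for the community's items.
        weights = [
            popularity[i] * (affinity_boost if i in favoured[community] else 1.0)
            for i in range(n_items)
        ]
        interactions[user] = set(
            weighted_sample_without_replacement(rng, weights, k)
        )
    return interactions, community_of, popularity
```

Calling `generate_preferences(seed=1)` yields a dictionary mapping each user to the set of adopted items, the community assignment of each user, and the per-item popularity weights. Changing the mixture weights or `affinity_boost` is one simple way to explore different scenarios, for example sparser data or stronger community effects.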