Towards Sharing Task Environments to Support Reproducible Evaluations of Interactive Recommender Systems
This work provides incremental groundwork for the recommender systems community to standardize evaluations and improve their reproducibility.
The paper tackles the problem of achieving reproducible experiments in interactive recommender systems by proposing a high-level logical architecture for sharing task environments, rather than just datasets or simulations.
Beyond sharing datasets or simulations, we believe the Recommender Systems (RS) community should share Task Environments. In this work, we propose a high-level logical architecture that helps to reason about the core components of an RS Task Environment; to identify the differences between Environments, datasets, and simulations; and, most importantly, to understand what needs to be shared about Environments to achieve reproducible experiments. We present this work as valuable initial groundwork, open to discussion and extension.