You Do Not Need a Bigger Boat: Recommendations at Reasonable Scale in a (Mostly) Serverless and Open Stack
This paper addresses the challenge industry practitioners face in implementing recommender systems efficiently, though its contribution is incremental because it builds on existing serverless and open-source technologies.
The paper tackles the problem of immature data pipelines hindering industry adoption of recommender systems research. It proposes a serverless, open-source template stack for machine learning at "reasonable scale" and shows that the stack can process terabytes of data with limited infrastructure work.
We argue that immature data pipelines are preventing a large portion of industry practitioners from leveraging the latest research on recommender systems. We propose our template data stack for machine learning at "reasonable scale", and show how many challenges are solved by embracing a serverless paradigm. Leveraging our experience, we detail how modern open source can provide a pipeline processing terabytes of data with limited infrastructure work.