API design for machine learning software: experiences from the scikit-learn project
This work addresses API design challenges faced by developers and users of machine learning software. Its contribution is incremental, since it builds on existing library practices.
The paper discusses the design choices behind the scikit-learn API, focusing on a simple and elegant interface shared by learning and processing units that supports composition and reusability. It also analyzes implementation details and the obstacles faced by users and developers.
Scikit-learn is an increasingly popular machine learning library. Written in Python, it is designed to be simple and efficient, accessible to non-experts, and reusable in various contexts. In this paper, we present and discuss our design choices for the application programming interface (API) of the project. In particular, we describe the simple and elegant interface shared by all learning and processing units in the library and then discuss its advantages in terms of composition and reusability. The paper also comments on implementation details specific to the Python ecosystem and analyzes obstacles faced by users and developers of the library.
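
To make the idea of a shared interface and its benefit for composition more concrete, the following is a minimal pure-Python sketch. It is not scikit-learn code: the classes `MeanCenterer`, `NearestCentroid`, and `SimplePipeline` are hypothetical, and the method names `fit`, `transform`, and `predict` are assumed here to follow the convention used by the library. The point is only that when every unit exposes the same small set of methods, a generic container can chain units without knowing what they do.

```python
# Illustrative sketch only: toy classes, not the scikit-learn implementation.

class MeanCenterer:
    """Toy processing unit: learns column means in fit, subtracts them in transform."""

    def fit(self, X, y=None):
        n_samples = len(X)
        # State learned from data is stored on the object.
        self.means_ = [sum(column) / n_samples for column in zip(*X)]
        return self  # returning self allows calls to be chained

    def transform(self, X):
        return [[value - mean for value, mean in zip(row, self.means_)] for row in X]


class NearestCentroid:
    """Toy learning unit: predicts the label of the closest class centroid."""

    def fit(self, X, y):
        rows_by_label = {}
        for row, label in zip(X, y):
            rows_by_label.setdefault(label, []).append(row)
        self.centroids_ = {
            label: [sum(column) / len(rows) for column in zip(*rows)]
            for label, rows in rows_by_label.items()
        }
        return self

    def predict(self, X):
        def squared_distance(a, b):
            return sum((p - q) ** 2 for p, q in zip(a, b))

        return [
            min(self.centroids_, key=lambda label: squared_distance(row, self.centroids_[label]))
            for row in X
        ]


class SimplePipeline:
    """Hypothetical composite: chains transformers, then a final predictor.

    It relies only on the shared fit/transform/predict methods, so any unit
    following the same convention can be plugged in.
    """

    def __init__(self, transformers, predictor):
        self.transformers = transformers
        self.predictor = predictor

    def fit(self, X, y):
        for transformer in self.transformers:
            X = transformer.fit(X, y).transform(X)
        self.predictor.fit(X, y)
        return self

    def predict(self, X):
        for transformer in self.transformers:
            X = transformer.transform(X)
        return self.predictor.predict(X)


# Made-up toy data, for illustration only.
X_train = [[0.0, 0.0], [0.2, 0.1], [4.0, 4.0], [4.1, 3.9]]
y_train = ["a", "a", "b", "b"]

model = SimplePipeline([MeanCenterer()], NearestCentroid()).fit(X_train, y_train)
predictions = model.predict([[0.1, 0.0], [3.9, 4.2]])
print(predictions)
```

Because the composite itself exposes `fit` and `predict`, it can be used anywhere a single learning unit is expected, which is one way to read the reusability claim made above.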