LG DB IR ML

Jul 14, 2018

ML-Schema: Exposing the Semantics of Machine Learning with Schemas and Ontologies

Gustavo Correa Publio, Diego Esteves, Agnieszka Ławrynowicz, Panče Panov, Larisa Soldatova, Tommaso Soru, Joaquin Vanschoren, Hamid Zafar

arXiv:1807.05351v18.368 citations

Originality: Synthesis-oriented

AI Analysis

This work addresses the need for standardized semantics in machine learning to enhance interoperability and interpretability, though it is incremental as it builds on existing formats and ontologies.

The paper tackles the problem of representing and interchanging machine learning information by proposing ML-Schema, a top-level ontology that draws on more than seven years of experience across research institutions. It standardizes the semantics of algorithms, datasets, and experiments, with the aim of improving interpretability and interoperability across platforms.

ML-Schema, proposed by the W3C Machine Learning Schema Community Group, is a top-level ontology that provides a set of classes, properties, and restrictions for representing and interchanging information on machine learning algorithms, datasets, and experiments. It can be easily extended and specialized, and it is mapped to other, more domain-specific ontologies developed in the area of machine learning and data mining. In this paper we give an overview of existing state-of-the-art machine learning interchange formats and present the first release of ML-Schema, a canonical format resulting from more than seven years of experience among different research institutions. We argue that exposing the semantics of machine learning algorithms, models, and experiments through a canonical format may pave the way to better interpretability and to realistically achieving full interoperability of experiments, regardless of platform or adopted workflow solution.
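To make the idea of "classes and properties for algorithms, datasets, and experiments" concrete, here is a minimal sketch of how a single experiment run might be written down as subject-predicate-object triples and serialized as N-Triples. It uses only the Python standard library. The namespace IRI `http://www.w3.org/ns/mls#` and the term names (`Run`, `Dataset`, `Algorithm`, `ModelEvaluation`, `EvaluationMeasure`, `hasInput`, `realizes`, `hasOutput`, `specifiedBy`, `hasValue`) are assumptions recalled from the published vocabulary and are not stated in the abstract above, so they should be checked against the specification. The `http://example.org/` namespace, the function names, and the identifiers in the usage note are purely hypothetical.

```python
# Minimal sketch: describe one ML experiment run as RDF-style triples.
# ASSUMPTION: the namespace IRI and the class/property names below are
# recalled from the published ML-Schema vocabulary, not taken from the
# abstract; verify them against the specification before relying on them.
MLS = "http://www.w3.org/ns/mls#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
XSD_DOUBLE = "http://www.w3.org/2001/XMLSchema#double"
EX = "http://example.org/"  # hypothetical namespace for our own resources


def describe_run(run_id, dataset_id, algorithm_id, measure_id, score):
    """Return a list of (subject, predicate, object) triples for one run."""
    run = EX + run_id
    dataset = EX + dataset_id
    algorithm = EX + algorithm_id
    evaluation = EX + run_id + "-evaluation"
    measure = EX + measure_id
    return [
        # Type assertions: what kind of thing each resource is.
        (run, RDF_TYPE, MLS + "Run"),
        (dataset, RDF_TYPE, MLS + "Dataset"),
        (algorithm, RDF_TYPE, MLS + "Algorithm"),
        (evaluation, RDF_TYPE, MLS + "ModelEvaluation"),
        (measure, RDF_TYPE, MLS + "EvaluationMeasure"),
        # Relations: how the run connects data, algorithm, and result.
        (run, MLS + "hasInput", dataset),
        (run, MLS + "realizes", algorithm),
        (run, MLS + "hasOutput", evaluation),
        (evaluation, MLS + "specifiedBy", measure),
        (evaluation, MLS + "hasValue", float(score)),
    ]


def to_ntriples(triples):
    """Serialize triples as N-Triples; float objects become xsd:double literals."""
    lines = []
    for subject, predicate, obj in triples:
        if isinstance(obj, float):
            rendered = '"%r"^^<%s>' % (obj, XSD_DOUBLE)
        else:
            rendered = "<%s>" % obj
        lines.append("<%s> <%s> %s ." % (subject, predicate, rendered))
    return "\n".join(lines)
```

Calling `to_ntriples(describe_run("run1", "iris", "c45", "accuracy", 0.95))` yields one line per triple. Because every statement uses shared IRIs rather than tool-specific field names, a different platform could in principle read the same description, which is the interoperability argument the abstract makes.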
