PL, LG, Apr 15, 2016

ModelWizard: Toward Interactive Model Construction

arXiv:1604.04639v1
4 citations
Originality: Incremental advance
AI Analysis

The paper addresses the need for more exploratory tools in machine learning model building and offers data scientists a new paradigm, though its focus on interactive construction makes it an incremental advance.

The paper tackles interactive model construction for data scientists by proposing a framework built on composable operations, and demonstrates it with ModelWizard, a domain-specific language embedded in F# for building Tabular models.

Data scientists engage in model construction to discover machine learning models that well explain a dataset, in terms of predictiveness, understandability and generalization across domains. Questions such as "what if we model common cause Z" and "what if Y's dependence on X reverses" inspire many candidate models to consider and compare, yet current tools emphasize constructing a final model all at once. To more naturally reflect exploration when debating numerous models, we propose an interactive model construction framework grounded in composable operations. Primitive operations capture core steps refining data and model that, when verified, form an inductive basis to prove model validity. Derived, composite operations enable advanced model families, both generic and specialized, abstracted away from low-level details. We prototype our envisioned framework in ModelWizard, a domain-specific language embedded in F# to construct Tabular models. We enumerate language design and demonstrate its use through several applications, emphasizing how language may facilitate creation of complex models. To future engineers designing data science languages and tools, we offer ModelWizard's design as a new model construction paradigm, speeding discovery of our universe's structure.
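
The split between primitive and composite operations described in the abstract can be illustrated with a small sketch. This is not ModelWizard's API, and it is written in Python rather than F#. The `Model` type and the functions `add_variable`, `add_dependence`, `reverse_dependence`, `compose`, and `add_common_cause` are hypothetical names invented here. The sketch treats a model as a set of variables plus dependency edges, defines a few primitive operations that each check their own preconditions, and derives a composite operation purely by composing primitives.

```python
from dataclasses import dataclass
from functools import reduce
from typing import Callable


@dataclass(frozen=True)
class Model:
    """Toy stand-in for a model: variable names and (parent, child) edges."""
    variables: frozenset = frozenset()
    edges: frozenset = frozenset()


# An operation maps one model to a refined model (hypothetical type alias).
Operation = Callable[[Model], Model]


def add_variable(name: str) -> Operation:
    """Primitive: introduce a new variable."""
    def op(m: Model) -> Model:
        return Model(m.variables | {name}, m.edges)
    return op


def add_dependence(parent: str, child: str) -> Operation:
    """Primitive: make `child` depend on `parent`; both must already exist."""
    def op(m: Model) -> Model:
        if parent not in m.variables or child not in m.variables:
            raise ValueError("both variables must exist before linking them")
        return Model(m.variables, m.edges | {(parent, child)})
    return op


def reverse_dependence(parent: str, child: str) -> Operation:
    """Primitive: flip an existing edge ("what if Y's dependence on X reverses")."""
    def op(m: Model) -> Model:
        if (parent, child) not in m.edges:
            raise ValueError("cannot reverse an edge that is not in the model")
        return Model(m.variables, (m.edges - {(parent, child)}) | {(child, parent)})
    return op


def compose(*ops: Operation) -> Operation:
    """Apply operations left to right, yielding a single operation."""
    return lambda m: reduce(lambda acc, op: op(acc), ops, m)


def add_common_cause(z: str, x: str, y: str) -> Operation:
    """Composite: "what if we model common cause Z", built only from primitives."""
    return compose(add_variable(z), add_dependence(z, x), add_dependence(z, y))


# Explore two candidate models from the same base without rebuilding it.
base = compose(add_variable("X"), add_variable("Y"), add_dependence("X", "Y"))(Model())
with_cause = add_common_cause("Z", "X", "Y")(base)
reversed_xy = reverse_dependence("X", "Y")(base)
```

Because each primitive validates its input and returns a new immutable `Model`, any composite built through `compose` inherits those checks, which loosely mirrors the abstract's idea of verified primitives forming an inductive basis. The base model stays unchanged, so several candidates can be derived from it and compared side by side.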

