LG · HC · Jun 25, 2025

Domain Knowledge in Artificial Intelligence: Using Conceptual Modeling to Increase Machine Learning Accuracy and Explainability

arXiv:2507.02922v1 · 2 citations · h-index: 42 · Data Knowl Eng
Originality: Synthesis-oriented
AI Analysis

This paper addresses the limited use of domain knowledge in machine learning, a problem relevant to data scientists and practitioners. The contribution appears incremental, since it builds on existing conceptual modeling techniques.

The research tackles performance and transparency issues in machine learning by proposing Conceptual Modeling for Machine Learning (CMML), a method that uses domain knowledge from conceptual models to improve data preparation. The authors demonstrate its value by applying it to two real-world problems and by collecting assessments from data scientists.

Machine learning enables the extraction of useful information from large, diverse datasets. However, despite many successful applications, machine learning continues to suffer from performance and transparency issues. These challenges can be partially attributed to the limited use of domain knowledge by machine learning models. This research proposes using the domain knowledge represented in conceptual models to improve the preparation of the data used to train machine learning models. We develop and demonstrate a method, called Conceptual Modeling for Machine Learning (CMML), which comprises guidelines for data preparation in machine learning and is based on conceptual modeling constructs and principles. To assess the impact of CMML on machine learning outcomes, we first applied it to two real-world problems to evaluate its impact on model performance. We then solicited an assessment by data scientists on the applicability of the method. These results demonstrate the value of CMML for improving machine learning outcomes.
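The abstract does not list the CMML guidelines, so the following is only an illustrative sketch of the general idea: letting a conceptual model steer data preparation. The `CONCEPTUAL_MODEL` dictionary, the Customer/Order entities, and the `prepare_training_rows` helper are all hypothetical and are not taken from the paper. The sketch assumes that a one-to-many relationship in the conceptual model tells us child records must be aggregated before they can be attached to one training row per target entity.

```python
from collections import defaultdict

# Hypothetical conceptual model (an assumption, not from the paper):
# a Customer places zero or more Orders.
CONCEPTUAL_MODEL = {
    "target_entity": "Customer",
    "relationships": [
        {
            "child": "Order",
            "foreign_key": "customer_id",
            "cardinality": "one-to-many",
            "numeric_attributes": ["amount"],
        },
    ],
}


def prepare_training_rows(customers, orders, model=CONCEPTUAL_MODEL):
    """Build one row per target entity, aggregating one-to-many children."""
    rel = model["relationships"][0]
    fk = rel["foreign_key"]
    prefix = rel["child"].lower()

    # Group child records by the parent they reference.
    grouped = defaultdict(list)
    for order in orders:
        grouped[order[fk]].append(order)

    rows = []
    for customer in customers:
        children = grouped.get(customer["id"], [])
        row = dict(customer)
        # The one-to-many cardinality motivates count and sum features.
        row[f"{prefix}_count"] = len(children)
        for attr in rel["numeric_attributes"]:
            row[f"{prefix}_{attr}_sum"] = sum(c[attr] for c in children)
        rows.append(row)
    return rows


customers = [{"id": 1, "age": 30}, {"id": 2, "age": 41}]
orders = [
    {"customer_id": 1, "amount": 10.0},
    {"customer_id": 1, "amount": 5.0},
]
rows = prepare_training_rows(customers, orders)
```

Here the relationship's cardinality, rather than ad hoc inspection of the tables, decides how the child records are summarized. A customer with no orders still gets a row, with a count of zero.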

