LG | Aug 28, 2025

Khiops: An End-to-End, Frugal AutoML and XAI Machine Learning Solution for Large, Multi-Table Databases

arXiv:2508.20519v3 | h-index: 11 | Has Code
Originality: Incremental advance
AI Analysis

This tool targets data scientists and analysts who need to analyze complex, large-scale databases. The contribution appears incremental: it builds on existing Bayesian methods with specific adaptations.

Khiops tackles the challenge of mining large multi-table databases by providing an end-to-end AutoML and XAI solution, achieving scalability for millions of individuals and tens of thousands of variables with a unique Bayesian approach that includes variable selection and weight learning.
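
To make "variable selection and weight learning" concrete, here is a minimal sketch of how a weighted naive Bayes score can be computed. It is not the Khiops implementation: the function name, the toy probability tables and the weights are all hypothetical, and the weights are simply given rather than learned. Each variable's log-likelihood is multiplied by a weight in [0, 1], so a weight of 0 removes the variable from the model.

```python
import math


def weighted_nb_posterior(priors, cond_probs, weights, x):
    """Class posteriors under a weighted naive Bayes model (illustrative only).

    priors:     {class: P(class)}
    cond_probs: {variable: {class: {value: P(value | class)}}}
    weights:    {variable: weight in [0, 1]}; a weight of 0 drops the variable
    x:          {variable: observed value}
    """
    log_scores = {}
    for cls, prior in priors.items():
        score = math.log(prior)
        for var, w in weights.items():
            if w > 0.0:  # unselected variables contribute nothing
                score += w * math.log(cond_probs[var][cls][x[var]])
        log_scores[cls] = score
    # Normalise in a numerically stable way (subtract the max log-score).
    top = max(log_scores.values())
    unnorm = {cls: math.exp(s - top) for cls, s in log_scores.items()}
    total = sum(unnorm.values())
    return {cls: v / total for cls, v in unnorm.items()}


# Hypothetical toy model: "plan" is informative, "region" is not.
PRIORS = {"churn": 0.5, "stay": 0.5}
COND = {
    "plan": {
        "churn": {"basic": 0.8, "premium": 0.2},
        "stay": {"basic": 0.4, "premium": 0.6},
    },
    "region": {
        "churn": {"north": 0.5, "south": 0.5},
        "stay": {"north": 0.5, "south": 0.5},
    },
}
OBSERVATION = {"plan": "basic", "region": "north"}

full = weighted_nb_posterior(PRIORS, COND, {"plan": 1.0, "region": 0.0}, OBSERVATION)
damped = weighted_nb_posterior(PRIORS, COND, {"plan": 0.5, "region": 0.0}, OBSERVATION)
```

With a weight of 1 the variable counts as in a standard naive Bayes classifier. A weight below 1 pulls the posterior toward the prior, which is one way to temper the independence assumption when variables are redundant.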

Khiops is an open-source machine learning tool designed for mining large multi-table databases. Khiops is based on a unique Bayesian approach that has attracted academic interest with more than 20 publications on topics such as variable selection, classification, decision trees and co-clustering. It provides a predictive measure of variable importance using discretisation models for numerical data and value clustering for categorical data. The proposed classification/regression model is a naive Bayesian classifier incorporating variable selection and weight learning. In the case of multi-table databases, it provides propositionalisation by automatically constructing aggregates. Khiops is adapted to the analysis of large databases with millions of individuals, tens of thousands of variables and hundreds of millions of records in secondary tables. It is available in many environments, both as a Python library and through a user interface.
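
Propositionalisation, as mentioned in the abstract, flattens a multi-table database into one row per individual by summarising the secondary-table records with aggregates. The sketch below illustrates the idea with a few hand-written aggregates in plain Python. It does not use the Khiops API, and the table layout, column names and choice of aggregates are assumptions made for illustration; Khiops constructs its aggregates automatically rather than from a fixed list like this one.

```python
from collections import defaultdict
from statistics import mean


def build_aggregates(customers, purchases):
    """Flatten a main table and a secondary table into one row per customer.

    customers: list of dicts with a "customer_id" key (main table)
    purchases: list of dicts with "customer_id", "amount", "category"
               (secondary table, zero or more rows per customer)
    """
    # Group secondary-table records by the key of the main table.
    by_customer = defaultdict(list)
    for row in purchases:
        by_customer[row["customer_id"]].append(row)

    flat = []
    for cust in customers:
        rows = by_customer.get(cust["customer_id"], [])
        amounts = [r["amount"] for r in rows]
        flat.append({
            **cust,
            "purchase_count": len(rows),
            # Aggregates over an empty set are left undefined (None).
            "amount_mean": mean(amounts) if amounts else None,
            "amount_max": max(amounts) if amounts else None,
            "distinct_categories": len({r["category"] for r in rows}),
        })
    return flat


# Hypothetical toy data: customer 2 has no purchases.
CUSTOMERS = [
    {"customer_id": 1, "age": 34},
    {"customer_id": 2, "age": 51},
]
PURCHASES = [
    {"customer_id": 1, "amount": 20.0, "category": "books"},
    {"customer_id": 1, "amount": 40.0, "category": "music"},
]

flat = build_aggregates(CUSTOMERS, PURCHASES)
```

Each resulting row can then be fed to a single-table learner. In a real setting the number of candidate aggregates grows quickly, which is why an automatic construction and selection step matters.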
