LG, AI, DB | Jul 21, 2025

Beyond Model Base Selection: Weaving Knowledge to Master Fine-grained Neural Network Design

Jialiang Wang, Hanmo Liu, Shimin Di, Zhili Wang, Jiachuan Wang, Lei Chen, Xiaofang Zhou

arXiv:2507.15336v1 | 4.1 | h-index: 10

Originality: Incremental advance

AI Analysis

This work addresses the need for more effective model refinement in database research for graph analytics tasks, though the advance is incremental because it builds on existing model base approaches.

The paper tackles the problem of suboptimal model selection in database systems by proposing M-DESIGN, a model knowledge base pipeline that adaptively refines neural networks for fine-grained tasks, achieving optimal models in 26 out of 33 data-task pairs within limited budgets.

Database systems have recently advocated for embedding machine learning (ML) capabilities, offering declarative model queries over large, managed model repositories, thereby circumventing the huge computational overhead of traditional ML-based algorithms in automated neural network model selection. Pioneering database studies aim to organize existing benchmark repositories as model bases (MB), querying them for the model records with the highest performance estimation metrics for given tasks. However, this static model selection practice overlooks the fine-grained, evolving relational dependencies between diverse task queries and model architecture variations, resulting in suboptimal matches and failing to further refine the model effectively. To fill the model refinement gap in database research, we propose M-DESIGN, a curated model knowledge base (MKB) pipeline for mastering neural network refinement by adaptively weaving prior insights about model architecture modification. First, we propose a knowledge weaving engine that reframes model refinement as an adaptive query problem over task metadata. Given a user's task query, M-DESIGN quickly matches and iteratively refines candidate models by leveraging a graph-relational knowledge schema that explicitly encodes data properties, architecture variations, and pairwise performance deltas as joinable relations. This schema supports fine-grained relational analytics over architecture tweaks and drives a predictive query planner that can detect and adapt to out-of-distribution (OOD) tasks. We instantiate M-DESIGN for graph analytics tasks, where our model knowledge base enriches existing benchmarks with structured metadata covering 3 graph tasks and 22 graph datasets, contributing data records of 67,760 graph models. Empirical results demonstrate that M-DESIGN delivers the optimal model in 26 of 33 data-task pairs within limited budgets.
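To make the idea of "pairwise performance deltas as joinable relations" more concrete, the following is a minimal sketch and not M-DESIGN's actual schema or planner. The table names (`tasks`, `modifications`, `deltas`), the two data properties (`avg_degree`, `homophily`), the toy records, and the nearest-task heuristic are all assumptions made for illustration. It stores data properties, architecture modifications, and observed deltas in SQLite, then answers a task query by joining the deltas recorded on the most similar known tasks.

```python
import math
import sqlite3


def build_kb():
    """Create a toy in-memory knowledge base (all records are made up)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE tasks (task_id INTEGER PRIMARY KEY, name TEXT,
                            avg_degree REAL, homophily REAL);
        CREATE TABLE modifications (mod_id INTEGER PRIMARY KEY, description TEXT);
        CREATE TABLE deltas (task_id INTEGER, mod_id INTEGER, delta REAL);
    """)
    conn.executemany("INSERT INTO tasks VALUES (?, ?, ?, ?)", [
        (1, "toy_citation", 4.0, 0.8),
        (2, "toy_social", 30.0, 0.3),
        (3, "toy_coauthor", 5.0, 0.7),
    ])
    conn.executemany("INSERT INTO modifications VALUES (?, ?)", [
        (1, "add_layer"),
        (2, "mean_to_max_aggregation"),
    ])
    # delta = change in the performance metric after applying the modification
    conn.executemany("INSERT INTO deltas VALUES (?, ?, ?)", [
        (1, 1, 0.02), (1, 2, -0.01),
        (2, 1, -0.03), (2, 2, 0.04),
        (3, 1, 0.01), (3, 2, 0.00),
    ])
    return conn


def nearest_tasks(conn, avg_degree, homophily, k):
    """Return ids of the k stored tasks closest to the query (Euclidean distance).

    Features are left unnormalized to keep the sketch short, so avg_degree
    dominates the distance; a real system would scale them.
    """
    rows = conn.execute(
        "SELECT task_id, avg_degree, homophily FROM tasks").fetchall()
    rows.sort(key=lambda r: math.dist((r[1], r[2]), (avg_degree, homophily)))
    return [r[0] for r in rows[:k]]


def suggest_modification(conn, avg_degree, homophily, k=2):
    """Join deltas with modifications over the nearest tasks and return the
    (description, expected_delta) pair with the highest average delta."""
    ids = nearest_tasks(conn, avg_degree, homophily, k)
    placeholders = ",".join("?" * len(ids))
    query = f"""
        SELECT m.description, AVG(d.delta) AS expected_delta
        FROM deltas AS d
        JOIN modifications AS m ON m.mod_id = d.mod_id
        WHERE d.task_id IN ({placeholders})
        GROUP BY m.mod_id
        ORDER BY expected_delta DESC
        LIMIT 1
    """
    return conn.execute(query, ids).fetchone()


if __name__ == "__main__":
    kb = build_kb()
    print(suggest_modification(kb, avg_degree=4.2, homophily=0.75))
```

With the toy records above, a sparse, homophilous query task is matched to the two citation-like tasks and the sketch suggests `add_layer`, while a dense, low-homophily query is steered toward `mean_to_max_aggregation`. The abstract's iterative refinement and OOD adaptation are not modeled here; this only illustrates a single relational lookup step.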
