SQL for SRL: Structure Learning Inside a Database System
This work addresses the challenge of scalable structure learning in statistical-relational models, for an audience of database and machine learning practitioners. Its integration of learning into the database system is novel, although the contribution is incremental in that it applies existing database capabilities.
The paper tackles statistical-relational learning by proposing relational algebra, as implemented in SQL, as a unified language for representing and computing with statistical-relational objects, analogous to the role of linear algebra in traditional machine learning. Through the FACTORBASE system, it demonstrates that this approach enables scalable model structure learning on six benchmark databases.
The position we advocate in this paper is that relational algebra can provide a unified language for both representing and computing with statistical-relational objects, much as linear algebra does for traditional single-table machine learning. Relational algebra is implemented in the Structured Query Language (SQL), which is the basis of relational database management systems. To support our position, we have developed the FACTORBASE system, which uses SQL as a high-level scripting language for statistical-relational learning of a graphical model structure. The design philosophy of FACTORBASE is to manage statistical models as first-class citizens inside a database. Our implementation shows how our SQL constructs in FACTORBASE facilitate fast, modular, and reliable program development. Empirical evidence from six benchmark databases indicates that leveraging database system capabilities achieves scalable model structure learning.
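To make the idea of computing with statistical-relational objects in SQL more concrete, the sketch below builds a contingency table (joint counts over attributes of linked entities) with a join and a GROUP BY, and stores the result as an ordinary table inside the database. It is only an illustration under stated assumptions: the schema (Student, Course, Registration), the table name ct_registration, and the helper functions are hypothetical and are not taken from FACTORBASE. SQLite, accessed through Python's standard sqlite3 module, stands in for the database system.

```python
import sqlite3


def make_demo_db():
    """Create a small in-memory database with a hypothetical schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Student (s_id INTEGER PRIMARY KEY, intelligence TEXT);
        CREATE TABLE Course  (c_id INTEGER PRIMARY KEY, difficulty TEXT);
        CREATE TABLE Registration (
            s_id  INTEGER REFERENCES Student(s_id),
            c_id  INTEGER REFERENCES Course(c_id),
            grade TEXT
        );
    """)
    conn.executemany("INSERT INTO Student VALUES (?, ?)",
                     [(1, "high"), (2, "low"), (3, "high")])
    conn.executemany("INSERT INTO Course VALUES (?, ?)",
                     [(10, "easy"), (20, "hard")])
    conn.executemany("INSERT INTO Registration VALUES (?, ?, ?)",
                     [(1, 10, "A"), (1, 20, "B"), (2, 10, "B"),
                      (2, 20, "C"), (3, 10, "A"), (3, 20, "A")])
    conn.commit()
    return conn


def build_contingency_table(conn):
    """Store joint counts as a table in the database and return its rows.

    The counts are computed entirely in SQL: the joins link each
    registration to its student and course, and GROUP BY aggregates
    over every combination of attribute values that occurs.
    """
    conn.executescript("""
        DROP TABLE IF EXISTS ct_registration;
        CREATE TABLE ct_registration AS
        SELECT s.intelligence AS intelligence,
               c.difficulty   AS difficulty,
               r.grade        AS grade,
               COUNT(*)       AS n
        FROM Registration r
        JOIN Student s ON s.s_id = r.s_id
        JOIN Course  c ON c.c_id = r.c_id
        GROUP BY s.intelligence, c.difficulty, r.grade;
    """)
    return conn.execute("""
        SELECT intelligence, difficulty, grade, n
        FROM ct_registration
        ORDER BY intelligence, difficulty, grade
    """).fetchall()


if __name__ == "__main__":
    for row in build_contingency_table(make_demo_db()):
        print(row)
```

Because the counts live in a regular table, later steps such as parameter estimation or scoring a candidate structure could in principle be written as further queries over ct_registration rather than as application code that pulls the raw data out of the database.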