LG | Oct 28, 2014

Fast Learning of Relational Dependency Networks

Oliver Schulte, Zhensong Qian, Arthur E. Kirkpatrick, Xiaoqian Yin, Yan Sun

arXiv:1410.7835v2 | 13 citations

AI Analysis

This work addresses the scalability of learning RDNs from multi-relational data, which matters for applications in domains such as social networks and bioinformatics. The contribution is incremental, since it builds on existing Bayesian network and RDN techniques.

The paper tackles the problem of efficiently learning Relational Dependency Networks (RDNs) from large relational databases by first learning a Bayesian network and then transforming it into an RDN, which enables learning on a dataset with a million tuples in minutes. The results show that this method scales much better to large datasets than state-of-the-art boosting methods while providing competitive prediction accuracy.

A Relational Dependency Network (RDN) is a directed graphical model widely used for multi-relational data. These networks allow cyclic dependencies, necessary to represent relational autocorrelations. We describe an approach for learning both the RDN's structure and its parameters, given an input relational database: First learn a Bayesian network (BN), then transform the Bayesian network to an RDN. Thus fast Bayes net learning can provide fast RDN learning. The BN-to-RDN transform comprises a simple, local adjustment of the Bayes net structure and a closed-form transform of the Bayes net parameters. This method can learn an RDN for a dataset with a million tuples in minutes. We empirically compare our approach to state-of-the-art RDN learning methods that use functional gradient boosting, on five benchmark datasets. Learning RDNs via BNs scales much better to large datasets than learning RDNs with boosting, and provides competitive accuracy in predictions.
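
The two-part transform described in the abstract can be illustrated with a small propositional sketch. It ignores the relational aspects entirely and is not the paper's algorithm. It assumes the standard conversion from a Bayesian network to a dependency network: each node's new parents are its Markov blanket (parents, children, and the children's other parents), and its new conditional distribution is the normalized product of its own CPT entry and its children's CPT entries. The names `markov_blanket`, `bn_to_dn_structure`, and `dn_conditional`, the dictionary layout of the CPTs, and the toy network are all hypothetical.

```python
def markov_blanket(parents, node):
    """Parents, children, and co-parents of `node`; the BN is {child: [parents]}."""
    children = [c for c, ps in parents.items() if node in ps]
    blanket = set(parents[node]) | set(children)
    for child in children:
        blanket |= set(parents[child])
    blanket.discard(node)
    return blanket


def bn_to_dn_structure(parents):
    """Local structure adjustment: every node is pointed to by its Markov blanket."""
    return {node: sorted(markov_blanket(parents, node)) for node in parents}


def dn_conditional(parents, cpts, domains, node, evidence):
    """Closed-form parameters: P(node | blanket) is proportional to
    P(node | its parents) times the product of P(child | child's parents)."""
    children = [c for c, ps in parents.items() if node in ps]
    scores = {}
    for value in domains[node]:
        world = {**evidence, node: value}
        score = 1.0
        for n in [node] + children:
            parent_values = tuple(world[p] for p in parents[n])
            score *= cpts[n][parent_values][world[n]]
        scores[value] = score
    total = sum(scores.values())
    return {value: score / total for value, score in scores.items()}


# Toy network A -> C <- B with binary variables (illustrative numbers only).
parents = {"A": [], "B": [], "C": ["A", "B"]}
domains = {"A": [0, 1], "B": [0, 1], "C": [0, 1]}
cpts = {
    "A": {(): {0: 0.5, 1: 0.5}},
    "B": {(): {0: 0.5, 1: 0.5}},
    "C": {
        (0, 0): {0: 0.9, 1: 0.1},
        (0, 1): {0: 0.5, 1: 0.5},
        (1, 0): {0: 0.5, 1: 0.5},
        (1, 1): {0: 0.2, 1: 0.8},
    },
}

dn_structure = bn_to_dn_structure(parents)
p_a = dn_conditional(parents, cpts, domains, "A", {"B": 1, "C": 1})
```

In this toy case the dependency network contains the edges A -> B and B -> A, a cycle that the original Bayesian network cannot express directly. Both functions only touch a node's immediate neighbourhood, which is the sense in which such a transform is local and needs no additional learning pass.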
