Stochastic Gradient Descent for Relational Logistic Regression via Partial Network Crawls
This work addresses inaccurate parameter estimates in relational learning when data are collected via network crawls, which is common in practice because of access and privacy constraints. It is an incremental extension of the authors' prior crawl-aware estimation method for relational Bayes classifiers.
The authors learn relational logistic regression models from partial network crawls, a more realistic setting than full data access, and show that their proposed stochastic gradient descent method yields accurate parameter estimates and confidence intervals.
Research in statistical relational learning has produced a number of methods for learning relational models from large-scale network data. While these methods have been successfully applied in various domains, they have been developed under the unrealistic assumption of full data access. In practice, however, the data are often collected by crawling the network, due to proprietary access, limited resources, and privacy concerns. Recently, we showed that the parameter estimates for relational Bayes classifiers computed from network samples collected by existing network crawlers can be quite inaccurate, and developed a crawl-aware estimation method for such models (Yang, Ribeiro, and Neville, 2017). In this work, we extend the methodology to learning relational logistic regression models via stochastic gradient descent from partial network crawls, and show that the proposed method yields accurate parameter estimates and confidence intervals.
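For readers unfamiliar with the building blocks, the following is a minimal sketch of plain stochastic gradient descent for a logistic regression model whose features include a relational aggregate over a node's neighbors. It is not the crawl-aware estimator proposed in this work: it assumes the full toy graph is observed and applies no correction for crawl-induced sampling bias. The feature map, function names, graph, and labels are all hypothetical and chosen only for illustration.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def relational_features(node, attrs, neighbors):
    # Hypothetical feature map: bias, the node's own attribute, and the
    # mean attribute over its neighbors (a simple relational aggregate).
    nbrs = neighbors[node]
    mean_nbr = sum(attrs[n] for n in nbrs) / len(nbrs) if nbrs else 0.0
    return [1.0, attrs[node], mean_nbr]


def sgd_logistic(examples, dim, lr=0.1, epochs=200, seed=0):
    # examples: list of (feature_vector, label) pairs with label in {0, 1}.
    rng = random.Random(seed)
    w = [0.0] * dim
    data = list(examples)
    for _ in range(epochs):
        rng.shuffle(data)
        for x, y in data:
            p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)))
            # The log-loss gradient for a single example is (p - y) * x.
            for j in range(dim):
                w[j] -= lr * (p - y) * x[j]
    return w


def predict(w, node, attrs, neighbors):
    x = relational_features(node, attrs, neighbors)
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)))


# Toy, fully observed graph (made-up data, for illustration only).
attrs = {0: 1.0, 1: 1.0, 2: 0.0, 3: 0.0, 4: 1.0, 5: 0.0}
neighbors = {0: [1, 4], 1: [0, 4], 2: [3, 5], 3: [2, 5], 4: [0, 1], 5: [2, 3]}
labels = {0: 1, 1: 1, 2: 0, 3: 0, 4: 1, 5: 0}

examples = [(relational_features(v, attrs, neighbors), labels[v]) for v in attrs]
w = sgd_logistic(examples, dim=3)
```

Calling `predict(w, 0, attrs, neighbors)` gives the fitted probability that node 0 has a positive label. In a partial-crawl setting, the neighbor aggregate and the sampled examples would both be computed from an incomplete, non-uniformly sampled graph, which is the source of the estimation bias the abstract describes.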