SOC-PH · IR · SI · Jan 21, 2020

Random-walk Based Generative Model for Classifying Document Networks

arXiv:2001.07380v1
Originality: Incremental advance
AI Analysis

This work aims to improve classification and community detection in document networks such as citation networks. It is an incremental advance: it extends existing generative models by incorporating node centrality.

The paper addresses a gap in existing generative models for document networks: they do not make full use of network structure and, in particular, omit node centrality. It proposes a generative model in which random walkers on the network bring node centrality into the link generation process. In semi-supervised classification tasks on real-world citation networks, the proposed model outperformed existing probabilistic approaches, especially in detecting communities in connected networks.

Document networks are found in various collections of real-world data, such as citation networks, hyperlinked web pages, and online social networks. A large number of generative models have been proposed because they offer intuitive and useful pictures for analyzing document networks. Prominent examples are relational topic models, where documents are linked according to their topic similarities. However, existing generative models do not make full use of network structures because they depend largely on topic modeling of documents. In particular, the centrality of graph nodes is missing from the generative processes of previous models. In this paper, we propose a novel generative model for document networks that introduces random walkers on networks to integrate node centrality into the link generation process. The developed method is evaluated in semi-supervised classification tasks with real-world citation networks. We show that the proposed model outperforms existing probabilistic approaches, especially in detecting communities in connected networks.
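To make the idea of random-walk centrality concrete, here is a minimal sketch of one common way to score nodes by how often a random walker visits them. It is a PageRank-style power iteration with restarts, chosen for illustration only; the abstract does not specify the walker dynamics or how centrality enters link generation, so this is not the paper's model. The function name `random_walk_centrality`, the `damping` parameter, and the toy network are all assumptions.

```python
def random_walk_centrality(adjacency, damping=0.85, iterations=100):
    """Approximate how often a random walker with restarts visits each node.

    adjacency maps each node to the list of nodes it links to (e.g. cites).
    This is a PageRank-style illustration, not the model from the paper.
    """
    nodes = sorted(adjacency)
    n = len(nodes)
    score = {v: 1.0 / n for v in nodes}  # start from a uniform distribution

    for _ in range(iterations):
        # With probability (1 - damping) the walker restarts at a uniform node.
        updated = {v: (1.0 - damping) / n for v in nodes}
        for v in nodes:
            targets = adjacency[v] or nodes  # dangling node: jump anywhere
            share = damping * score[v] / len(targets)
            for u in targets:
                updated[u] += share
        score = updated
    return score


# Hypothetical toy citation network: an edge a -> b means "a cites b".
toy_network = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
centrality = random_walk_centrality(toy_network)
most_central = max(centrality, key=centrality.get)
```

In this toy network, document `c` is cited by three of the four documents and ends up with the highest score, while `d`, which nobody cites, keeps only the restart probability. A generative model could use such scores to make links to central documents more likely, but how the paper does this is not stated in the abstract.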

