MEAPCOMLJan 15, 2021

A practical test for a planted community in heterogeneous networks

arXiv:2101.05928v1
Originality Incremental advance
AI Analysis

This addresses a fundamental task in graph data mining for applications like biology and finance, but it is incremental as it builds on prior work to improve computational efficiency.

The paper tackles the problem of detecting a planted community (dense subgraph) in heterogeneous networks, where existing tests are computationally impractical, and proposes a polynomial-time test with theoretical guarantees and evaluated performance via simulations and real data.

One of the fundamental task in graph data mining is to find a planted community(dense subgraph), which has wide application in biology, finance, spam detection and so on. For a real network data, the existence of a dense subgraph is generally unknown. Statistical tests have been devised to testing the existence of dense subgraph in a homogeneous random graph. However, many networks present extreme heterogeneity, that is, the degrees of nodes or vertexes don't concentrate on a typical value. The existing tests designed for homogeneous random graph are not straightforwardly applicable to the heterogeneous case. Recently, scan test was proposed for detecting a dense subgraph in heterogeneous(inhomogeneous) graph(\cite{BCHV19}). However, the computational complexity of the scan test is generally not polynomial in the graph size, which makes the test impractical for large or moderate networks. In this paper, we propose a polynomial-time test that has the standard normal distribution as the null limiting distribution. The power of the test is theoretically investigated and we evaluate the performance of the test by simulation and real data example.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes