Explainable Subgraphs with Surprising Densities: A Subgroup Discovery Approach
This work addresses the need for actionable insights in graph analysis, for example in social networks, by providing explainable patterns. The contribution is incremental in that it builds on existing subgroup discovery approaches.
The paper tackles the problem of discovering interpretable patterns in graph connectivity by identifying pairs of node subgroups with surprisingly high or low edge densities, using an information-theoretic measure of interestingness that accounts for prior analyst knowledge. It generalizes prior dense subgraph mining methods, supports iterative mining of patterns, and shows practical advantages in the single-subgroup special case.
The connectivity structure of graphs is typically related to the attributes of the nodes. In social networks, for example, the probability of a friendship between two people depends on their attributes, such as their age, address, and hobbies. The connectivity of a graph can thus possibly be understood in terms of patterns of the form 'individuals with properties X are often (or rarely) friends with individuals in another subgroup with properties Y'. Such rules present potentially actionable and generalizable insights into the graph. We present a method that finds pairs of node subgroups between which the edge density is interestingly high or low, using an information-theoretic definition of interestingness. This interestingness is quantified subjectively, by contrasting a pattern with the prior information an analyst may have about the graph. This view immediately enables iterative mining of such patterns. Our work generalizes prior work on dense subgraph mining (i.e., subgraphs induced by a single subgroup). Moreover, not only is the proposed method more general, we also demonstrate considerable practical advantages for the single-subgroup special case.
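To make the pattern form concrete, the following minimal sketch computes the edge density between two attribute-defined subgroups on a toy graph. It does not implement the information-theoretic interestingness measure, which is not specified here. As a naive stand-in, it only compares the density between the subgroups with the overall density of the graph. The graph, the attribute values, and the helper names `subgroup` and `edge_density` are all hypothetical.

```python
from itertools import product

# Toy undirected graph: node -> attributes (all values invented for illustration).
nodes = {
    "a": {"age": "young", "hobby": "chess"},
    "b": {"age": "young", "hobby": "chess"},
    "c": {"age": "old", "hobby": "golf"},
    "d": {"age": "old", "hobby": "golf"},
    "e": {"age": "old", "hobby": "chess"},
}
edges = {
    frozenset(e)
    for e in [("a", "c"), ("a", "d"), ("b", "c"), ("b", "d"), ("c", "e")]
}


def subgroup(condition):
    """Return the nodes whose attributes match every key/value in `condition`."""
    return {
        n
        for n, attrs in nodes.items()
        if all(attrs.get(k) == v for k, v in condition.items())
    }


def edge_density(group_x, group_y):
    """Fraction of distinct unordered pairs {u, v}, u in X, v in Y, u != v, that are edges."""
    pairs = {frozenset((u, v)) for u, v in product(group_x, group_y) if u != v}
    if not pairs:
        return 0.0
    return sum(1 for p in pairs if p in edges) / len(pairs)


young = subgroup({"age": "young"})
golfers = subgroup({"hobby": "golf"})

between = edge_density(young, golfers)          # density between the two subgroups
overall = edge_density(set(nodes), set(nodes))  # density of the whole graph
within_young = edge_density(young, young)       # single-subgroup special case

print(f"young vs golfers: {between:.2f}, overall: {overall:.2f}, "
      f"within young: {within_young:.2f}")
```

In this toy graph, the pair (age = young, hobby = golf) has a density well above the overall density, while the subgraph induced by the young subgroup alone has none of its possible edges. A subjective measure would instead compare each density with what the analyst already expects, rather than with the global average used here.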