ML LG ME Jun 27, 2021

Interpretable Network Representation Learning with Principal Component Analysis

arXiv:2106.14238v11.9 Has Code

Originality: Incremental advance

AI Analysis

This work addresses the need for interpretable methods for analyzing network-valued data, such as brain connectivity or social networks. It appears to be an incremental advance that builds on the existing literature.

The authors propose the PCAN and sPCAN algorithms, which use subgraph count statistics to produce interpretable low-dimensional representations of network samples. They assess PCAN on visualizing, clustering, and classifying real-world network data, including functional connectivity networks and U.S. Senate co-voting networks.

We consider the problem of interpretable network representation learning for samples of network-valued data. We propose the Principal Component Analysis for Networks (PCAN) algorithm to identify statistically meaningful low-dimensional representations of a network sample via subgraph count statistics. The PCAN procedure provides an interpretable framework with which one can readily visualize, explore, and formulate predictive models for network samples. We furthermore introduce a fast sampling-based algorithm, sPCAN, which is significantly more computationally efficient than its counterpart, but still enjoys advantages of interpretability. We investigate the relationship between these two methods and analyze their large-sample properties under the common regime where the sample of networks is a collection of kernel-based random graphs. We show that under this regime, the embeddings of the sPCAN method enjoy a central limit theorem and moreover that the population level embeddings of PCAN and sPCAN are equivalent. We assess PCAN's ability to visualize, cluster, and classify observations in network samples arising in nature, including functional connectivity network samples and dynamic networks describing the political co-voting habits of the U.S. Senate. Our analyses reveal that our proposed algorithm provides informative and discriminatory features describing the networks in each sample. The PCAN and sPCAN methods build on the current literature of network representation learning and set the stage for a new line of research in interpretable learning on network-valued data. Software for the PCAN and sPCAN methods is publicly available at https://www.github.com/jihuilee/.
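
The general idea of embedding a network sample through subgraph count statistics followed by principal component analysis can be illustrated with a minimal sketch. This is not the authors' implementation. The choice of subgraphs (edges, two-stars, triangles), the column standardization, the power-iteration PCA, and all function names (`subgraph_densities`, `standardize`, `leading_component`, `random_graph`) are assumptions made for illustration, and the Erdős–Rényi graphs in the demo stand in for real network samples.

```python
import math
import random
from itertools import combinations


def subgraph_densities(adj):
    """Edge, two-star, and triangle densities of an undirected simple graph.

    adj is a symmetric 0/1 adjacency matrix (list of lists) with n >= 3 nodes.
    The subgraph choice is an assumption, not taken from the paper.
    """
    n = len(adj)
    deg = [sum(row) for row in adj]
    edges = sum(deg) / 2
    two_stars = sum(d * (d - 1) / 2 for d in deg)
    triangles = sum(
        1
        for i, j, k in combinations(range(n), 3)
        if adj[i][j] and adj[j][k] and adj[i][k]
    )
    # Normalize each count by the number of places the subgraph could occur.
    return [
        edges / math.comb(n, 2),
        two_stars / (3 * math.comb(n, 3)),
        triangles / math.comb(n, 3),
    ]


def standardize(rows):
    """Center each column and scale it to unit sample standard deviation."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    sds = [
        math.sqrt(sum((x - m) ** 2 for x in c) / (len(c) - 1)) or 1.0
        for c, m in zip(cols, means)
    ]
    return [[(x - m) / s for x, m, s in zip(r, means, sds)] for r in rows]


def leading_component(rows, iters=500):
    """First principal axis of centered rows, found by power iteration."""
    n, p = len(rows), len(rows[0])
    cov = [
        [sum(r[a] * r[b] for r in rows) / (n - 1) for b in range(p)]
        for a in range(p)
    ]
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(iters):
        w = [sum(cov[a][b] * v[b] for b in range(p)) for a in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0:
            break
        v = [x / norm for x in w]
    return v


def random_graph(n, p, rng):
    """Erdős–Rényi graph, used here only as stand-in data."""
    adj = [[0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        if rng.random() < p:
            adj[i][j] = adj[j][i] = 1
    return adj


rng = random.Random(0)
sample = [random_graph(20, 0.1, rng) for _ in range(10)]
sample += [random_graph(20, 0.5, rng) for _ in range(10)]

features = standardize([subgraph_densities(g) for g in sample])
loading = leading_component(features)
# One-dimensional embedding: projection of each network onto the first axis.
scores = [sum(x * w for x, w in zip(row, loading)) for row in features]
```

Each entry of `loading` is the weight of one subgraph density in the first component, which is what makes an embedding of this kind readable: a network's score can be traced back to how many edges, two-stars, and triangles it contains. A sampling-based variant in the spirit of sPCAN would replace the exact counts with estimates from sampled node subsets; that step is not sketched here.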
