LSCALE: Latent Space Clustering-Based Active Learning for Node Classification
This work aims to improve the efficiency of data annotation for node classification tasks, which is beneficial for researchers and practitioners dealing with large, unlabelled graph datasets.
This paper addresses the challenge of node classification on graphs with limited labels by proposing LSCALE, a latent space clustering-based active learning framework. LSCALE leverages both labelled and unlabelled node representations to select nodes for labelling, achieving consistent and significant performance improvements over state-of-the-art approaches across five datasets.
Node classification on graphs is an important task in many practical domains. It usually requires labels for training, which can be difficult or expensive to obtain in practice. Given a budget for labelling, active learning aims to improve performance by carefully choosing which nodes to label. Previous graph active learning methods learn representations using labelled nodes and select some unlabelled nodes for label acquisition. However, they do not fully utilize the representation power present in unlabelled nodes. We argue that this representation power can further improve the performance of active learning for node classification. In this paper, we propose LSCALE, a latent space clustering-based active learning framework for node classification that fully utilizes the representation power in both labelled and unlabelled nodes. Specifically, to select nodes for labelling, our framework runs the K-Medoids clustering algorithm on a latent space based on a dynamic combination of unsupervised and supervised features. In addition, we design an incremental clustering module to avoid redundancy between nodes selected at different steps. Extensive experiments on five datasets show that LSCALE consistently and significantly outperforms state-of-the-art approaches.
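To make the selection step concrete, the following is a minimal sketch of clustering-based node selection, not the framework's actual implementation. It assumes that the unsupervised and supervised features are already available as NumPy arrays, that the "dynamic combination" can be approximated by a single scalar weight (a hypothetical simplification), and that redundancy across steps can be avoided by keeping previously selected nodes as fixed medoids (also an assumption). The names `combine_features`, `k_medoids`, `weight`, and `fixed` are illustrative only.

```python
import numpy as np

def combine_features(unsup, sup, weight):
    """Build a latent space from two feature blocks.

    `weight` in [0, 1] is a hypothetical scalar that scales the supervised
    block; the unsupervised block is scaled by (1 - weight).
    """
    def l2norm(x):
        return x / np.maximum(np.linalg.norm(x, axis=1, keepdims=True), 1e-12)
    return np.hstack([(1.0 - weight) * l2norm(unsup), weight * l2norm(sup)])

def k_medoids(points, k, fixed=(), n_iter=50, seed=0):
    """Plain alternating K-Medoids; returns indices of k new medoids.

    `fixed` lists medoids that never move (here: nodes chosen in earlier
    steps), so new medoids settle away from them. Assumes distinct points.
    """
    rng = np.random.default_rng(seed)
    n = len(points)
    # Pairwise Euclidean distances between all points.
    dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=2)
    fixed = [int(i) for i in fixed]
    fixed_arr = np.array(fixed, dtype=int)
    candidates = np.setdiff1d(np.arange(n), fixed_arr)
    free = [int(i) for i in rng.choice(candidates, size=k, replace=False)]
    for _ in range(n_iter):
        medoids = np.array(fixed + free, dtype=int)
        # Assign every point to its nearest medoid (fixed or free).
        assign = np.argmin(dist[:, medoids], axis=1)
        new_free = []
        for j in range(len(fixed), len(medoids)):
            members = np.setdiff1d(np.where(assign == j)[0], fixed_arr)
            if len(members) == 0:
                new_free.append(free[j - len(fixed)])  # keep the old medoid
                continue
            # New medoid: the member with the smallest total distance
            # to the other members of its cluster.
            costs = dist[np.ix_(members, members)].sum(axis=0)
            new_free.append(int(members[np.argmin(costs)]))
        if new_free == free:
            break
        free = new_free
    return sorted(free)

# Usage with random stand-in features (not real graph embeddings).
rng = np.random.default_rng(1)
latent = combine_features(rng.normal(size=(30, 4)), rng.normal(size=(30, 3)), 0.25)
step1 = k_medoids(latent, k=3)               # first batch of nodes to label
step2 = k_medoids(latent, k=2, fixed=step1)  # second batch, avoiding step1
```

Medoids are actual data points, so each one corresponds to a concrete node that can be sent for labelling, which is one reason a medoid-based method suits this setting better than a centroid-based one.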