Towards Improved Illicit Node Detection with Positive-Unlabelled Learning
This work aims to improve illicit node detection for blockchain regulation, but it appears incremental: it builds on existing PU learning approaches, focusing on label assumptions and feature distributions.
The paper addresses the challenge of detecting illicit nodes in blockchain networks by examining the impact of hidden positive labels in positive-unlabelled learning, finding that PU classifiers can outperform regular machine learning models when combined with graph representation learning methods.
Detecting illicit nodes on blockchain networks is a valuable task for strengthening future regulation. Recent machine learning-based methods for this task use blockchain transaction datasets in which a small portion of samples is labelled positive and the rest are unlabelled (PU). Although some works assume that a random sample of unlabelled nodes consists of normal nodes, we argue that the labelling mechanism assumed for the hidden positive labels, and its effect on the evaluation metrics, is worth considering. We further find that PU classifiers that account for potential hidden positive labels can perform better than regular machine learning models. To make the results more reliable, we test the PU classifiers with several graph representation learning methods, which yield different feature distributions for the same data.
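The point about evaluation metrics can be illustrated with a small arithmetic sketch. It is not taken from the paper: all counts below are hypothetical, and it assumes, purely for illustration, that labelled positives are a uniformly random 30% of the truly illicit nodes. Under that assumption, scoring a classifier against the observed labels (with unlabelled nodes treated as normal) leaves recall unchanged but understates precision, because flagged hidden positives are counted as false positives.

```python
def precision_recall(tp, fp, fn):
    """Return (precision, recall) from confusion-matrix counts."""
    return tp / (tp + fp), tp / (tp + fn)

# Hypothetical ground truth (never fully observed in a PU dataset).
n_illicit = 100          # truly illicit nodes
n_labelled = 30          # illicit nodes that actually carry a positive label

# Hypothetical classifier output.
flagged_illicit = 60     # flagged nodes that are truly illicit
flagged_normal = 20      # flagged nodes that are truly normal
flagged_labelled = 18    # flagged nodes carrying a positive label (30% of 60)

# Metrics against the true labels.
true_precision, true_recall = precision_recall(
    tp=flagged_illicit,
    fp=flagged_normal,
    fn=n_illicit - flagged_illicit,
)

# Metrics against the observed labels, treating every unlabelled node as normal:
# flagged hidden positives are now counted as false positives.
obs_precision, obs_recall = precision_recall(
    tp=flagged_labelled,
    fp=(flagged_illicit - flagged_labelled) + flagged_normal,
    fn=n_labelled - flagged_labelled,
)

print(f"true:     precision={true_precision:.3f} recall={true_recall:.3f}")
print(f"observed: precision={obs_precision:.3f} recall={obs_recall:.3f}")
```

With these counts the observed precision is the true precision scaled by the labelled fraction (0.3), while recall is the same in both views. If labelled positives are not a random sample of the illicit nodes, neither relationship is guaranteed to hold, which is why the labelling mechanism matters when interpreting reported metrics.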