LG | Dec 13, 2023

ERASE: Error-Resilient Representation Learning on Graphs for Label Noise Tolerance

Ling-Hao Chen, Yuanshuo Zhang, Taohua Huang, Liangcai Su, Zeyi Lin, Xi Xiao, Xiaobo Xia, Tongliang Liu

Tsinghua

arXiv:2312.08852v2 | 13.018 citations | h-index: 22 | Has Code | CIKM

Originality: Incremental advance

AI Analysis

This paper addresses the challenge of noisy labels in graph data for deep learning applications, offering a robust solution that is incremental but effective for domain-specific tasks.

The paper tackles label noise in graph-based tasks by proposing ERASE, a method that learns error-resilient representations through decoupled label propagation and structural denoising. The result is improved generalization in node classification, with clear margins over baselines across a broad range of noise levels.

Deep learning has achieved remarkable success in graph-related tasks, yet this success relies heavily on large-scale, high-quality annotated datasets. Acquiring such datasets can be cost-prohibitive, so in practice labels are often obtained from cheaper sources such as web searches and user tags. Unfortunately, these labels are often noisy, which compromises the generalization performance of deep networks. To tackle this challenge and enhance the robustness of deep learning models against label noise in graph-based tasks, we propose a method called ERASE (Error-Resilient representation learning on graphs for lAbel noiSe tolerancE). The core idea of ERASE is to learn error-tolerant representations by maximizing coding rate reduction. In particular, we introduce a decoupled label propagation method for learning representations. Before training, noisy labels are pre-corrected through structural denoising. During training, ERASE combines prototype pseudo-labels with propagated denoised labels and updates representations with error resilience, which significantly improves generalization performance in node classification. The proposed method withstands errors caused by mislabeled nodes more effectively, thereby strengthening the robustness of deep networks in handling noisy graph data. Extensive experimental results show that our method outperforms multiple baselines by clear margins across a broad range of noise levels and scales well. Code is released at https://github.com/eraseai/erase.
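The abstract names coding rate reduction as the training objective but does not spell it out. As a rough illustration only, the sketch below computes the commonly used form of that quantity: the coding rate of all representations minus the size-weighted coding rates of each class. The function names, the `eps` distortion parameter, and the use of NumPy are assumptions made for this example; the released ERASE code may define or scale the objective differently.

```python
import numpy as np


def coding_rate(Z, eps=0.5):
    """Coding rate R(Z) = 1/2 * logdet(I + d / (n * eps^2) * Z Z^T).

    Z has shape (d, n): one d-dimensional representation per column.
    eps is an assumed distortion parameter, not a value from the paper.
    """
    d, n = Z.shape
    scaled_gram = (d / (n * eps**2)) * (Z @ Z.T)
    # slogdet is numerically safer than log(det(...)).
    _, logdet = np.linalg.slogdet(np.eye(d) + scaled_gram)
    return 0.5 * logdet


def coding_rate_reduction(Z, labels, eps=0.5):
    """Rate of the whole set minus the size-weighted rates of each class."""
    n = Z.shape[1]
    whole = coding_rate(Z, eps)
    per_class = 0.0
    for c in np.unique(labels):
        Zc = Z[:, labels == c]          # columns assigned to class c
        per_class += (Zc.shape[1] / n) * coding_rate(Zc, eps)
    return whole - per_class


# Toy check: two classes on orthogonal directions vs. one shared direction.
labels = np.array([0, 0, 1, 1])
Z_separated = np.array([[1.0, 1.0, 0.0, 0.0],
                        [0.0, 0.0, 1.0, 1.0]])
Z_collapsed = np.array([[1.0, 1.0, 1.0, 1.0],
                        [0.0, 0.0, 0.0, 0.0]])
delta_separated = coding_rate_reduction(Z_separated, labels)
delta_collapsed = coding_rate_reduction(Z_collapsed, labels)
```

In this toy case the reduction is positive when the two classes occupy orthogonal directions and zero when every node shares one direction, which is the intuition behind maximizing it: representations of different classes are pushed apart while each class stays compact.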

