LG · Apr 29, 2025

Graph Synthetic Out-of-Distribution Exposure with Large Language Models

arXiv:2504.21198v2 · 6 citations · h-index: 13
Originality: Incremental advance
AI Analysis

This work addresses model robustness in open-world graph applications and offers a practical option for safety-sensitive domains where real out-of-distribution (OOD) data is costly or unavailable. The advance is incremental, since it builds on existing OOD exposure techniques.

The paper tackles out-of-distribution (OOD) detection in graphs by proposing GOE-LLM, a framework that uses Large Language Models to identify pseudo-OOD nodes and to generate synthetic OOD nodes, without using any real OOD data. It reports up to a 23.5% improvement in AUROC for OOD detection over methods without OOD exposure.
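For readers unfamiliar with the metric, AUROC for OOD detection can be read as the probability that a randomly chosen OOD node receives a higher OOD score than a randomly chosen ID node. The sketch below computes it by direct pairwise comparison. The function name `auroc` and the example scores are made up for illustration and are not taken from the paper.

```python
def auroc(ood_scores, id_scores):
    """Fraction of (OOD, ID) pairs ranked correctly; ties count as half."""
    wins = 0.0
    for s_ood in ood_scores:
        for s_id in id_scores:
            if s_ood > s_id:
                wins += 1.0
            elif s_ood == s_id:
                wins += 0.5
    return wins / (len(ood_scores) * len(id_scores))


# Hypothetical OOD scores (higher means "more likely OOD"), for illustration only.
id_scores = [0.10, 0.20, 0.35, 0.40]
ood_scores = [0.30, 0.60, 0.70, 0.90]

print(auroc(ood_scores, id_scores))  # 14 of 16 pairs are ranked correctly -> 0.875
```

A detector that ranks every OOD node above every ID node scores 1.0, and a detector that assigns scores at random scores about 0.5.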

Out-of-distribution (OOD) detection in graphs is critical for ensuring model robustness in open-world and safety-sensitive applications. Existing graph OOD detection approaches typically train an in-distribution (ID) classifier on ID data alone, then apply post-hoc scoring to detect OOD instances. While OOD exposure - adding auxiliary OOD samples during training - can improve detection, current graph-based methods often assume access to real OOD nodes, which is often impractical or costly. In this paper, we present GOE-LLM, a framework that leverages Large Language Models (LLMs) to achieve OOD exposure on text-attributed graphs without using any real OOD nodes. GOE-LLM introduces two pipelines: (1) identifying pseudo-OOD nodes from the initially unlabeled graph using zero-shot LLM annotations, and (2) generating semantically informative synthetic OOD nodes via LLM-prompted text generation. These pseudo-OOD nodes are then used to regularize ID classifier training and enhance OOD detection awareness. Empirical results on multiple benchmarks show that GOE-LLM substantially outperforms state-of-the-art methods without OOD exposure, achieving up to a 23.5% improvement in AUROC for OOD detection, and attains performance on par with those relying on real OOD labels for exposure.
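The abstract does not spell out the regularizer, so the following is only a minimal sketch of how OOD exposure is commonly implemented, not the GOE-LLM objective itself. It assumes a uniform-target penalty on pseudo-OOD nodes (in the style of outlier exposure) added to the usual cross-entropy on labeled ID nodes, and a maximum-softmax-probability score for post-hoc detection. All function names, the `weight` parameter, and the example logits are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of class logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def id_cross_entropy(logits, label):
    """Standard cross-entropy for a labeled in-distribution node."""
    return -math.log(softmax(logits)[label])


def uniform_regularizer(logits):
    """Cross-entropy between a uniform target over ID classes and the prediction.

    Assumed form of the exposure term: it is smallest when the classifier
    is maximally uncertain on a pseudo-OOD node.
    """
    probs = softmax(logits)
    return -sum(math.log(p) for p in probs) / len(probs)


def exposure_loss(id_batch, pseudo_ood_batch, weight=0.5):
    """ID classification loss plus a weighted penalty on pseudo-OOD nodes.

    id_batch: list of (logits, label) pairs for labeled ID nodes.
    pseudo_ood_batch: list of logits for LLM-identified or LLM-generated nodes.
    """
    id_term = sum(id_cross_entropy(z, y) for z, y in id_batch) / len(id_batch)
    ood_term = sum(uniform_regularizer(z) for z in pseudo_ood_batch) / len(pseudo_ood_batch)
    return id_term + weight * ood_term


def msp_ood_score(logits):
    """Post-hoc OOD score: one minus the maximum softmax probability."""
    return 1.0 - max(softmax(logits))


# Hypothetical logits for a 3-class ID problem.
id_batch = [([4.0, 0.5, 0.1], 0), ([0.2, 3.5, 0.3], 1)]
pseudo_ood_batch = [[2.5, 0.1, 0.2], [0.3, 0.4, 0.2]]

print(exposure_loss(id_batch, pseudo_ood_batch))
print(msp_ood_score([0.0, 0.0, 0.0]) > msp_ood_score([5.0, 0.0, 0.0]))
```

Under these assumptions, training pushes predictions on pseudo-OOD nodes toward uniform, so the same post-hoc score separates ID and OOD nodes more cleanly at test time.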
