LG | AI | Jan 31, 2025

A Metric for the Balance of Information in Graph Learning

arXiv:2501.19137v1 | h-index: 16
Originality: Incremental advance
AI Analysis

This work addresses an open issue in graph learning on molecular datasets: deciding whether a model should be biased towards structure or towards features. The contribution is incremental, since it builds on existing biasing methods.

The authors tackle the problem of identifying whether a molecular graph dataset favours structural or feature information. They propose the Noise-Noise Ratio Difference (NNRD) metric, which quantifies information degradation under iterative noising, and report intuitive results across a range of molecular tasks.

Graph learning on molecules makes use of information from both the molecular structure and the features attached to that structure. Much work has been conducted on biasing either towards structure or features, with the aim that bias bolsters performance. Identifying which information source a dataset favours, and therefore how to approach learning that dataset, is an open issue. Here we propose Noise-Noise Ratio Difference (NNRD), a quantitative metric for whether there is more useful information in structure or features. By employing iterative noising on features and structure independently, leaving the other intact, NNRD measures the degradation of information in each. We employ NNRD over a range of molecular tasks, and show that it corresponds well to a loss of information, with intuitive results that are more expressive than simple performance aggregates. Our future work will focus on expanding data domains, tasks and types, as well as refining our choice of baseline model.
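
The noising procedure described in the abstract can be illustrated with a small sketch. This is not the paper's implementation, and the abstract does not give the NNRD formula. The function names, the Gaussian feature noise, the random edge resampling, the `score` callback, and the final difference of score ratios are all assumptions made for illustration. The sketch only shows the general idea: noise one information source at a time, keep the other intact, and compare how quickly a score degrades.

```python
import random
from statistics import fmean, pstdev


def noise_features(x, t, rng):
    """Blend node features with Gaussian noise (assumed noise model).

    t = 0 returns the features unchanged; t = 1 returns pure noise drawn
    with the same mean and standard deviation as the original features.
    """
    flat = [v for row in x for v in row]
    mu, sigma = fmean(flat), pstdev(flat)
    return [[(1.0 - t) * v + t * rng.gauss(mu, sigma) for v in row] for row in x]


def noise_structure(edges, n, t, rng):
    """Resample each node pair with probability t (assumed noise model).

    `edges` is a set of undirected edges (i, j) with i < j over n >= 2 nodes.
    A resampled pair becomes an edge with probability equal to the original
    edge density, so t = 1 gives a random graph of similar density.
    """
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    density = len(edges) / len(pairs)
    out = set()
    for p in pairs:
        if rng.random() < t:
            keep = rng.random() < density
        else:
            keep = p in edges
        if keep:
            out.add(p)
    return out


def degradation_difference(x, edges, n, score, levels, rng):
    """Illustrative difference of degradation ratios, NOT the paper's NNRD.

    `score(x, edges)` is a hypothetical callback returning a positive
    performance value, for example the validation score of a trained model.
    A positive result means feature noise hurt the score more than
    structure noise did (the sign convention is an assumption).
    """
    clean = score(x, edges)
    feat_ratio = [score(noise_features(x, t, rng), edges) / clean for t in levels]
    struct_ratio = [score(x, noise_structure(edges, n, t, rng)) / clean for t in levels]
    return fmean(s - f for s, f in zip(struct_ratio, feat_ratio))
```

In practice `score` would wrap a model evaluation, and the noise levels would be swept from 0 to 1. A toy `score` that ignores the edges yields a positive value, since only feature noise can lower it.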
