LG · AI · ML · Oct 2, 2022

Metric Distribution to Vector: Constructing Data Representation via Broad-Scale Discrepancies

arXiv:2210.00415v1 · 1 citation · h-index: 37
Originality Highly original
AI Analysis

This paper addresses graph classification for researchers and practitioners by proposing a novel embedding approach that improves performance over existing methods.

The authors address graph classification with MetricDistribution2vec, a novel embedding strategy that encodes the broad-scale distribution of pairwise distances across the dataset into a vector representation for each graph. They report unexpected gains over the baselines on all tested real-world graph datasets, even with lightweight classifiers, and attractive discrimination in few-shot learning scenarios.

Graph embedding provides a feasible methodology for pattern classification of graph-structured data by mapping each data item into a vector space. Most pioneering works are essentially coding methods that concentrate on a vectorial representation of a graph's inner properties, such as its topological constitution, node attributes, and link relations. However, classifying each target item is a qualitative problem that depends on understanding the overall discrepancies across the whole dataset. From a statistical point of view, these discrepancies manifest as a metric distribution over the dataset when a distance metric is adopted to measure pairwise similarity or dissimilarity. Therefore, we present a novel embedding strategy named $\mathbf{MetricDistribution2vec}$ that extracts such distribution characteristics into a vectorial representation for each data item. We demonstrate the application and effectiveness of our representation method in supervised prediction tasks on extensive real-world structural graph datasets. The results show unexpected gains over a wide range of baselines on all the datasets, even when lightweight models are used as classifiers. Moreover, we evaluate the proposed method in few-shot classification scenarios, where the results still show attractive discrimination when inference relies on few training samples.
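The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea it describes: representing each item by the distribution of its distances to the rest of the dataset. It is not the authors' MetricDistribution2vec. The function name `distance_distribution_embedding`, the Euclidean metric, the fixed-width histogram, and the toy per-graph descriptors are all assumptions made for illustration.

```python
import numpy as np


def distance_distribution_embedding(features, n_bins=8):
    """Embed each item as a normalized histogram of its distances to all other items.

    Illustrative assumption only: Euclidean distance on precomputed per-graph
    descriptors stands in for whatever graph metric the paper actually uses.
    """
    features = np.asarray(features, dtype=float)
    n = features.shape[0]

    # Pairwise Euclidean distances, shape (n, n).
    diffs = features[:, None, :] - features[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=-1))

    # Shared bin edges so every embedding is expressed on the same scale.
    upper = dists.max()
    if upper == 0.0:
        upper = 1.0  # all items identical; avoid zero-width bins
    edges = np.linspace(0.0, upper, n_bins + 1)

    embeddings = np.empty((n, n_bins))
    for i in range(n):
        others = np.delete(dists[i], i)  # drop the self-distance
        counts, _ = np.histogram(others, bins=edges)
        embeddings[i] = counts / max(len(others), 1)
    return embeddings


# Hypothetical toy descriptors, e.g. (node count, edge count) per graph.
toy_descriptors = np.array([[4, 3], [5, 4], [4, 4], [20, 45]])
print(distance_distribution_embedding(toy_descriptors, n_bins=4))
```

Each row of the result is a fixed-length vector that could be passed to a lightweight classifier. An item far from the rest of the dataset concentrates its mass in the highest-distance bins, which is the kind of dataset-level discrepancy the abstract refers to.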
