LG, AI | Nov 15, 2025

Improving Graph Embeddings in Machine Learning Using Knowledge Completion with Validation in a Case Study on COVID-19 Spread

Rosario Napoli, Gabriele Morabito, Antonio Celesti, Massimo Villari, Maria Fazio

arXiv:2511.12071v1 | h-index: 33 | 2025 IEEE International Conference on Knowledge Graph (ICKG)

Originality: Incremental advance

AI Analysis

This work addresses a domain-specific issue in graph machine learning for tasks like node classification and link prediction, offering an incremental improvement.

The paper tackled the problem of graph embeddings missing implicit knowledge in sparse datasets by integrating a knowledge completion phase to uncover latent semantics, which significantly altered embedding space geometry and redefined representation quality.

The rise of graph-structured data has driven major advances in Graph Machine Learning (GML), where graph embeddings (GEs) map features from Knowledge Graphs (KGs) into vector spaces, enabling tasks like node classification and link prediction. However, since GEs are derived from explicit topology and features, they may miss crucial implicit knowledge hidden in seemingly sparse datasets, which affects both the graph structure and its representation. We propose a GML pipeline that integrates a Knowledge Completion (KC) phase to uncover latent dataset semantics before embedding generation. Focusing on transitive relations, we model hidden connections with decay-based inference functions that reshape the graph topology, with consequences for embedding dynamics and aggregation processes in GraphSAGE and Node2Vec. Experiments show that our GML pipeline significantly alters the embedding space geometry, demonstrating that the KC phase is not a simple enrichment but a transformative step that redefines graph representation quality.
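
The idea of completing transitive relations with a decay-based inference function before embedding can be illustrated with a minimal sketch. The abstract does not give the paper's actual decay function, so everything below is an assumption made for illustration: the function name `complete_transitive`, the exponential decay `decay ** (hops - 1)`, and the `max_hops` and `min_weight` cut-offs are all hypothetical. Explicit edges keep weight 1.0, and each inferred edge gets a weight that shrinks with the shortest hop distance between its endpoints.

```python
from collections import deque


def complete_transitive(edges, decay=0.5, max_hops=3, min_weight=0.1):
    """Return a weighted edge dict with inferred transitive edges added.

    Hypothetical sketch: an inferred edge (u, w) reached in `hops` steps
    gets weight decay ** (hops - 1). This is an assumed decay form, not
    the function used in the paper.
    """
    # Build a directed adjacency map from the explicit edges.
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set())

    # Explicit edges keep full weight.
    weighted = {(u, v): 1.0 for u, v in edges}

    for source in adj:
        # Breadth-first search yields the shortest hop distance from source.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            if dist[node] == max_hops:
                continue  # do not expand beyond the hop limit
            for nxt in adj[node]:
                if nxt not in dist:
                    dist[nxt] = dist[node] + 1
                    queue.append(nxt)

        # Nodes at distance >= 2 are not direct neighbours, so these
        # are genuinely new (implicit) connections.
        for target, hops in dist.items():
            if hops >= 2:
                weight = decay ** (hops - 1)
                if weight >= min_weight:
                    weighted[(source, target)] = weight

    return weighted


# A three-edge chain a -> b -> c -> d gains three inferred edges.
completed = complete_transitive([("a", "b"), ("b", "c"), ("c", "d")])
```

The completed, weighted edge set could then be passed to an embedding method such as Node2Vec or GraphSAGE in place of the original sparse graph. How the weights influence random walks or neighbourhood aggregation depends on the embedding implementation and is not specified here.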
