Louis Van Langendonck

LG
h-index: 29
Papers: 4
Citations: 9
Novelty: 54%
AI Score: 45

4 Papers

52.8 | LG | May 15 | Code
OgBench: A Framework for Evaluating Graph Neural Networks on Omics Data

Louisa Cornelis, Johan Mathe, Louis Van Langendonck et al.

Graph Neural Networks (GNNs) have become the dominant framework for inductive graph-level learning. Yet most benchmarks focus on the regime $n \gg p$, where the number of graphs $n$ greatly exceeds the number of nodes per graph $p$. This overlooks biological domains such as omics, which operate in the opposite $n \ll p$ regime, characterized by large graphs of genes, transcripts, or proteins across few patient samples. This raises the question: how do GNNs perform in this low-sample, high-node omics setting? We introduce OgBench (Omics-Graph Bench), the first benchmarking platform for graph-level prediction in the $n \ll p$ regime characteristic of omics data. We provide a standardized, end-to-end modular infrastructure from raw omics data to families of featured graphs with varied structural properties. We benchmark classical GNNs, as well as GNNs designed for large graphs and omics applications, alongside MLPs and machine learning baselines to establish reference performances. Our results show that widely used GNNs often do not outperform simple MLPs and classical baselines. These findings challenge the prevailing assumption that graph structure inherently adds value in this domain, fostering a critical reassessment of current learning paradigms. Ultimately, by exposing these limitations, OgBench provides the open-source ecosystem necessary for the community to develop and validate novel architectures explicitly tailored for biological graphs. The code is available at https://github.com/geometric-intelligence/ogbench.

LG | Sep 12, 2024
Towards a graph-based foundation model for network traffic analysis

Louis Van Langendonck, Ismael Castell-Uroz, Pere Barlet-Ros

Foundation models have shown great promise in various fields of study. A potential application of such models is in computer network traffic analysis, where these models can grasp the complexities of network traffic dynamics and adapt to any specific task or network environment with minimal fine-tuning. Previous approaches have used tokenized hex-level packet data and the model architecture of large language transformer models. We propose a new, efficient graph-based alternative at the flow level. Our approach represents network traffic as a dynamic spatio-temporal graph, employing a self-supervised link prediction pretraining task to capture the spatial and temporal dynamics in this network graph framework. To evaluate the effectiveness of our approach, we conduct a few-shot learning experiment for three distinct downstream network tasks: intrusion detection, traffic classification, and botnet classification. Models fine-tuned from our pretrained base achieve an average performance increase of 6.87% over training from scratch, demonstrating their ability to effectively learn general network traffic dynamics during pretraining. This success suggests the potential for a large-scale version to serve as an operational foundation model.
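
As a rough illustration of what a flow-level dynamic graph can look like, the sketch below buckets flow records into fixed time windows and emits one edge list per window. It is not the authors' pipeline: the function name `flows_to_snapshots`, the `(timestamp, src, dst)` record layout, and the fixed-window scheme are all assumptions made for this example.

```python
from collections import defaultdict


def flows_to_snapshots(flows, window_seconds):
    """Group (timestamp, src, dst) flow records into per-window edge lists.

    Hypothetical helper: the record layout and the fixed-window scheme are
    assumptions for illustration, not the method described in the abstract.
    """
    if window_seconds <= 0:
        raise ValueError("window_seconds must be positive")
    buckets = defaultdict(list)
    for timestamp, src, dst in flows:
        # Integer index of the time window this flow falls into.
        window_index = int(timestamp // window_seconds)
        buckets[window_index].append((src, dst))
    # Return the snapshots ordered by time window.
    return {index: buckets[index] for index in sorted(buckets)}


# Toy flow records: timestamps in seconds, endpoints as host identifiers.
flows = [
    (0.5, "10.0.0.1", "10.0.0.2"),
    (3.0, "10.0.0.2", "10.0.0.3"),
    (12.1, "10.0.0.1", "10.0.0.3"),
]
snapshots = flows_to_snapshots(flows, window_seconds=10)
```

Each snapshot is a plain edge list, so it can be handed to any graph library; a real system would also attach flow features to the edges.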

LG | Sep 25, 2025
GraphUniverse: Enabling Systematic Evaluation of Inductive Generalization

Louis Van Langendonck, Guillermo Bernárdez, Nina Miolane et al.

A fundamental challenge in graph learning is understanding how models generalize to new, unseen graphs. While synthetic benchmarks offer controlled settings for analysis, existing approaches are confined to single-graph, transductive settings where models train and test on the same graph structure. Addressing this gap, we introduce GraphUniverse, a framework for generating entire families of graphs to enable the first systematic evaluation of inductive generalization at scale. Our core innovation is the generation of graphs with persistent semantic communities, ensuring conceptual consistency while allowing fine-grained control over structural properties like homophily and degree distributions. This enables crucial but underexplored robustness tests, such as performance under controlled distribution shifts. Benchmarking a wide range of architectures -- from GNNs to graph transformers and topological architectures -- reveals that strong transductive performance is a poor predictor of inductive generalization. Furthermore, we find that robustness to distribution shift is highly sensitive not only to model architecture choice but also to the initial graph regime (e.g., high vs. low homophily). Beyond benchmarking, GraphUniverse's flexibility and scalability can facilitate the development of robust and truly generalizable architectures -- including next-generation graph foundation models. An interactive demo is available at https://graphuniverse.streamlit.app.
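
For readers unfamiliar with homophily, one common definition is the edge homophily ratio: the fraction of edges whose endpoints share a label. The sketch below computes it on a toy graph. The abstract does not say which homophily measure GraphUniverse controls, so treat this definition, the function name `edge_homophily`, and the toy data as assumptions for illustration only.

```python
def edge_homophily(edges, labels):
    """Return the fraction of edges whose two endpoints share a label.

    Illustrative helper using the edge homophily ratio; this is an assumed
    definition, not necessarily the one used by GraphUniverse.
    """
    if not edges:
        raise ValueError("graph has no edges")
    same_label = sum(1 for u, v in edges if labels[u] == labels[v])
    return same_label / len(edges)


# Toy graph with two communities: nodes 0-2 are "a", nodes 3-4 are "b".
labels = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b"}
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (2, 3)]

# Four of the five edges stay within a community; (2, 3) crosses.
ratio = edge_homophily(edges, labels)
```

A ratio near 1 means mostly within-community edges (high homophily), while a ratio near 0 means mostly cross-community edges (low homophily).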

LG | Jun 19, 2024
PPT-GNN: A Practical Pre-Trained Spatio-Temporal Graph Neural Network for Network Security

Louis Van Langendonck, Ismael Castell-Uroz, Pere Barlet-Ros

Recent works have demonstrated the potential of Graph Neural Networks (GNNs) for network intrusion detection. Despite their advantages, a significant gap persists between real-world scenarios, where detection speed is critical, and existing proposals, which operate on large graphs representing several hours of traffic. This gap results in unrealistic operational conditions and impractical detection delays. Moreover, existing models do not generalize well across different networks, hampering their deployment in production environments. To address these issues, we introduce PPTGNN, a practical spatio-temporal GNN for intrusion detection. PPTGNN enables near real-time predictions while better capturing the spatio-temporal dynamics of network attacks. PPTGNN employs self-supervised pre-training for improved performance and reduced dependency on labeled data. We evaluate PPTGNN on three public datasets and show that it significantly outperforms state-of-the-art models, such as E-ResGAT and E-GraphSAGE, with an average accuracy improvement of 10.38%. Finally, we show that a pre-trained PPTGNN can easily be fine-tuned to unseen networks with minimal labeled examples. This highlights the potential of PPTGNN as a general, large-scale pre-trained model that can effectively operate in diverse network environments.