Jan von Pichowski

3papers

1citation

Novelty53%

AI Score37

Ranked #112,737 of 201,326 authors (top 56%)#25,009 in LG (top 59%)

3 Papers

LGMay 7

The Role of Node Features in Graph Pooling

Jan von Pichowski, Alžbeta Hrabošová, Ingo Scholtes et al.

Graph pooling is commonly applied in graph classification, yet its empirical gains over standard WL-1 expressive GNNs are often marginal or inconsistent. We study this gap by analysing the interaction between node features and graph topology and their effect on pooling objectives. Our analysis reveals that pooling operators require node features that are well-aligned with the graph's topology -- a condition often overlooked and not guaranteed in empirical networks. We formalise fundamental requirements for node features to enable effective pooling, and introduce a quantitative measure of feature quality. Our empirical evaluation shows that, when these requirements are satisfied, pooling can be beneficial and improve performance on appropriate datasets.

LGSep 16, 2024

MDL-Pool: Adaptive Multilevel Graph Pooling Based on Minimum Description Length

Jan von Pichowski, Christopher Blöcker, Ingo Scholtes

Graph pooling compresses graphs and summarises their topological properties and features in a vectorial representation. It is an essential part of deep graph representation learning and is indispensable in graph-level tasks like classification or regression. Current approaches pool hierarchical structures in graphs by iteratively applying shallow pooling operators up to a fixed depth. However, they disregard the interdependencies between structures at different hierarchical levels and do not adapt to datasets that contain graphs with different sizes that may require pooling with various depths. To address these issues, we propose MDL-Pool, a pooling operator based on the minimum description length (MDL) principle, whose loss formulation explicitly models the interdependencies between different hierarchical levels and facilitates a direct comparison between multiple pooling alternatives with different depths. MDP-Pool builds on the map equation, an information-theoretic objective function for community detection, which naturally implements Occam's razor and balances between model complexity and goodness-of-fit via the MDL. We demonstrate MDL-Pool's competitive performance in an empirical evaluation against various baselines across standard graph classification datasets.

LGJun 24, 2024

Inference of Sequential Patterns for Neural Message Passing in Temporal Graphs

Jan von Pichowski, Vincenzo Perri, Lisi Qarkaxhija et al.

The modelling of temporal patterns in dynamic graphs is an important current research issue in the development of time-aware GNNs. Whether or not a specific sequence of events in a temporal graph constitutes a temporal pattern not only depends on the frequency of its occurrence. We consider whether it deviates from what is expected in a temporal graph where timestamps are randomly shuffled. While accounting for such a random baseline is important to model temporal patterns, it has mostly been ignored by current temporal graph neural networks. To address this issue we propose HYPA-DBGNN, a novel two-step approach that combines (i) the inference of anomalous sequential patterns in time series data on graphs based on a statistically principled null model, with (ii) a neural message passing approach that utilizes a higher-order De Bruijn graph whose edges capture overrepresented sequential patterns. Our method leverages hypergeometric graph ensembles to identify anomalous edges within both first- and higher-order De Bruijn graphs, which encode the temporal ordering of events. The model introduces an inductive bias that enhances model interpretability. We evaluate our approach for static node classification using benchmark datasets and a synthetic dataset that showcases its ability to incorporate the observed inductive bias regarding over- and under-represented temporal edges. We demonstrate the framework's effectiveness in detecting similar patterns within empirical datasets, resulting in superior performance compared to baseline methods in node classification tasks. To the best of our knowledge, our work is the first to introduce statistically informed GNNs that leverage temporal and causal sequence anomalies. HYPA-DBGNN represents a path for bridging the gap between statistical graph inference and neural graph representation learning, with potential applications to static GNNs.