ML CG LG AT · Apr 20, 2019

PersLay: A Neural Network Layer for Persistence Diagrams and New Graph Topological Signatures

Mathieu Carrière, Frédéric Chazal, Yuichi Ike, Théo Lacombe, Martin Royer, Yuhei Umeda

arXiv:1904.09378v4 · 28.0 · 247 citations · Has Code

Originality: Incremental advance

AI Analysis

This work addresses the challenge of integrating topological descriptors (persistence diagrams) into machine learning models for graph analysis. It offers a versatile framework that could benefit researchers in data science and graph-based applications, though it builds incrementally on existing vectorization methods.

The authors address the difficulty of using persistence diagrams from Topological Data Analysis as inputs for machine learning. They propose PersLay, a neural network layer that learns vectorizations of persistence diagrams, and report competitive classification scores on real-life graph datasets.

Persistence diagrams, the most common descriptors of Topological Data Analysis, encode topological properties of data and have already proved pivotal in many different applications of data science. However, since the (metric) space of persistence diagrams is not Hilbert, they end up being difficult inputs for most Machine Learning techniques. To address this concern, several vectorization methods have been put forward that embed persistence diagrams into either finite-dimensional Euclidean space or (implicit) infinite-dimensional Hilbert space with kernels. In this work, we focus on persistence diagrams built on top of graphs. Relying on extended persistence theory and the so-called heat kernel signature, we show how graphs can be encoded by (extended) persistence diagrams in a provably stable way. We then propose a general and versatile framework for learning vectorizations of persistence diagrams, which encompasses most of the vectorization techniques used in the literature. We finally showcase the experimental strength of our setup by achieving competitive scores on classification tasks on real-life graph datasets.
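As a rough illustration of two ingredients named in the abstract, the NumPy sketch below computes a heat kernel signature on a graph and a permutation-invariant vectorization of a persistence diagram. It is a minimal sketch under stated assumptions, not the authors' code: the function names `heat_kernel_signature` and `vectorize_diagram` are hypothetical, the Gaussian point transformation, persistence weighting, and sum aggregation are choices made here for illustration, and the parameters (`centers`, `sigma`) are fixed rather than learned as they would be in a neural network layer. The extended persistence step that would turn the signature into a diagram is omitted.

```python
import numpy as np


def heat_kernel_signature(adjacency, t):
    """Heat kernel signature at every vertex for diffusion time t.

    Uses the normalized graph Laplacian L = I - D^{-1/2} A D^{-1/2} and
    returns sum_k exp(-t * lambda_k) * psi_k(v)^2 for each vertex v.
    """
    adjacency = np.asarray(adjacency, dtype=float)
    degrees = adjacency.sum(axis=1)
    # Isolated vertices (degree 0) get a zero scaling factor instead of 1/0.
    inv_sqrt = np.where(degrees > 0, 1.0 / np.sqrt(np.maximum(degrees, 1e-12)), 0.0)
    laplacian = np.eye(len(adjacency)) - inv_sqrt[:, None] * adjacency * inv_sqrt[None, :]
    eigvals, eigvecs = np.linalg.eigh(laplacian)  # symmetric matrix, real spectrum
    return (np.exp(-t * eigvals)[None, :] * eigvecs ** 2).sum(axis=1)


def vectorize_diagram(diagram, centers, sigma=0.5):
    """Map a diagram (array of (birth, death) pairs) to a fixed-size vector.

    Each point p is sent to Gaussian features phi(p) around `centers`,
    weighted by its persistence w(p) = |death - birth|, and the weighted
    features are summed, so the result ignores the order of the points.
    """
    diagram = np.asarray(diagram, dtype=float).reshape(-1, 2)
    centers = np.asarray(centers, dtype=float)
    weights = np.abs(diagram[:, 1] - diagram[:, 0])
    sq_dists = ((diagram[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    features = np.exp(-sq_dists / (2.0 * sigma ** 2))
    return (weights[:, None] * features).sum(axis=0)


# Example: a path graph on three vertices and a toy two-point diagram.
path_graph = np.array([[0, 1, 0],
                       [1, 0, 1],
                       [0, 1, 0]])
hks_values = heat_kernel_signature(path_graph, t=1.0)

toy_diagram = np.array([[0.1, 0.9], [0.3, 0.4]])
grid_centers = np.array([[0.0, 1.0], [0.5, 0.5], [0.25, 0.75]])
embedding = vectorize_diagram(toy_diagram, grid_centers)
```

Because the aggregation is a sum, reordering the points of `toy_diagram` leaves `embedding` unchanged, which is the property that makes such a map well defined on diagrams. In a trainable layer, `centers` and `sigma` (and possibly the weight function) would be optimized together with the rest of the network.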
