A computationally efficient framework for vector representation of persistence diagrams
This work addresses the limitation of persistence diagrams in machine learning applications by providing a more efficient vector representation, which is incremental in nature.
The authors tackle the problem of representing persistence diagrams as vectors for machine learning tasks. They propose a computationally efficient framework, the vectorized persistence block (VPB), and demonstrate its effectiveness in clustering, classification, and change point detection at low computational cost.
In topological data analysis, a common way of quantifying the shape of data is to use a persistence diagram (PD). PDs are multisets of points in $\mathbb{R}^2$ computed using tools of algebraic topology. However, this multiset structure limits the utility of PDs in applications. In recent years, efforts have therefore been directed towards extracting informative and efficient summaries from PDs to broaden the scope of their use in machine learning tasks. We propose a computationally efficient framework to convert a PD into a vector in $\mathbb{R}^n$, called a vectorized persistence block (VPB). We show that our representation possesses many of the desired properties of vector-based summaries, such as stability with respect to input noise, low computational cost, and flexibility. Through simulation studies, we demonstrate the effectiveness of VPBs in terms of performance and computational cost within various learning tasks, namely clustering, classification, and change point detection.
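
To make the idea of converting a PD into a vector in $\mathbb{R}^n$ concrete, the following is a minimal sketch of one block-style vectorization. It is an illustration under stated assumptions, not necessarily the exact VPB construction: the abstract does not give the definition, so the function name `block_vectorize`, the scale parameter `tau`, the use of birth-persistence coordinates, and the unit weight are all hypothetical choices made here. Each diagram point is replaced by a square block whose side grows with its persistence, and each vector entry is the total block area falling in one cell of a fixed grid.

```python
import numpy as np


def block_vectorize(diagram, x_edges, y_edges, tau=0.5):
    """Map a persistence diagram to a fixed-length vector (illustrative sketch).

    diagram : array-like of (birth, death) pairs.
    x_edges, y_edges : increasing grid edges along the birth and persistence axes.
    tau : hypothetical scale; each block has half side length tau * persistence.
    """
    diagram = np.asarray(diagram, dtype=float).reshape(-1, 2)
    x_edges = np.asarray(x_edges, dtype=float)
    y_edges = np.asarray(y_edges, dtype=float)

    birth = diagram[:, 0]
    pers = diagram[:, 1] - diagram[:, 0]  # persistence = death - birth
    half = tau * pers                     # half side length of each square block

    def overlap(lo, hi, edges):
        # Length of the overlap between each interval [lo, hi] and each grid
        # interval [edges[k], edges[k + 1]]; result has shape (points, cells).
        left = np.maximum(lo[:, None], edges[None, :-1])
        right = np.minimum(hi[:, None], edges[None, 1:])
        return np.clip(right - left, 0.0, None)

    ox = overlap(birth - half, birth + half, x_edges)
    oy = overlap(pers - half, pers + half, y_edges)

    # Block-cell overlap area is the product of the 1D overlaps; the matrix
    # product sums these areas over all diagram points (unit weight assumed).
    return (ox.T @ oy).ravel()


# Usage: one point (birth=0, death=1) on a 2 x 2 grid gives a vector in R^4.
vec = block_vectorize([[0.0, 1.0]], [-1.0, 0.0, 1.0], [0.0, 1.0, 2.0], tau=0.5)
```

Because the grid is fixed in advance, every diagram maps to a vector of the same length regardless of how many points it contains, which is what lets standard clustering or classification methods consume the output. The cost is linear in the number of diagram points times the number of grid cells.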