Topological Machine Learning with Unreduced Persistence Diagrams
This work addresses a computational bottleneck in topological machine learning that affects both researchers and practitioners. The contribution is incremental, however: it builds on existing methods and focuses on efficiency.
The paper tackles the observation that supervised machine learning pipelines ignore much of the information in persistence diagrams, even though computing those diagrams is costly. It introduces methods to generate topological feature vectors directly from unreduced boundary matrices. Models trained on unreduced diagrams perform on par with, and on some tasks outperform, those trained on fully reduced diagrams, which suggests potential gains in both computational cost and performance.
Supervised machine learning pipelines trained on features derived from persistent homology have been experimentally observed to ignore much of the information contained in a persistence diagram (PD). Computing PDs, however, is often the most computationally demanding step in such a pipeline. To explore this trade-off, we introduce several methods to generate topological feature vectors from unreduced boundary matrices. We compare the performance of pipelines trained on vectorizations of unreduced PDs with that of pipelines trained on vectorizations of fully reduced PDs across several data and task types. Our results indicate that models trained on unreduced PDs can perform on par with, and on some tasks even outperform, those trained on fully reduced PDs. This observation suggests that machine learning pipelines that incorporate topology-based features may benefit, in terms of both computational cost and performance, from using the information contained in unreduced boundary matrices.
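
To make the distinction between unreduced and fully reduced boundary matrices concrete, the following is a minimal sketch and not the paper's method. It assumes a toy filtered complex (a filled triangle), coefficients in Z/2, and one plausible pairing rule for the unreduced case: read the lowest nonzero row of every nonzero column without performing any column additions. The names `COLUMNS`, `VALUES`, `low`, `unreduced_pairs`, `reduced_pairs` and `to_diagram` are hypothetical and chosen only for illustration.

```python
# Toy filtered complex (assumed for illustration): a filled triangle.
# Simplex order: v0, v1, v2, e01, e12, e02, t012.
# Each column of the Z/2 boundary matrix is stored as the set of its nonzero rows.
COLUMNS = [set(), set(), set(), {0, 1}, {1, 2}, {0, 2}, {3, 4, 5}]
# Filtration value of each simplex, in the same order (assumed values).
VALUES = [0, 0, 0, 1, 2, 3, 4]


def low(column):
    """Largest row index with a nonzero entry, or None for a zero column."""
    return max(column) if column else None


def unreduced_pairs(columns):
    """Read (low(j), j) from every nonzero column, with no reduction at all."""
    return [(low(c), j) for j, c in enumerate(columns) if c]


def reduced_pairs(columns):
    """Standard left-to-right column reduction over Z/2, then read the pairs."""
    cols = [set(c) for c in columns]  # work on a copy of the input
    owner = {}  # lowest row index -> column that already owns it
    pairs = []
    for j, col in enumerate(cols):
        # Add earlier columns (symmetric difference) until the low is unique.
        while col and low(col) in owner:
            col ^= cols[owner[low(col)]]
        if col:
            owner[low(col)] = j
            pairs.append((low(col), j))
    return pairs


def to_diagram(pairs, values):
    """Map index pairs to (birth, death) filtration values."""
    return [(values[i], values[j]) for i, j in pairs]


print(to_diagram(unreduced_pairs(COLUMNS), VALUES))
# [(0, 1), (0, 2), (0, 3), (3, 4)]
print(to_diagram(reduced_pairs(COLUMNS), VALUES))
# [(0, 1), (0, 2), (3, 4)]
```

In this toy case the unreduced pass keeps one extra point, (0, 3), which the reduction cancels because the third edge closes a cycle instead of merging components. The unreduced pass needs a single sweep over the columns and no column additions, which is where any cost saving would come from under these assumptions.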