Comparing the Effects of Persistence Barcodes Aggregation and Feature Concatenation on Medical Imaging
This work addresses feature engineering challenges in medical image analysis by evaluating topological data analysis methods, though it is incremental as it compares existing techniques rather than introducing new ones.
The study compared two methods for constructing topological feature vectors from persistence barcodes in medical imaging: aggregation followed by featurization versus concatenation of feature vectors. Results showed that feature concatenation preserved more detailed topological information and led to better classification performance, making it the preferred approach.
In medical image analysis, feature engineering plays an important role in the design and performance of machine learning models. Persistent homology (PH), a method from topological data analysis (TDA), is robust and stable under data perturbations, and it addresses a limitation of traditional feature extraction approaches, in which a small change in the input can result in a large change in the feature representation. Using PH, we store persistent topological and geometrical features in the form of a persistence barcode, in which long bars represent global topological features and short bars capture geometrical information about the data. When multiple barcodes are computed from 2D or 3D medical images, two approaches can be used to construct the final topological feature vector in each dimension: aggregating the persistence barcodes and then featurizing the result, or featurizing each barcode separately and concatenating the resulting feature vectors. In this study, we conduct a comprehensive analysis across diverse medical imaging datasets to compare the effects of these two approaches on the performance of classification models. The results indicate that feature concatenation preserves detailed topological information from individual barcodes and yields better classification performance, and it is therefore the preferred approach when conducting similar experiments.
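The difference between the two constructions can be illustrated with a minimal sketch. It assumes each barcode is an array of (birth, death) pairs for a single homology dimension, and it uses a hypothetical featurization, `persistence_stats`, built from simple lifetime statistics. The function names, the choice of statistics, and the toy barcodes are illustrative assumptions and are not taken from the study.

```python
import numpy as np

def persistence_stats(barcode):
    """Hypothetical featurization: summary statistics of bar lifetimes."""
    bars = np.asarray(barcode, dtype=float).reshape(-1, 2)
    if bars.shape[0] == 0:
        return np.zeros(4)
    lifetimes = bars[:, 1] - bars[:, 0]  # death minus birth
    return np.array([lifetimes.size, lifetimes.sum(),
                     lifetimes.mean(), lifetimes.max()])

def aggregate_then_featurize(barcodes):
    """Approach 1: pool all bars into one barcode, then featurize once."""
    merged = np.vstack([np.asarray(b, dtype=float).reshape(-1, 2)
                        for b in barcodes])
    return persistence_stats(merged)

def featurize_then_concatenate(barcodes):
    """Approach 2: featurize each barcode, then concatenate the vectors."""
    return np.concatenate([persistence_stats(b) for b in barcodes])

# Toy barcodes, e.g. one per image patch or slice (assumed data).
barcodes = [
    np.array([[0.0, 1.0], [0.2, 0.3]]),
    np.array([[0.0, 0.5]]),
    np.array([[0.1, 0.9], [0.4, 0.5], [0.6, 0.7]]),
]

agg = aggregate_then_featurize(barcodes)
cat = featurize_then_concatenate(barcodes)
print(agg.shape, cat.shape)
```

In this sketch, aggregation returns a single vector whose length is fixed by the featurization, while concatenation returns one such vector per barcode placed side by side, so per-barcode information remains separately addressable at the cost of a longer feature vector.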