LG · Jun 19, 2024

LightGBM robust optimization algorithm based on topological data analysis

arXiv:2406.13300v1 · 6 citations
Originality: Highly original
AI Analysis

This work addresses robustness issues in image classification for applications such as medical imaging and face recognition, but it is incremental: it builds on the existing LightGBM algorithm with a hybrid feature approach.

The paper tackles noise interference in image classification by proposing TDA-LightGBM, which integrates topological features with LightGBM. It reports accuracy improvements of up to 15% on noisy datasets and an accuracy of 99.8% in noise-free scenarios.

To make the Light Gradient Boosting Machine (LightGBM) algorithm more robust to noise in image classification, the paper proposes TDA-LightGBM, a robustness optimization algorithm for LightGBM based on topological data analysis (TDA). The method first splits feature engineering into two streams, a pixel feature stream and a topological feature stream, and extracts features from each separately. The pixel and topological features are then combined into a single feature vector, which serves as the input to LightGBM for image classification. This fusion retains traditional feature engineering while adding topological structure information, so that the features capture the intrinsic structure of the image more accurately. The aim is to overcome the unstable feature extraction and reduced classification accuracy that data noise causes in conventional image processing. Experiments show that, under noisy conditions, TDA-LightGBM improves accuracy by 3% over LightGBM on the SOCOFing dataset across five classification tasks. In noise-free scenarios, TDA-LightGBM improves accuracy by 0.5% over LightGBM on two classification tasks, reaching 99.8%. In the presence of noise, the method also raises classification accuracy over LightGBM by 6% on the Ultrasound Breast Images for Breast Cancer dataset and by 15% on the Masked CASIA WebFace dataset. These results indicate that integrating topological features makes LightGBM more robust and improves image classification under data perturbations.
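The two-stream idea can be illustrated with a small sketch. The abstract does not say which topological features the authors extract, so everything here is an assumption made for illustration: the topological stream is taken to be 0-dimensional persistent homology of the sublevel-set filtration of a grayscale image (4-connectivity), summarized by three statistics, and the hypothetical helpers `sublevel_h0_persistence` and `fused_features` are not names from the paper. The sketch uses only the standard library.

```python
from typing import Dict, List, Tuple

Image = List[List[float]]
Pixel = Tuple[int, int]


def sublevel_h0_persistence(img: Image) -> List[Tuple[float, float]]:
    """Return (birth, death) pairs of connected components in the
    sublevel-set filtration of a grayscale image (4-connectivity).

    Assumption for illustration: the paper's exact topological features
    are not given in the abstract. The component born at the global
    minimum never dies and is omitted; zero-persistence pairs are skipped.
    """
    rows, cols = len(img), len(img[0])
    # Process pixels from darkest to brightest.
    order = sorted((img[r][c], r, c) for r in range(rows) for c in range(cols))
    parent: Dict[Pixel, Pixel] = {}
    birth: Dict[Pixel, float] = {}

    def find(x: Pixel) -> Pixel:
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    pairs: List[Tuple[float, float]] = []
    for value, r, c in order:
        node = (r, c)
        parent[node] = node
        birth[node] = value
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nb = (r + dr, c + dc)
            if nb not in parent:  # outside the image or not yet added
                continue
            a, b = find(node), find(nb)
            if a == b:
                continue
            # Elder rule: the component with the larger birth value dies.
            young, old = (a, b) if birth[a] > birth[b] else (b, a)
            if birth[young] < value:
                pairs.append((birth[young], value))
            parent[young] = old
    return pairs


def fused_features(img: Image) -> List[float]:
    """Concatenate the pixel stream with a topological summary."""
    pixel_stream = [float(v) for row in img for v in row]
    lifetimes = [death - birth for birth, death in sublevel_h0_persistence(img)]
    topo_stream = [
        float(len(lifetimes)),          # number of finite components
        float(sum(lifetimes)),          # total persistence
        float(max(lifetimes, default=0.0)),  # longest-lived component
    ]
    return pixel_stream + topo_stream
```

For the one-row image `[[0, 3, 1, 4, 2]]`, the local minima at values 1 and 2 merge into the component born at 0 when the pixels with values 3 and 4 appear, so the sketch yields two finite pairs. One fused row per image could then be stacked into a matrix and passed to a gradient-boosting classifier such as `lightgbm.LGBMClassifier` through its scikit-learn-style `fit(X, y)` interface; that step is left out here so the example runs without third-party packages.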

