IR · LG · Aug 29, 2023

Dimensionality Reduction Using pseudo-Boolean polynomials For Cluster Analysis

arXiv:2308.15553v1 · 3 citations · h-index: 22
Originality: Incremental advance
AI Analysis

This work addresses the challenge of high-dimensional data clustering for data scientists, though it appears incremental as it builds on existing pseudo-Boolean polynomial methods.

The paper tackles the problem of dimensionality reduction for cluster analysis by using pseudo-Boolean polynomials, achieving reductions such as from 4D to 2D on the Iris dataset and 30D to 3D on the WDBC dataset while maintaining competitive clustering accuracies.

We introduce the use of a reduction property of the penalty-based formulation of pseudo-Boolean polynomials as a mechanism for invariant dimensionality reduction in cluster analysis. In our experiments, we show that multidimensional data such as the 4-dimensional Iris Flower dataset can be reduced to 2-dimensional space, while the 30-dimensional Wisconsin Diagnostic Breast Cancer (WDBC) dataset can be reduced to 3-dimensional space. By searching for lines or planes that lie between the reduced samples, we can extract clusters in a linear and unbiased manner with competitive accuracy, reproducibility, and clear interpretation.
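To make the reduction property concrete, here is a minimal sketch of the standard penalty-based (Hammer-Beresnev) construction of a pseudo-Boolean polynomial from a cost matrix. It is an illustration under stated assumptions, not the paper's pipeline: the function names `pseudo_boolean_polynomial` and `evaluate` are hypothetical, and the way a data sample is arranged into a matrix is assumed rather than taken from the abstract. Each column is sorted, consecutive differences become coefficients of monomials over row indices, and like terms are merged; that merging is what shrinks the number of coefficients.

```python
from collections import defaultdict


def pseudo_boolean_polynomial(cost):
    """Build the penalty-based polynomial of a cost matrix (list of rows).

    Returns {frozenset of row indices: coefficient}. The empty set is the
    constant term. Hypothetical helper, written for illustration only.
    """
    n_rows, n_cols = len(cost), len(cost[0])
    terms = defaultdict(float)
    for j in range(n_cols):
        column = [cost[i][j] for i in range(n_rows)]
        # Row indices ordered by ascending cost within this column.
        order = sorted(range(n_rows), key=lambda i: column[i])
        # The smallest entry contributes to the constant term.
        terms[frozenset()] += column[order[0]]
        # Each further step adds the cost increase, multiplied by the
        # variables of all cheaper rows (represented here as a set).
        for k in range(1, n_rows):
            delta = column[order[k]] - column[order[k - 1]]
            if delta != 0:
                terms[frozenset(order[:k])] += delta
    # Like terms were merged by the dictionary; drop zero coefficients.
    return {mono: coef for mono, coef in terms.items() if coef != 0}


def evaluate(poly, y):
    """Evaluate the polynomial at a 0/1 vector y (y[i] = 1 excludes row i)."""
    return sum(coef for mono, coef in poly.items() if all(y[i] for i in mono))


# A 2x2 matrix with four entries collapses to two coefficients, because
# both columns produce the same monomial {0}.
sample = [[1, 2],
          [3, 5]]
poly = pseudo_boolean_polynomial(sample)
```

In this toy case the four matrix entries reduce to a constant term and a single coefficient on the monomial containing row 0. How the paper maps such coefficients to 2D or 3D coordinates is not specified in the abstract, so that step is left out here.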

