PretopoMD: Pretopology-based Mixed Data Hierarchical Clustering
This work offers data scientists a customizable and interpretable method for clustering heterogeneous datasets, though it appears incremental because it builds on existing pretopology concepts.
The authors tackled the problem of clustering mixed data without dimensionality reduction by developing a pretopology-based algorithm using logical rules, which demonstrated superior performance in accurately and interpretably delineating clusters from raw data.
This article presents a novel pretopology-based algorithm for clustering mixed data without dimensionality reduction. Our approach expresses customizable logical rules in Disjunctive Normal Form (DNF) and exposes adjustable hyperparameters, so that users can define how the cluster hierarchy is built and tailor the solution to heterogeneous datasets. In hierarchical dendrogram analysis and on comparative clustering metrics, our method demonstrates superior performance, delineating clusters accurately and interpretably from the raw data and thus preserving data integrity. Empirical findings highlight the algorithm's robustness in constructing meaningful clusters and reveal its potential for addressing the limited explainability of clustering results. The novelty of this work lies in its departure from traditional dimensionality reduction techniques and in its use of logical rules that improve both cluster formation and clarity, thereby advancing the discourse on clustering mixed data.
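To make the idea of DNF rules over raw mixed data more concrete, the following is a minimal illustrative sketch, not the authors' implementation. It assumes that a DNF rule is a list of clauses (an OR of ANDs) over pairwise predicates, that a point joins a set when at least one clause holds between it and some single member of the set, and that clusters are read off as fixed points of the resulting pseudoclosure. The helper names (`within`, `same`, `pseudoclosure`, `closure`), the toy records, and the thresholds are all hypothetical.

```python
from typing import Any, Callable, Dict, List, Set

Record = Dict[str, Any]
Predicate = Callable[[Record, Record], bool]
DNF = List[List[Predicate]]  # OR over clauses, AND inside each clause


def within(feature: str, threshold: float) -> Predicate:
    """Numeric predicate: the two records differ by at most `threshold`."""
    return lambda a, b: abs(a[feature] - b[feature]) <= threshold


def same(feature: str) -> Predicate:
    """Categorical predicate: the two records share the same value."""
    return lambda a, b: a[feature] == b[feature]


def related(x: Record, y: Record, dnf: DNF) -> bool:
    """True if at least one clause has all of its predicates satisfied."""
    return any(all(pred(x, y) for pred in clause) for clause in dnf)


def pseudoclosure(subset: Set[int], data: List[Record], dnf: DNF) -> Set[int]:
    """One expansion step: add every point related to some member of `subset`."""
    grown = set(subset)
    for i, x in enumerate(data):
        if i not in subset and any(related(x, data[j], dnf) for j in subset):
            grown.add(i)
    return grown


def closure(subset: Set[int], data: List[Record], dnf: DNF) -> Set[int]:
    """Iterate the pseudoclosure until the set stops growing."""
    current = set(subset)
    while True:
        grown = pseudoclosure(current, data, dnf)
        if grown == current:
            return current
        current = grown


# Toy mixed data: one numeric and one categorical feature, used as-is.
data = [
    {"age": 25, "job": "engineer"},
    {"age": 27, "job": "teacher"},
    {"age": 33, "job": "teacher"},
    {"age": 60, "job": "engineer"},
    {"age": 62, "job": "doctor"},
]

# Rule: (age within 3) OR (same job AND age within 10).
dnf = [[within("age", 3)], [same("job"), within("age", 10)]]

print(sorted(pseudoclosure({0}, data, dnf)))  # [0, 1]
print(sorted(closure({0}, data, dnf)))        # [0, 1, 2]
print(sorted(closure({3}, data, dnf)))        # [3, 4]
```

In this sketch the rule is evaluated on the original numeric and categorical values, with no embedding or projection, and each membership decision can be traced to the clause that fired (record 2 joins through the second clause, since it shares a job with record 1 and is within 10 years of it). Recording the sets produced at successive pseudoclosure steps is one possible way to obtain nested levels for a hierarchy; how the paper actually builds its dendrogram is not specified in the abstract.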