LG | May 21
Minimum Description Length based Granular-Ball Tree Regularization for Spectral Clustering
Zeqiang Xian, Caihui Liu, Yong Zhang et al.
Spectral clustering largely depends on the affinity graph, yet constructing a graph that preserves reliable local connectivity while adapting to heterogeneous data structures remains challenging. Existing granular-ball-based spectral clustering methods usually reduce graph complexity by using coarse-grained representatives. However, the learned local regions are often treated as graph nodes or anchors, and their structural information is not sufficiently used to regularize the original sample-level graph. To address this issue, this paper proposes a Minimum Description Length based Granular-Ball Tree-Regularized Spectral Clustering method, termed MDL-GBTRSC. The proposed method constructs a granular-ball tree through local MDL model selection, with reciprocal neighborhood continuity used to discourage splits that break reliable local connections. The stable leaf balls obtained from the tree provide coding-scale information for regularizing the sample-level affinity graph. In addition, a shared-neighbor bridge code is introduced to adjust weak local bridge relations without requiring an additional user-specified threshold. In this way, MDL-GBTRSC connects interpretable local representation learning with affinity graph construction in a unified spectral clustering framework. Experiments on real and synthetic datasets show that MDL-GBTRSC achieves the best average ARI and NMI under the adopted fixed-configuration protocol compared with classical spectral clustering baselines and representative granular-ball, micro-cluster, and anchor-based methods.
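The role of reciprocal neighborhoods and shared neighbors in building an affinity graph can be illustrated with a minimal sketch. This is not the MDL-GBTRSC construction: the abstract does not specify the bridge code or the tree regularizer, so the function names, the neighborhood size `k`, and the weighting formula below are all assumptions made for illustration. The sketch keeps an edge only between mutual k-nearest neighbors and weights it by the number of neighbors the two samples share.

```python
import numpy as np

def knn_indices(X, k):
    """Indices of the k nearest neighbours of every row of X (self excluded)."""
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    np.fill_diagonal(d2, np.inf)  # a point is never its own neighbour
    return np.argsort(d2, axis=1)[:, :k]

def shared_neighbor_affinity(X, k=5):
    """Hypothetical affinity: mutual-kNN edges weighted by shared-neighbour count."""
    n = len(X)
    nbrs = knn_indices(X, k)
    member = np.zeros((n, n), dtype=bool)
    member[np.arange(n)[:, None], nbrs] = True      # member[i, j]: j is in kNN(i)
    mutual = member & member.T                      # reciprocal neighbours only
    counts = member.astype(int)
    shared = counts @ counts.T                      # number of common neighbours
    # Illustrative weighting (an assumption, not taken from the paper).
    return np.where(mutual, (shared + 1) / (k + 1), 0.0)

# Two well-separated blobs: no edge should connect them.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.3, (20, 2)), rng.normal(10.0, 0.3, (20, 2))])
W = shared_neighbor_affinity(X, k=5)
```

The resulting matrix `W` is symmetric and could be passed to any spectral clustering routine that accepts a precomputed affinity. Unlike the method described in the abstract, this sketch still depends on a user-chosen `k`.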
LG | May 12
A Boundary-Aware Non-parametric Granular-Ball Classifier Based on Minimum Description Length
Zeqiang Xian, Caihui Liu, Yong Zhang et al.
Existing granular-ball classification methods are often driven by handcrafted quality measures, neighborhood rules, or heuristic splitting and stopping criteria, which may reduce the transparency of local construction decisions and hinder explicit modeling of boundary-sensitive regions. To address this issue, this paper proposes a Minimum Description Length based Granular-Ball Classifier (MDL-GBC), a boundary-aware non-parametric and interpretable granular-ball classifier. MDL-GBC formulates class-conditional granular-ball construction as a local model selection problem under the Minimum Description Length principle. For each class, samples from the target class provide positive class evidence, while samples from the remaining classes provide negative boundary evidence. For each current granular ball, three candidate explanations are compared under a unified description-length criterion: a single-ball model, a two-ball model, and a core-boundary model. The selected model determines whether the ball is retained, geometrically split, or refined into core and boundary-sensitive child balls, thereby making local construction decisions consistent with the MDL-based classification mechanism. During prediction, a class-level mixture coding rule aggregates stable granular balls of the same class and assigns the test sample by comparing class-wise coding costs. Experiments on 18 benchmark datasets show that MDL-GBC achieves competitive classification performance against classical classifiers and representative granular-ball-based methods, obtaining the best average Accuracy, Macro-F1, and average rank. These results indicate that MDL-GBC provides an effective and interpretable alternative to conventional heuristic granular-ball classification strategies.
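The prediction step, which compares class-wise coding costs, can be sketched as follows. The abstract does not give the exact mixture coding rule, so this example assumes each stable ball is an isotropic Gaussian with a center, a variance, and a sample count, and that a class's cost is the negative log of the count-weighted mixture density. The dictionary layout and all function names are hypothetical.

```python
import numpy as np

def ball_log_density(x, center, var):
    """Log-density of x under an isotropic Gaussian ball (assumed ball model)."""
    d = x.shape[0]
    return -0.5 * (d * np.log(2 * np.pi * var) + ((x - center) ** 2).sum() / var)

def class_coding_cost(x, balls):
    """Code length (in nats) of x under a count-weighted mixture of one class's balls."""
    weights = np.array([b["n"] for b in balls], dtype=float)
    weights /= weights.sum()
    logs = np.array([np.log(w) + ball_log_density(x, b["center"], b["var"])
                     for w, b in zip(weights, balls)])
    m = logs.max()                                  # log-sum-exp for stability
    return -(m + np.log(np.exp(logs - m).sum()))

def predict(x, class_balls):
    """Assign x to the class whose balls encode it most cheaply."""
    costs = {label: class_coding_cost(x, balls) for label, balls in class_balls.items()}
    return min(costs, key=costs.get), costs

# Illustrative balls (made-up values, not results from the paper).
class_balls = {
    0: [{"center": np.array([0.0, 0.0]), "var": 1.0, "n": 30},
        {"center": np.array([2.0, 0.0]), "var": 0.5, "n": 10}],
    1: [{"center": np.array([6.0, 6.0]), "var": 1.0, "n": 40}],
}
label_a, _ = predict(np.array([0.5, 0.2]), class_balls)
label_b, _ = predict(np.array([5.5, 6.0]), class_balls)
```

Here the first query lies near the class-0 balls and the second near the class-1 ball, so each is assigned to the class that describes it with the shorter code.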
LG | May 9
MDL-GBG: A Non-parametric and Interpretable Granular-Ball Generation Method for Clustering
Zeqiang Xian, Caihui Liu, Yong Zhang et al.
Existing granular-ball generation methods are still mainly driven by handcrafted quality measures and heuristic splitting or stopping criteria, which weakens the transparency of local generation decisions in clustering. To address this issue, this paper proposes Minimum Description Length based Granular-Ball Generation (MDL-GBG), a non-parametric and interpretable granular-ball generation method for clustering. MDL-GBG reformulates granular-ball generation as a local model selection problem under the Minimum Description Length principle. For each granular ball, three candidate explanations are compared, namely a single-ball model, a two-ball model, and a core-ball-plus-residual model, and the model with the shortest description length is selected. In this way, ball retention, splitting, and residual peeling are unified within a common coding-theoretic framework. A residual reassignment mechanism is further introduced to globally re-evaluate peeled-off boundary samples after stable granular balls are formed. Experiments on 20 UCI datasets show that the stable granular balls generated by MDL-GBG provide a highly competitive upstream representation for clustering, with MDL-GBG+AC achieving the best overall average ranks in ARI, ACC, and NMI among the compared methods. These results demonstrate that MDL-GBG offers an effective and interpretable alternative to conventional heuristic granular-ball generation strategies.
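The idea of choosing between candidate explanations by description length can be illustrated with a small sketch covering two of the three candidates (single ball versus two balls). The coding scheme here is an assumption, since the abstract does not define it: each ball is encoded as an isotropic Gaussian, parameters cost a BIC-style penalty, and the two-ball model additionally pays for encoding which ball each sample belongs to. The core-ball-plus-residual model is omitted because its definition is not given. All function names are hypothetical.

```python
import numpy as np

def ball_code_length(X):
    """Nats needed to encode X with one isotropic Gaussian ball (assumed code)."""
    n, d = X.shape
    center = X.mean(axis=0)
    var = max(((X - center) ** 2).sum() / (n * d), 1e-12)
    data_cost = 0.5 * n * d * (np.log(2 * np.pi * var) + 1.0)  # NLL at the ML fit
    param_cost = 0.5 * (d + 1) * np.log(n)                     # center + variance
    return data_cost + param_cost

def two_means(X, iters=20):
    """Deterministic 2-means split with farthest-point initialisation."""
    c0 = X[np.argmax(((X - X.mean(axis=0)) ** 2).sum(axis=1))]
    c1 = X[np.argmax(((X - c0) ** 2).sum(axis=1))]
    labels = np.zeros(len(X), dtype=int)
    for _ in range(iters):
        d0 = ((X - c0) ** 2).sum(axis=1)
        d1 = ((X - c1) ** 2).sum(axis=1)
        labels = (d1 < d0).astype(int)
        if labels.min() == labels.max():   # one side is empty, stop
            break
        c0, c1 = X[labels == 0].mean(axis=0), X[labels == 1].mean(axis=0)
    return labels

def choose_model(X):
    """Return ('keep' or 'split', single-ball cost, two-ball cost)."""
    n = len(X)
    single = ball_code_length(X)
    labels = two_means(X)
    sizes = np.bincount(labels, minlength=2)
    if sizes.min() < 2:                    # degenerate split, keep the ball
        return "keep", single, np.inf
    assign_cost = -(sizes * np.log(sizes / n)).sum()  # cost of the ball memberships
    split = assign_cost + sum(ball_code_length(X[labels == j]) for j in (0, 1))
    return ("split" if split < single else "keep"), single, split

rng = np.random.default_rng(0)
one_blob = rng.normal(0.0, 1.0, (200, 2))
two_blobs = np.vstack([rng.normal(0.0, 1.0, (100, 2)), rng.normal(10.0, 1.0, (100, 2))])
decision_one = choose_model(one_blob)[0]
decision_two = choose_model(two_blobs)[0]
```

The membership cost is what keeps a single compact blob from being split: halving a blob always reduces the within-ball variance, but under this assumed code the saving does not pay for the extra bits needed to say which half each sample belongs to. No splitting threshold is set by hand; the decision follows from comparing the two code lengths.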