LG · Dec 12, 2024
Multi-view Clustering via Unified Multi-kernel Learning and Matrix Factorization
Chenxing Jia, Mingjie Cai, Hamido Fujita
Multi-view clustering has become increasingly important due to the multi-source character of real-world data. Among existing multi-view clustering methods, multi-kernel clustering and matrix factorization-based multi-view clustering have gained widespread attention as mainstream approaches. However, multi-kernel clustering tends to learn an optimal kernel and then perform eigenvalue decomposition on it, which leads to high computational complexity. Matrix factorization-based multi-view clustering methods impose orthogonal constraints on individual views, which overly emphasizes the accuracy of clustering structures within single views and restricts the learning of individual views. Based on this analysis, we propose a multi-view clustering method that integrates multi-kernel learning with matrix factorization and combines the advantages of both. It removes the orthogonal constraints on individual views and instead imposes orthogonal constraints on the consensus matrix, resulting in an accurate final clustering structure. Ultimately, the method is unified into a simple form of multi-kernel clustering but avoids learning an optimal kernel, thus reducing the time complexity. Furthermore, we propose an efficient three-step optimization algorithm to achieve a locally optimal solution. Experiments on widely used real-world datasets demonstrate the effectiveness of our proposed method.
LG · Jan 22, 2025
Low-Dimensional Representation-Driven TSK Fuzzy System for Feature Selection
Qiong Liu, Mingjie Cai, Qingguo Li
Feature selection mitigates the curse of dimensionality by retaining only the important features. Subspace learning, a widely used dimensionality reduction method, projects the original data into a low-dimensional space. However, the low-dimensional representation is often transformed back into the original space, resulting in information loss. Additionally, gate function-based methods in the Takagi-Sugeno-Kang fuzzy system (TSK-FS) are often less discriminative. To address these issues, this paper proposes a novel feature selection method that integrates subspace learning with TSK-FS. Specifically, a projection matrix is used to fit the intrinsic low-dimensional representation. The low-dimensional representation is then fed to TSK-FS to measure its availability. The firing strength is relaxed so that TSK-FS is not limited by numerical underflow. Finally, the $\ell_{2,1}$-norm is introduced to select significant features, and the connection to related work is discussed. The proposed method is evaluated against six state-of-the-art methods on eighteen datasets, and the results demonstrate its superiority.
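As a rough illustration of how an $\ell_{2,1}$-norm supports feature selection, the sketch below computes the norm as the sum of the Euclidean norms of a matrix's rows and ranks features by row norm. The matrix `W`, the convention that each row corresponds to one feature, and the helper names are assumptions made for this example; the abstract does not specify the paper's actual objective or ranking rule.

```python
import math

def row_l2_norms(W):
    """Euclidean norm of each row of W (given as a list of lists)."""
    return [math.sqrt(sum(w * w for w in row)) for row in W]

def l21_norm(W):
    """l2,1-norm: the sum of the row-wise Euclidean norms."""
    return sum(row_l2_norms(W))

def top_features(W, k):
    """Indices of the k rows with the largest norms (hypothetical ranking rule)."""
    norms = row_l2_norms(W)
    order = sorted(range(len(norms)), key=lambda i: norms[i], reverse=True)
    return order[:k]

# Hypothetical 3-feature weight matrix; the all-zero row marks a discarded feature.
W = [[3.0, 4.0], [0.0, 0.0], [1.0, 0.0]]
print(l21_norm(W))         # 5.0 + 0.0 + 1.0
print(top_features(W, 2))  # rows 0 and 2 have the largest norms
```

Penalizing this norm tends to drive entire rows to zero, which is why the surviving rows can be read as the selected features.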
LG · May 20, 2023
GFDC: A Granule Fusion Density-Based Clustering with Evidential Reasoning
Mingjie Cai, Zhishan Wu, Qingguo Li et al.
Density-based clustering algorithms are widely applied because they can detect clusters with arbitrary shapes. However, they perform poorly in measuring global density, determining reasonable cluster centers or structures, assigning samples accurately, and handling data with large density differences among clusters. To overcome these drawbacks, this paper proposes a granule fusion density-based clustering with evidential reasoning (GFDC). First, both local and global densities of samples are measured by a sparse degree metric. Then information granules are generated in high-density and low-density regions, which helps in processing clusters with significant density differences. Next, three novel granule fusion strategies combine the granules into stable cluster structures, helping to detect clusters with arbitrary shapes. Finally, unstable samples are assigned by a method developed from Dempster-Shafer theory. GFDC thus yields a reasonable clustering result together with a set of identified outliers. Experimental results on extensive datasets demonstrate the effectiveness of GFDC.
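For readers unfamiliar with Dempster-Shafer theory, the following minimal sketch implements the standard Dempster rule of combination for two mass functions. It is not GFDC's assignment method, which the abstract does not detail; the two evidence sources, the cluster labels `A` and `B`, and the mass values are hypothetical.

```python
def combine(m1, m2):
    """Dempster's rule: fuse two mass functions keyed by frozenset hypotheses."""
    combined = {}
    conflict = 0.0
    for a, pa in m1.items():
        for b, pb in m2.items():
            inter = a & b
            if inter:
                # Agreeing evidence accumulates on the intersection.
                combined[inter] = combined.get(inter, 0.0) + pa * pb
            else:
                # Disjoint hypotheses contribute to the conflict mass.
                conflict += pa * pb
    if conflict >= 1.0:
        raise ValueError("total conflict: combination is undefined")
    # Renormalize by the non-conflicting mass.
    return {h: v / (1.0 - conflict) for h, v in combined.items()}

A, B = frozenset({"A"}), frozenset({"B"})
AB = A | B  # ignorance: the sample could belong to either cluster

# Two hypothetical evidence sources about one unstable sample.
m1 = {A: 0.6, AB: 0.4}
m2 = {B: 0.5, AB: 0.5}
fused = combine(m1, m2)
best = max(fused, key=fused.get)
print(fused, best)
```

In this toy case the fused masses still sum to one after the conflicting mass is discarded, and the sample would be assigned to the hypothesis with the largest fused mass.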
LG · May 19, 2023
Three-way Imbalanced Learning based on Fuzzy Twin SVM
Wanting Cai, Mingjie Cai, Qingguo Li et al.
Three-way decision (3WD) is a powerful granular computing tool for dealing with uncertain data, commonly used in information systems, decision-making, and medical care. Three-way decision has been studied extensively in traditional rough set models, but it has rarely been combined with the currently popular field of machine learning. In this paper, three-way decision is connected with SVM, a standard binary classification model in machine learning, to address imbalanced classification, an area where SVM needs improvement. A new three-way fuzzy membership function and a new fuzzy twin support vector machine with three-way membership (TWFTSVM) are proposed. The new three-way fuzzy membership function is defined to increase the certainty of uncertain data in both input space and feature space, and it assigns higher fuzzy membership to minority samples than to majority samples. To evaluate the effectiveness of the proposed model, comparative experiments are conducted on forty-seven datasets with varying imbalance ratios. In addition, datasets with different imbalance ratios are derived from the same dataset to further assess the proposed model's performance. The results show that the proposed model significantly outperforms other traditional SVM-based methods.
IT · Apr 12, 2015
Knowledge reduction of dynamic covering decision information systems with varying attribute values
Mingjie Cai
In practical situations, knowledge reduction of dynamic covering information systems is time-dependent. In this paper, we provide incremental approaches to computing the type-1 and type-2 characteristic matrices of dynamic coverings caused by varying attribute values. We then present incremental algorithms for constructing the second and sixth approximations of sets by using characteristic matrices. Experimental results illustrate that the incremental approaches are effective for computing approximations of sets in dynamic covering information systems. Finally, we perform knowledge reduction of dynamic covering information systems with the incremental approaches.
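The sketch below shows a batch (non-incremental) computation of a type-1 characteristic matrix and the second upper approximation on a toy covering. It assumes the definition common in the covering rough set literature, where the type-1 characteristic matrix is the Boolean product of the covering's matrix representation with its transpose, and the second upper approximation is the Boolean product of that matrix with the set's characteristic vector. The abstract does not state these definitions, and the paper's incremental updates are not reproduced here; the universe, covering, and function names are hypothetical.

```python
def covering_matrix(universe, covering):
    """M[i][j] = 1 iff element universe[i] belongs to block covering[j]."""
    return [[1 if x in block else 0 for block in covering] for x in universe]

def type1_characteristic(M):
    """Boolean product M * M^T: entry (i, j) is 1 iff elements i and j share a block."""
    n = len(M)
    return [[1 if any(a and b for a, b in zip(M[i], M[j])) else 0
             for j in range(n)] for i in range(n)]

def second_upper(G, chi):
    """Boolean product of the characteristic matrix G with the 0/1 vector chi."""
    return [1 if any(g and c for g, c in zip(row, chi)) else 0 for row in G]

universe = [1, 2, 3, 4]
covering = [{1, 2}, {2, 3}, {4}]  # hypothetical covering of the universe
M = covering_matrix(universe, covering)
G = type1_characteristic(M)

chi = [1, 0, 0, 0]  # characteristic vector of the target set X = {1}
upper = second_upper(G, chi)
print(G)
print(upper)  # marks the elements of every block that meets X
```

When attribute values change, only some blocks of the covering change, which is the motivation for updating the affected matrix entries incrementally rather than recomputing the full product as done here.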