LG Mar 13, 2023
Label Information Bottleneck for Label Enhancement
Qinghai Zheng, Jihua Zhu, Haoyu Tang
In this work, we focus on the challenging problem of Label Enhancement (LE), which aims to exactly recover label distributions from logical labels, and present a novel Label Information Bottleneck (LIB) method for LE. During the recovery of label distributions, the label-irrelevant information contained in the dataset may lead to unsatisfactory recovery performance. To address this limitation, we excavate the essential label-relevant information to improve the recovery performance. Our method formulates the LE problem as two joint processes: 1) learning a representation that carries the essential label-relevant information, and 2) recovering label distributions from the learned representation. The label-relevant information is excavated through the "bottleneck" formed by the learned representation. Significantly, our method explores both the label-relevant information about the label assignments and the label-relevant information about the label gaps. Evaluation experiments conducted on several benchmark label distribution learning datasets verify the effectiveness and competitiveness of LIB. Our source code is available at https://github.com/qinghai-zheng/LIBLE
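For orientation, the "bottleneck" idea mentioned in the abstract can be related to the classic information bottleneck objective. The form below is the textbook trade-off, not necessarily the exact objective used by LIB; the symbols $X$ (input features), $Y$ (label information), $Z$ (learned representation), and the trade-off weight $\beta$ are assumptions introduced here for illustration.

```latex
\max_{Z} \; I(Z; Y) \;-\; \beta \, I(Z; X), \qquad \beta > 0
```

Here $I(\cdot\,;\cdot)$ denotes mutual information: the first term rewards a representation that keeps label-relevant information, and the second penalizes everything else it retains from the input.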
LG Feb 28, 2023
Multi-view Semantic Consistency based Information Bottleneck for Clustering
Wenbiao Yan, Jihua Zhu, Yiyang Zhou et al.
Multi-view clustering can make use of multi-source information for unsupervised clustering. Most existing methods focus on learning a fused representation matrix, while ignoring the influence of private information and noise. To address this limitation, we introduce a novel Multi-view Semantic Consistency based Information Bottleneck for clustering (MSCIB). Specifically, MSCIB pursues semantic consistency to improve the learning process of information bottleneck for different views. It conducts the alignment operation of multiple views in the semantic space and jointly achieves the valuable consistent information of multi-view data. In this way, the learned semantic consistency from multi-view data can improve the information bottleneck to more exactly distinguish the consistent information and learn a unified feature representation with more discriminative consistent information for clustering. Experiments on various types of multi-view datasets show that MSCIB achieves state-of-the-art performance.
LG Feb 26, 2023
MCoCo: Multi-level Consistency Collaborative Multi-view Clustering
Yiyang Zhou, Qinghai Zheng, Wenbiao Yan et al.
Multi-view clustering can explore consistent information from different views to guide clustering. Most existing works focus on pursuing shallow consistency in the feature space and integrating the information of multiple views into a unified representation for clustering. These methods do not fully consider and explore the consistency in the semantic space. To address this issue, we propose a novel Multi-level Consistency Collaborative learning framework (MCoCo) for multi-view clustering. Specifically, MCoCo jointly learns cluster assignments of multiple views in the feature space and aligns semantic labels of different views in the semantic space by contrastive learning. Further, we design a multi-level consistency collaboration strategy, which uses the consistent information of the semantic space as a self-supervised signal to collaborate with the cluster assignments in the feature space. Thus, the different levels of spaces collaborate with each other while achieving their own consistency goals, which lets MCoCo fully mine the consistent information of different views without fusion. Extensive experiments demonstrate the effectiveness and superiority of our method compared with state-of-the-art methods.
LG Mar 8, 2023
Semantically Consistent Multi-view Representation Learning
Yiyang Zhou, Qinghai Zheng, Shunshun Bai et al.
In this work, we devote ourselves to the challenging task of Unsupervised Multi-view Representation Learning (UMRL), which requires learning a unified feature representation from multiple views in an unsupervised manner. Existing UMRL methods mainly concentrate on the learning process in the feature space while ignoring the valuable semantic information hidden in different views. To address this issue, we propose a novel Semantically Consistent Multi-view Representation Learning (SCMRL), which makes efforts to excavate underlying multi-view semantic consensus information and utilize the information to guide the unified feature representation learning. Specifically, SCMRL consists of a within-view reconstruction module and a unified feature representation learning module, which are elegantly integrated by the contrastive learning strategy to simultaneously align semantic labels of both view-specific feature representations and the learned unified feature representation. In this way, the consensus information in the semantic space can be effectively exploited to constrain the learning process of unified feature representation. Compared with several state-of-the-art algorithms, extensive experiments demonstrate its superiority.
LG Feb 26, 2024
Watch Your Head: Assembling Projection Heads to Save the Reliability of Federated Models
Jinqian Chen, Jihua Zhu, Qinghai Zheng et al.
Federated learning encounters substantial challenges with heterogeneous data, leading to performance degradation and convergence issues. While considerable progress has been achieved in mitigating such an impact, the reliability aspect of federated models has been largely disregarded. In this study, we conduct extensive experiments to investigate the reliability of both generic and personalized federated models. Our exploration uncovers a significant finding: federated models exhibit unreliability when faced with heterogeneous data, demonstrating poor calibration on in-distribution test data and low uncertainty levels on out-of-distribution data. This unreliability is primarily attributed to the presence of biased projection heads, which introduce miscalibration into the federated models. Inspired by this observation, we propose the "Assembled Projection Heads" (APH) method for enhancing the reliability of federated models. By treating the existing projection head parameters as priors, APH randomly samples multiple initialized parameters of projection heads from the prior and further performs targeted fine-tuning on locally available data under varying learning rates. Such a head ensemble introduces parameter diversity into the deterministic model, eliminating the bias and producing reliable predictions via head averaging. We evaluate the effectiveness of the proposed APH method across three prominent federated benchmarks. Experimental results validate the efficacy of APH in model calibration and uncertainty estimation. Notably, APH can be seamlessly integrated into various federated approaches but only requires less than 30% additional computation cost for 100× inferences within large models.
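The head-ensemble idea can be illustrated with a minimal sketch. It makes several assumptions that do not come from the abstract: heads are plain linear layers, sampling from the prior is modeled as Gaussian perturbation of the existing head (`noise_std` is a hypothetical parameter), and the local fine-tuning step under varying learning rates is omitted entirely. All function names are illustrative.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def head_logits(weights, features):
    """Linear head: one weight row per class, no bias term."""
    return [sum(w * f for w, f in zip(row, features)) for row in weights]


def sample_heads(prior_weights, num_heads, noise_std, rng):
    """Assumed sampling rule: Gaussian perturbation around the prior head."""
    return [
        [[w + rng.gauss(0.0, noise_std) for w in row] for row in prior_weights]
        for _ in range(num_heads)
    ]


def ensemble_predict(heads, features):
    """Head averaging: mean of the per-head class probabilities."""
    per_head = [softmax(head_logits(h, features)) for h in heads]
    num_classes = len(per_head[0])
    return [sum(p[c] for p in per_head) / len(per_head) for c in range(num_classes)]


rng = random.Random(0)
prior_head = [[2.0, -1.0], [-1.0, 2.0], [0.5, 0.5]]  # 3 classes, 2 features
heads = sample_heads(prior_head, num_heads=5, noise_std=0.3, rng=rng)
features = [1.0, 0.2]
averaged = ensemble_predict(heads, features)
print(averaged)
```

Averaging probabilities rather than logits keeps the output a valid distribution regardless of how far the sampled heads drift from the prior.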
LG Dec 5, 2023
Towards Fast and Stable Federated Learning: Confronting Heterogeneity via Knowledge Anchor
Jinqian Chen, Jihua Zhu, Qinghai Zheng
Federated learning encounters a critical challenge of data heterogeneity, adversely affecting the performance and convergence of the federated model. Various approaches have been proposed to address this issue, yet their effectiveness is still limited. Recent studies have revealed that the federated model suffers severe forgetting in local training, leading to global forgetting and performance degradation. Although the analysis provides valuable insights, a comprehensive understanding of the vulnerable classes and their impact factors is yet to be established. In this paper, we aim to bridge this gap by systematically analyzing the forgetting degree of each class during local training across different communication rounds. Our observations are: (1) Both missing and non-dominant classes suffer similar severe forgetting during local training, while dominant classes show improvement in performance. (2) When dynamically reducing the sample size of a dominant class, catastrophic forgetting occurs abruptly when the proportion of its samples is below a certain threshold, indicating that the local model struggles to leverage a few samples of a specific class effectively to prevent forgetting. Motivated by these findings, we propose a novel and straightforward algorithm called Federated Knowledge Anchor (FedKA). Assuming that all clients have a single shared sample for each class, the knowledge anchor is constructed before each local training stage by extracting shared samples for missing classes and randomly selecting one sample per class for non-dominant classes. The knowledge anchor is then utilized to correct the gradient of each mini-batch towards the direction of preserving the knowledge of the missing and non-dominant classes. Extensive experimental results demonstrate that our proposed FedKA achieves fast and stable convergence, significantly improving accuracy on popular benchmarks.
LG Jun 28, 2024
Towards Stable and Storage-efficient Dataset Distillation: Matching Convexified Trajectory
Wenliang Zhong, Haoyu Tang, Qinghai Zheng et al.
The rapid evolution of deep learning and large language models has led to an exponential growth in the demand for training data, prompting the development of Dataset Distillation methods to address the challenges of managing large datasets. Among these, Matching Training Trajectories (MTT) has been a prominent approach, which replicates the training trajectory of an expert network on real data with a synthetic dataset. However, our investigation found that this method suffers from three significant limitations: 1. Instability of expert trajectory generated by Stochastic Gradient Descent (SGD); 2. Low convergence speed of the distillation process; 3. High storage consumption of the expert trajectory. To address these issues, we offer a new perspective on understanding the essence of Dataset Distillation and MTT through a simple transformation of the objective function, and introduce a novel method called Matching Convexified Trajectory (MCT), which aims to provide better guidance for the student trajectory. MCT leverages insights from the linearized dynamics of Neural Tangent Kernel methods to create a convex combination of expert trajectories, guiding the student network to converge rapidly and stably. This trajectory is not only easier to store, but also enables a continuous sampling strategy during distillation, ensuring thorough learning and fitting of the entire expert trajectory. Comprehensive experiments across three public datasets validate the superiority of MCT over traditional MTT methods.
SD Jun 27, 2021
Listen As You Wish: Audio based Event Detection via Text-to-Audio Grounding in Smart Cities
Haoyu Tang, Yunxiao Wang, Jihua Zhu et al.
With the development of Internet of Things technologies, a tremendous amount of sensor audio data has been produced, which poses great challenges to audio-based event detection in smart cities. In this paper, we target a challenging audio-based event detection task, namely, text-to-audio grounding. In addition to precisely localizing all of the desired onsets and offsets in the untrimmed audio, this challenging new task requires extensive acoustic and linguistic comprehension as well as reasoning about the cross-modal matching relations between the audio and the query. Current approaches often address these issues by treating the query as a whole through a global query representation. We contend that this strategy has several drawbacks. Firstly, the interactions between the query and the audio are not fully utilized. Secondly, it does not distinguish the importance of different keywords in a query. In addition, since the audio clips are of arbitrary lengths, many segments that are irrelevant to the query are not filtered out by these approaches. This further hinders the effective grounding of the desired segments. Motivated by the above concerns, a novel Cross-modal Graph Interaction (CGI) model is proposed to comprehensively model the relations between the words in a query through a novel language graph. To capture the fine-grained relevance between the audio and the query, a cross-modal attention module is introduced to generate snippet-specific query representations and automatically assign higher weights to keywords with more important semantics. Furthermore, we develop a cross-gating module for the audio and query to weaken irrelevant parts and emphasize the important ones.
LG Oct 19, 2020
Multi-view Subspace Clustering Networks with Local and Global Graph Information
Qinghai Zheng, Jihua Zhu, Yuanyuan Ma et al.
This study investigates the problem of multi-view subspace clustering, the goal of which is to explore the underlying grouping structure of data collected from different fields or measurements. Since data do not always comply with linear subspace models in many real-world applications, most existing multi-view subspace clustering methods, which are based on shallow linear subspace models, may fail in practice. Furthermore, the underlying graph information of multi-view data is ignored in most existing multi-view subspace clustering methods. To address the aforementioned limitations, we propose novel multi-view subspace clustering networks with local and global graph information, termed MSCNLG. Specifically, autoencoder networks are employed on multiple views to achieve latent smooth representations that are suitable for the linear assumption. Simultaneously, by integrating fused multi-view graph information into self-expressive layers, the proposed MSCNLG obtains the common shared multi-view subspace representation, which can be used to get clustering results by employing the standard spectral clustering algorithm. As an end-to-end trainable framework, the proposed method fully investigates the valuable information of multiple views. Comprehensive experiments on six benchmark datasets validate the effectiveness and superiority of the proposed MSCNLG.
LG Oct 19, 2020
Tensor-based Intrinsic Subspace Representation Learning for Multi-view Clustering
Qinghai Zheng, Yu Zhang, Jihua Zhu et al.
Multi-view clustering is a hot research topic, and many approaches have been proposed over the past few years. Nevertheless, most existing algorithms merely take the consensus information among different views into consideration for clustering. This may hinder multi-view clustering performance in real-life applications, since different views usually have diverse statistical properties. To address this problem, we propose a novel Tensor-based Intrinsic Subspace Representation Learning (TISRL) method for multi-view clustering. Concretely, a rank preserving decomposition is first proposed to effectively deal with the diverse statistical information contained in different views. Then, to achieve the intrinsic subspace representation, a low-rank tensor constraint based on the tensor singular value decomposition is also utilized in our method. The specific information contained in different views is fully investigated by the rank preserving decomposition, and the high-order correlations of multi-view data are mined by the low-rank tensor constraint. The objective function can be optimized by an augmented Lagrangian multiplier based alternating direction minimization algorithm. Experimental results on nine commonly used real-world multi-view datasets illustrate the superiority of TISRL.
LG Oct 15, 2020
Multi-view Hierarchical Clustering
Qinghai Zheng, Jihua Zhu, Shuangxun Ma
This paper focuses on multi-view clustering, which aims to improve clustering results by using multi-view data. Most existing works suffer from the issues of parameter selection and high computational complexity. To overcome these limitations, we propose Multi-view Hierarchical Clustering (MHC), which partitions multi-view data recursively at multiple levels of granularity. Specifically, MHC consists of two important components: the cosine distance integration (CDI) and the nearest neighbor agglomeration (NNA). The CDI explores the underlying complementary information of multi-view data so as to learn an essential distance matrix, which is utilized in NNA to obtain the clustering results. Significantly, the proposed MHC can be easily and effectively employed in real-world applications without parameter selection. Experiments on nine benchmark datasets illustrate the superiority of our method compared with several state-of-the-art multi-view clustering methods.
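The two components named in the abstract can be sketched roughly as follows. This is a simplified illustration, not the authors' algorithm: the integration step is assumed to be a plain average of per-view cosine distances, only a single agglomeration level is shown (the method described is recursive), and all function names are hypothetical.

```python
import math


def cosine_distance(a, b):
    """1 - cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def integrated_distance(views):
    """Assumed integration rule: average the cosine distances over all views."""
    n = len(views[0])
    dist = [[0.0] * n for _ in range(n)]
    for view in views:
        for i in range(n):
            for j in range(n):
                dist[i][j] += cosine_distance(view[i], view[j]) / len(views)
    return dist


def nearest_neighbor_merge(dist):
    """One agglomeration level: link every point to its nearest neighbor
    and return the connected components as cluster labels."""
    n = len(dist)
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i in range(n):
        nearest = min((j for j in range(n) if j != i), key=lambda j: dist[i][j])
        parent[find(i)] = find(nearest)

    roots = [find(i) for i in range(n)]
    relabel = {root: k for k, root in enumerate(dict.fromkeys(roots))}
    return [relabel[root] for root in roots]


# Two views of the same four samples, with different feature dimensions.
view_a = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
view_b = [[1.0, 0.1, 0.0], [1.0, 0.0, 0.1], [0.0, 0.1, 1.0], [0.1, 0.0, 1.0]]
labels = nearest_neighbor_merge(integrated_distance([view_a, view_b]))
print(labels)
```

Note that the merge step needs no threshold or cluster count, which is one way a procedure of this kind can avoid parameter selection.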
LG Jul 7, 2020
Bidirectional Loss Function for Label Enhancement and Distribution Learning
Xinyuan Liu, Jihua Zhu, Qinghai Zheng et al.
Label distribution learning (LDL) is an interpretable and general learning paradigm that has been applied in many real-world applications. In contrast to the simple logical vector in single-label learning (SLL) and multi-label learning (MLL), LDL assigns labels with a description degree to each instance. In practice, two challenges exist in LDL, namely, how to address the dimensional gap problem during the learning process of LDL and how to exactly recover label distributions from existing logical labels, i.e., Label Enhancement (LE). Most existing LDL and LE algorithms ignore the fact that the dimension of the input matrix is much higher than that of the output matrix, so the unidirectional projection typically causes a dimensional reduction, and the valuable information hidden in the feature space is lost during the mapping process. To this end, this study considers a bidirectional projection function that can be applied to LE and LDL problems simultaneously. More specifically, this novel loss function not only considers the mapping errors generated from the projection of the input space into the output space but also accounts for the reconstruction errors generated from the projection of the output space back to the input space. This loss function aims to reconstruct the input data from the output data and is therefore expected to obtain more accurate results. Finally, experiments on several real-world datasets are carried out to demonstrate the superiority of the proposed method for both LE and LDL.
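A loss with the two error terms described above might look like the following minimal sketch. The linear projections, the squared Frobenius error, the `trade_off` weight, and the choice to reconstruct from the predicted labels rather than the target labels are all assumptions made for illustration; the abstract does not specify them.

```python
def matmul(a, b):
    """Plain nested-list matrix product."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def squared_frobenius(a, b):
    """Sum of squared element-wise differences between two matrices."""
    return sum((x - y) ** 2 for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))


def bidirectional_loss(x, d, w_forward, w_backward, trade_off):
    """Mapping error (input -> labels) plus weighted reconstruction error
    (labels -> input). All argument names are hypothetical."""
    predicted = matmul(x, w_forward)               # input space -> label space
    reconstructed = matmul(predicted, w_backward)  # label space -> input space
    mapping_error = squared_frobenius(predicted, d)
    reconstruction_error = squared_frobenius(reconstructed, x)
    return mapping_error + trade_off * reconstruction_error


x = [[1.0, 2.0]]            # one sample, two features
d = [[1.0]]                 # target label description degree
w_forward = [[1.0], [0.0]]  # 2 features -> 1 label
w_lossy = [[1.0, 0.0]]      # back-projection that loses the second feature
w_exact = [[1.0, 2.0]]      # back-projection that recovers the input

print(bidirectional_loss(x, d, w_forward, w_lossy, trade_off=0.5))
print(bidirectional_loss(x, d, w_forward, w_exact, trade_off=0.5))
```

Both calls have zero mapping error, so a unidirectional loss could not tell them apart; only the reconstruction term separates the back-projection that preserves the input from the one that does not.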
LG Apr 7, 2020
Consistent and Complementary Graph Regularized Multi-view Subspace Clustering
Qinghai Zheng, Jihua Zhu, Zhongyu Li et al.
This study investigates the problem of multi-view clustering, where multiple views contain consistent information and each view also includes complementary information. Exploration of all information is crucial for good multi-view clustering. However, most traditional methods blindly or crudely combine multiple views for clustering and are unable to fully exploit the valuable information. Therefore, we propose a method that involves consistent and complementary graph-regularized multi-view subspace clustering (GRMSC), which simultaneously integrates a consistent graph regularizer with a complementary graph regularizer into the objective function. In particular, the consistent graph regularizer learns the intrinsic affinity relationship of data points shared by all views. The complementary graph regularizer investigates the specific information of multiple views. It is noteworthy that the consistent and complementary regularizers are formulated by two different graphs constructed from the first-order proximity and second-order proximity of multiple views, respectively. The objective function is optimized by the augmented Lagrangian multiplier method in order to achieve multi-view clustering. Extensive experiments on six benchmark datasets serve to validate the effectiveness of the proposed method over other state-of-the-art multi-view clustering methods.
LG Apr 7, 2020
Generalized Label Enhancement with Sample Correlations
Qinghai Zheng, Jihua Zhu, Haoyu Tang et al.
Recently, label distribution learning (LDL) has drawn much attention in machine learning, where an LDL model is learned from labeled instances. Different from single-label and multi-label annotations, label distributions describe an instance by multiple labels with different intensities and accommodate more general scenarios. Since most existing machine learning datasets merely provide logical labels, label distributions are unavailable in many real-world applications. To handle this problem, we propose two novel label enhancement methods, i.e., Label Enhancement with Sample Correlations (LESC) and generalized Label Enhancement with Sample Correlations (gLESC). More specifically, LESC employs a low-rank representation of samples in the feature space, and gLESC leverages a tensor multi-rank minimization to further investigate the sample correlations in both the feature space and the label space. Benefiting from the sample correlations, the proposed methods can boost the performance of label enhancement. Extensive experiments on 14 benchmark datasets demonstrate the effectiveness and superiority of our methods.
LG Jun 19, 2019
Constrained Bilinear Factorization Multi-view Subspace Clustering
Qinghai Zheng, Jihua Zhu, Zhiqiang Tian et al.
Multi-view clustering is an important and fundamental problem. Many multi-view subspace clustering methods have been proposed, and most of them assume that all views share the same coefficient matrix. However, the underlying information of multi-view data is not fully exploited under this assumption, since the coefficient matrices of different views should have the same clustering properties rather than be identical across views. To this end, this paper proposes a novel Constrained Bilinear Factorization Multi-view Subspace Clustering (CBF-MSC) method. Specifically, a bilinear factorization with an orthonormality constraint and a low-rank constraint is imposed on all coefficient matrices so that they have the same trace norm instead of being equal, so as to explore the consensus information of multi-view data more fully. Finally, an Augmented Lagrangian Multiplier (ALM) based algorithm is designed to optimize the objective function. Comprehensive experiments on nine benchmark datasets validate the effectiveness and competitiveness of the proposed approach compared with several state-of-the-art methods.
LG Jan 30, 2019
Feature Concatenation Multi-view Subspace Clustering
Qinghai Zheng, Jihua Zhu, Zhongyu Li et al.
Multi-view clustering is a learning paradigm based on multi-view data. Since the statistical properties of different views are diverse, even incompatible, few approaches implement multi-view clustering directly on the concatenated features. However, feature concatenation is a natural way to combine multi-view data. To this end, this paper proposes a novel multi-view subspace clustering approach dubbed Feature Concatenation Multi-view Subspace Clustering (FCMSC), which boosts the clustering performance by exploring the consensus information of multi-view data. Specifically, multi-view data are first concatenated into a joint representation, and then the $l_{2,1}$-norm is integrated into the objective function to deal with the sample-specific and cluster-specific corruptions of multiple views. Moreover, a graph regularized FCMSC is also proposed in this paper to explore both the consensus information and the complementary information of multi-view data for clustering. It is noteworthy that the obtained coefficient matrix is not derived by simply applying the Low-Rank Representation (LRR) to the concatenated features directly. Finally, an effective algorithm based on the Augmented Lagrangian Multiplier (ALM) is designed to optimize the objective functions. Comprehensive experiments on six real-world datasets illustrate the superiority of the proposed methods over several state-of-the-art approaches for multi-view clustering.
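As a small aid for the $l_{2,1}$-norm mentioned above, the sketch below computes it under the column-wise convention common in the low-rank representation literature, where each column is one sample: the norm is the sum of the Euclidean norms of the columns. The convention and the function name are assumptions; some papers define the norm row-wise instead.

```python
import math


def l21_norm(matrix):
    """Sum of the Euclidean norms of the columns of a nested-list matrix."""
    rows, cols = len(matrix), len(matrix[0])
    return sum(
        math.sqrt(sum(matrix[i][j] ** 2 for i in range(rows)))
        for j in range(cols)
    )


# Columns are samples: the first is heavily corrupted, the second is clean.
error = [
    [3.0, 0.0, 0.0],
    [4.0, 0.0, 1.0],
]
print(l21_norm(error))  # column norms are 5, 0, and 1
```

Because the penalty grows with the number of non-zero columns rather than with individual entries, minimizing it tends to concentrate the error on a few samples, which is why it is often used for sample-specific corruption.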