CVMar 6, 2023
Adaptive Texture Filtering for Single-Domain Generalized SegmentationXinhui Li, Mingjia Li, Yaxing Wang et al.
Domain generalization in semantic segmentation aims to alleviate the performance degradation on unseen domains through learning domain-invariant features. Existing methods diversify images in the source domain by adding complex or even abnormal textures to reduce the sensitivity to domain specific features. However, these approaches depend heavily on the richness of the texture bank, and training them can be time-consuming. In contrast to importing textures arbitrarily or augmenting styles randomly, we focus on the single source domain itself to achieve generalization. In this paper, we present a novel adaptive texture filtering mechanism to suppress the influence of texture without using augmentation, thus eliminating the interference of domain-specific features. Further, we design a hierarchical guidance generalization network equipped with structure-guided enhancement modules, which purpose is to learn the domain-invariant generalized knowledge. Extensive experiments together with ablation studies on widely-used datasets are conducted to verify the effectiveness of the proposed model, and reveal its superiority over other state-of-the-art alternatives.
CVMar 14, 2022
Bures Joint Distribution Alignment with Dynamic Margin for Unsupervised Domain AdaptationYong-Hui Liu, Chuan-Xian Ren, Xiao-Lin Xu et al.
Unsupervised domain adaptation (UDA) is one of the prominent tasks of transfer learning, and it provides an effective approach to mitigate the distribution shift between the labeled source domain and the unlabeled target domain. Prior works mainly focus on aligning the marginal distributions or the estimated class-conditional distributions. However, the joint dependency among the feature and the label is crucial for the adaptation task and is not fully exploited. To address this problem, we propose the Bures Joint Distribution Alignment (BJDA) algorithm which directly models the joint distribution shift based on the optimal transport theory in the infinite-dimensional kernel spaces. Specifically, we propose a novel alignment loss term that minimizes the kernel Bures-Wasserstein distance between the joint distributions. Technically, BJDA can effectively capture the nonlinear structures underlying the data. In addition, we introduce a dynamic margin in contrastive learning phase to flexibly characterize the class separability and improve the discriminative ability of representations. It also avoids the cross-validation procedure to determine the margin parameter in traditional triplet loss based methods. Extensive experiments show that BJDA is very effective for the UDA tasks, as it outperforms state-of-the-art algorithms in most experimental settings. In particular, BJDA improves the average accuracy of UDA tasks by 2.8% on Adaptiope, 1.4% on Office-Caltech10, and 1.1% on ImageCLEF-DA.
LGAug 13, 2024
COD: Learning Conditional Invariant Representation for Domain Adaptation RegressionHao-Ran Yang, Chuan-Xian Ren, You-Wei Luo
Aiming to generalize the label knowledge from a source domain with continuous outputs to an unlabeled target domain, Domain Adaptation Regression (DAR) is developed for complex practical learning problems. However, due to the continuity problem in regression, existing conditional distribution alignment theory and methods with discrete prior, which are proven to be effective in classification settings, are no longer applicable. In this work, focusing on the feasibility problems in DAR, we establish the sufficiency theory for the regression model, which shows the generalization error can be sufficiently dominated by the cross-domain conditional discrepancy. Further, to characterize conditional discrepancy with continuous conditioning variable, a novel Conditional Operator Discrepancy (COD) is proposed, which admits the metric property on conditional distributions via the kernel embedding theory. Finally, to minimize the discrepancy, a COD-based conditional invariant representation learning model is proposed, and the reformulation is derived to show that reasonable modifications on moment statistics can further improve the discriminability of the adaptation model. Extensive experiments on standard DAR datasets verify the validity of theoretical results and the superiority over SOTA DAR methods.
38.4AIMay 7
Locality-aware Private Class Identification for Domain Adaptation with Extreme Label ShiftChuan-Xian Ren, Cheng-Jun Guo, Hong Yan
Domain adaptation aims to transfer knowledge from a labeled source domain to an unlabeled target domain with different distributions. In real-world scenarios, the label spaces of the two domains often have an inclusion relationship, where some classes exist only in one domain but not the other. These non-overlapping classes are referred to as private classes. Identifying private class samples and mitigating their adverse effects is critical in the literature. Existing methods rely on the assumption that shifts in private classes are large enough to be considered outliers. However, the variance within a single shared class can be significantly larger than the difference between a private class and another shared class, challenging this assumption. Consequently, private classes substantially increase the difficulty of cross-domain classification. To address these issues, based on local transportation and metric properties of optimal transport (OT), a locality-aware private class identification approach is proposed in the form of a score function on transport mass. The effectiveness of the proposed approach is theoretically proven, highlighting the score function's strong ability to distinguish between shared and private class samples. Building on this, we introduce a reliable OT-based method (ReOT) for domain adaptation under severe label shift. ReOT minimizes classification risk while learning the separated cluster structure between the identified shared classes and private classes, effectively avoiding mismatch between shared-private sample pairs, thus ensuring that important knowledge is reliably transported intra-class to mitigate class-conditional discrepancy. Furthermore, a generalization upper bound of the target risk is provided for extreme label shift scenarios, which can be minimized by ReOT. Extensive experiments on benchmarks validate the effectiveness of ReOT.
CVAug 13, 2025
Multi-Sequence Parotid Gland Lesion Segmentation via Expert Text-Guided Segment Anything ModelZhongyuan Wu, Chuan-Xian Ren, Yu Wang et al.
Parotid gland lesion segmentation is essential for the treatment of parotid gland diseases. However, due to the variable size and complex lesion boundaries, accurate parotid gland lesion segmentation remains challenging. Recently, the Segment Anything Model (SAM) fine-tuning has shown remarkable performance in the field of medical image segmentation. Nevertheless, SAM's interaction segmentation model relies heavily on precise lesion prompts (points, boxes, masks, etc.), which are very difficult to obtain in real-world applications. Besides, current medical image segmentation methods are automatically generated, ignoring the domain knowledge of medical experts when performing segmentation. To address these limitations, we propose the parotid gland segment anything model (PG-SAM), an expert diagnosis text-guided SAM incorporating expert domain knowledge for cross-sequence parotid gland lesion segmentation. Specifically, we first propose an expert diagnosis report guided prompt generation module that can automatically generate prompt information containing the prior domain knowledge to guide the subsequent lesion segmentation process. Then, we introduce a cross-sequence attention module, which integrates the complementary information of different modalities to enhance the segmentation effect. Finally, the multi-sequence image features and generated prompts are feed into the decoder to get segmentation result. Experimental results demonstrate that PG-SAM achieves state-of-the-art performance in parotid gland lesion segmentation across three independent clinical centers, validating its clinical applicability and the effectiveness of diagnostic text for enhancing image segmentation in real-world clinical settings.
CVAug 13, 2025
CWFBind: Geometry-Awareness for Fast and Accurate Protein-Ligand DockingLiyan Jia, Chuan-Xian Ren, Hong Yan
Accurately predicting the binding conformation of small-molecule ligands to protein targets is a critical step in rational drug design. Although recent deep learning-based docking surpasses traditional methods in speed and accuracy, many approaches rely on graph representations and language model-inspired encoders while neglecting critical geometric information, resulting in inaccurate pocket localization and unrealistic binding conformations. In this study, we introduce CWFBind, a weighted, fast, and accurate docking method based on local curvature features. Specifically, we integrate local curvature descriptors during the feature extraction phase to enrich the geometric representation of both proteins and ligands, complementing existing chemical, sequence, and structural features. Furthermore, we embed degree-aware weighting mechanisms into the message passing process, enhancing the model's ability to capture spatial structural distinctions and interaction strengths. To address the class imbalance challenge in pocket prediction, CWFBind employs a ligand-aware dynamic radius strategy alongside an enhanced loss function, facilitating more precise identification of binding regions and key residues. Comprehensive experimental evaluations demonstrate that CWFBind achieves competitive performance across multiple docking benchmarks, offering a balanced trade-off between accuracy and efficiency.
LGJul 27, 2025
Partial Domain Adaptation via Importance Sampling-based Shift CorrectionCheng-Jun Guo, Chuan-Xian Ren, You-Wei Luo et al.
Partial domain adaptation (PDA) is a challenging task in real-world machine learning scenarios. It aims to transfer knowledge from a labeled source domain to a related unlabeled target domain, where the support set of the source label distribution subsumes the target one. Previous PDA works managed to correct the label distribution shift by weighting samples in the source domain. However, the simple reweighing technique cannot explore the latent structure and sufficiently use the labeled data, and then models are prone to over-fitting on the source domain. In this work, we propose a novel importance sampling-based shift correction (IS$^2$C) method, where new labeled data are sampled from a built sampling domain, whose label distribution is supposed to be the same as the target domain, to characterize the latent structure and enhance the generalization ability of the model. We provide theoretical guarantees for IS$^2$C by proving that the generalization error can be sufficiently dominated by IS$^2$C. In particular, by implementing sampling with the mixture distribution, the extent of shift between source and sampling domains can be connected to generalization error, which provides an interpretable way to build IS$^2$C. To improve knowledge transfer, an optimal transport-based independence criterion is proposed for conditional distribution alignment, where the computation of the criterion can be adjusted to reduce the complexity from $\mathcal{O}(n^3)$ to $\mathcal{O}(n^2)$ in realistic PDA scenarios. Extensive experiments on PDA benchmarks validate the theoretical results and demonstrate the effectiveness of our IS$^2$C over existing methods.
LGMay 19, 2025
Bi-level Unbalanced Optimal Transport for Partial Domain AdaptationZi-Ying Chen, Chuan-Xian Ren, Hong Yan
Partial domain adaptation (PDA) problem requires aligning cross-domain samples while distinguishing the outlier classes for accurate knowledge transfer. The widely used weighting framework tries to address the outlier classes by introducing the reweighed source domain with a similar label distribution to the target domain. However, the empirical modeling of weights can only characterize the sample-wise relations, which leads to insufficient exploration of cluster structures, and the weights could be sensitive to the inaccurate prediction and cause confusion on the outlier classes. To tackle these issues, we propose a Bi-level Unbalanced Optimal Transport (BUOT) model to simultaneously characterize the sample-wise and class-wise relations in a unified transport framework. Specifically, a cooperation mechanism between sample-level and class-level transport is introduced, where the sample-level transport provides essential structure information for the class-level knowledge transfer, while the class-level transport supplies discriminative information for the outlier identification. The bi-level transport plan provides guidance for the alignment process. By incorporating the label-aware transport cost, the local transport structure is ensured and a fast computation formulation is derived to improve the efficiency. Extensive experiments on benchmark datasets validate the competitiveness of BUOT.
CVMay 19, 2025
A Generalized Label Shift Perspective for Cross-Domain Gaze EstimationHao-Ran Yang, Xiaohui Chen, Chuan-Xian Ren
Aiming to generalize the well-trained gaze estimation model to new target domains, Cross-domain Gaze Estimation (CDGE) is developed for real-world application scenarios. Existing CDGE methods typically extract the domain-invariant features to mitigate domain shift in feature space, which is proved insufficient by Generalized Label Shift (GLS) theory. In this paper, we introduce a novel GLS perspective to CDGE and modelize the cross-domain problem by label and conditional shift problem. A GLS correction framework is presented and a feasible realization is proposed, in which a importance reweighting strategy based on truncated Gaussian distribution is introduced to overcome the continuity challenges in label shift correction. To embed the reweighted source distribution to conditional invariant learning, we further derive a probability-aware estimation of conditional operator discrepancy. Extensive experiments on standard CDGE tasks with different backbone models validate the superior generalization capability across domain and applicability on various models of proposed method.
CVMay 3, 2025
Soft-Masked Semi-Dual Optimal Transport for Partial Domain AdaptationYi-Ming Zhai, Chuan-Xian Ren, Hong Yan
Visual domain adaptation aims to learn discriminative and domain-invariant representation for an unlabeled target domain by leveraging knowledge from a labeled source domain. Partial domain adaptation (PDA) is a general and practical scenario in which the target label space is a subset of the source one. The challenges of PDA exist due to not only domain shift but also the non-identical label spaces of domains. In this paper, a Soft-masked Semi-dual Optimal Transport (SSOT) method is proposed to deal with the PDA problem. Specifically, the class weights of domains are estimated, and then a reweighed source domain is constructed, which is favorable in conducting class-conditional distribution matching with the target domain. A soft-masked transport distance matrix is constructed by category predictions, which will enhance the class-oriented representation ability of optimal transport in the shared feature space. To deal with large-scale optimal transport problems, the semi-dual formulation of the entropy-regularized Kantorovich problem is employed since it can be optimized by gradient-based algorithms. Further, a neural network is exploited to approximate the Kantorovich potential due to its strong fitting ability. This network parametrization also allows the generalization of the dual variable outside the supports of the input distribution. The SSOT model is built upon neural networks, which can be optimized alternately in an end-to-end manner. Extensive experiments are conducted on four benchmark datasets to demonstrate the effectiveness of SSOT.
LGMay 2, 2025
A Physics-preserved Transfer Learning Method for Differential EquationsHao-Ran Yang, Chuan-Xian Ren
While data-driven methods such as neural operator have achieved great success in solving differential equations (DEs), they suffer from domain shift problems caused by different learning environments (with data bias or equation changes), which can be alleviated by transfer learning (TL). However, existing TL methods adopted in DEs problems lack either generalizability in general DEs problems or physics preservation during training. In this work, we focus on a general transfer learning method that adaptively correct the domain shift and preserve physical information. Mathematically, we characterize the data domain as product distribution and the essential problems as distribution bias and operator bias. A Physics-preserved Optimal Tensor Transport (POTT) method that simultaneously admits generalizability to common DEs and physics preservation of specific problem is proposed to adapt the data-driven model to target domain utilizing the push-forward distribution induced by the POTT map. Extensive experiments demonstrate the superior performance, generalizability and physics preservation of the proposed POTT method.
CVJun 24, 2024
Geometric Understanding of Discriminability and Transferability for Visual Domain AdaptationYou-Wei Luo, Chuan-Xian Ren, Xiao-Lin Xu et al.
To overcome the restriction of identical distribution assumption, invariant representation learning for unsupervised domain adaptation (UDA) has made significant advances in computer vision and pattern recognition communities. In UDA scenario, the training and test data belong to different domains while the task model is learned to be invariant. Recently, empirical connections between transferability and discriminability have received increasing attention, which is the key to understanding the invariant representations. However, theoretical study of these abilities and in-depth analysis of the learned feature structures are unexplored yet. In this work, we systematically analyze the essentials of transferability and discriminability from the geometric perspective. Our theoretical results provide insights into understanding the co-regularization relation and prove the possibility of learning these abilities. From methodology aspect, the abilities are formulated as geometric properties between domain/cluster subspaces (i.e., orthogonality and equivalence) and characterized as the relation between the norms/ranks of multiple matrices. Two optimization-friendly learning principles are derived, which also ensure some intuitive explanations. Moreover, a feasible range for the co-regularization parameters is deduced to balance the learning of geometric structures. Based on the theoretical results, a geometry-oriented model is proposed for enhancing the transferability and discriminability via nuclear norm optimization. Extensive experiment results validate the effectiveness of the proposed model in empirical applications, and verify that the geometric abilities can be sufficiently learned in the derived feasible range.
LGJun 24, 2024
When Invariant Representation Learning Meets Label Shift: Insufficiency and Theoretical InsightsYou-Wei Luo, Chuan-Xian Ren
As a crucial step toward real-world learning scenarios with changing environments, dataset shift theory and invariant representation learning algorithm have been extensively studied to relax the identical distribution assumption in classical learning setting. Among the different assumptions on the essential of shifting distributions, generalized label shift (GLS) is the latest developed one which shows great potential to deal with the complex factors within the shift. In this paper, we aim to explore the limitations of current dataset shift theory and algorithm, and further provide new insights by presenting a comprehensive understanding of GLS. From theoretical aspect, two informative generalization bounds are derived, and the GLS learner is proved to be sufficiently close to optimal target model from the Bayesian perspective. The main results show the insufficiency of invariant representation learning, and prove the sufficiency and necessity of GLS correction for generalization, which provide theoretical supports and innovations for exploring generalizable model under dataset shift. From methodological aspect, we provide a unified view of existing shift correction frameworks, and propose a kernel embedding-based correction algorithm (KECA) to minimize the generalization error and achieve successful knowledge transfer. Both theoretical results and extensive experiment evaluations demonstrate the sufficiency and necessity of GLS correction for addressing dataset shift and the superiority of proposed algorithm.
LGFeb 26, 2022
Generalized Label Shift Correction via Minimum Uncertainty Principle: Theory and AlgorithmYou-Wei Luo, Chuan-Xian Ren
As a fundamental problem in machine learning, dataset shift induces a paradigm to learn and transfer knowledge under changing environment. Previous methods assume the changes are induced by covariate, which is less practical for complex real-world data. We consider the Generalized Label Shift (GLS), which provides an interpretable insight into the learning and transfer of desirable knowledge. Current GLS methods: 1) are not well-connected with the statistical learning theory; 2) usually assume the shifting conditional distributions will be matched with an implicit transformation, but its explicit modeling is unexplored. In this paper, we propose a conditional adaptation framework to deal with these challenges. From the perspective of learning theory, we prove that the generalization error of conditional adaptation is lower than previous covariate adaptation. Following the theoretical results, we propose the minimum uncertainty principle to learn conditional invariant transformation via discrepancy optimization. Specifically, we propose the \textit{conditional metric operator} on Hilbert space to characterize the distinctness of conditional distributions. For finite observations, we prove that the empirical estimation is always well-defined and will converge to underlying truth as sample size increases. The results of extensive experiments demonstrate that the proposed model achieves competitive performance under different GLS scenarios.
IVFeb 19, 2022
SPNet: A novel deep neural network for retinal vessel segmentation based on shared decoder and pyramid-like lossGeng-Xin Xu, Chuan-Xian Ren
Segmentation of retinal vessel images is critical to the diagnosis of retinopathy. Recently, convolutional neural networks have shown significant ability to extract the blood vessel structure. However, it remains challenging to refined segmentation for the capillaries and the edges of retinal vessels due to thickness inconsistencies and blurry boundaries. In this paper, we propose a novel deep neural network for retinal vessel segmentation based on shared decoder and pyramid-like loss (SPNet) to address the above problems. Specifically, we introduce a decoder-sharing mechanism to capture multi-scale semantic information, where feature maps at diverse scales are decoded through a sequence of weight-sharing decoder modules. Also, to strengthen characterization on the capillaries and the edges of blood vessels, we define a residual pyramid architecture which decomposes the spatial information in the decoding phase. A pyramid-like loss function is designed to compensate possible segmentation errors progressively. Experimental results on public benchmarks show that the proposed method outperforms the backbone network and the state-of-the-art methods, especially in the regions of the capillaries and the vessel contours. In addition, performances on cross-datasets verify that SPNet shows stronger generalization ability.
LGDec 21, 2021
Geometry-Aware Unsupervised Domain AdaptationYou-Wei Luo, Chuan-Xian Ren, Zi-Ying Chen
Unsupervised Domain Adaptation (UDA) aims to transfer the knowledge from the labeled source domain to the unlabeled target domain in the presence of dataset shift. Most existing methods cannot address the domain alignment and class discrimination well, which may distort the intrinsic data structure for downstream tasks (e.g., classification). To this end, we propose a novel geometry-aware model to learn the transferability and discriminability simultaneously via nuclear norm optimization. We introduce the domain coherence and class orthogonality for UDA from the perspective of subspace geometry. The domain coherence will ensure the model has a larger capacity for learning separable representations, and class orthogonality will minimize the correlation between clusters to alleviate the misalignment. So, they are consistent and can benefit from each other. Besides, we provide a theoretical insight into the norm-based learning literature in UDA, which ensures the interpretability of our model. We show that the norms of domains and clusters are expected to be larger and smaller to enhance the transferability and discriminability, respectively. Extensive experimental results on standard UDA datasets demonstrate the effectiveness of our theory and model.
IVSep 8, 2021
Cross-Site Severity Assessment of COVID-19 from CT Images via Domain AdaptationGeng-Xin Xu, Chen Liu, Jun Liu et al.
Early and accurate severity assessment of Coronavirus disease 2019 (COVID-19) based on computed tomography (CT) images offers a great help to the estimation of intensive care unit event and the clinical decision of treatment planning. To augment the labeled data and improve the generalization ability of the classification model, it is necessary to aggregate data from multiple sites. This task faces several challenges including class imbalance between mild and severe infections, domain distribution discrepancy between sites, and presence of heterogeneous features. In this paper, we propose a novel domain adaptation (DA) method with two components to address these problems. The first component is a stochastic class-balanced boosting sampling strategy that overcomes the imbalanced learning problem and improves the classification performance on poorly-predicted classes. The second component is a representation learning that guarantees three properties: 1) domain-transferability by prototype triplet loss, 2) discriminant by conditional maximum mean discrepancy loss, and 3) completeness by multi-view reconstruction loss. Particularly, we propose a domain translator and align the heterogeneous data to the estimated class prototypes (i.e., class centers) in a hyper-sphere manifold. Experiments on cross-site severity assessment of COVID-19 from CT images show that the proposed method can effectively tackle the imbalanced learning problem and outperform recent DA approaches.
LGJul 31, 2021
Conditional Bures Metric for Domain AdaptationYou-Wei Luo, Chuan-Xian Ren
As a vital problem in classification-oriented transfer, unsupervised domain adaptation (UDA) has attracted widespread attention in recent years. Previous UDA methods assume the marginal distributions of different domains are shifted while ignoring the discriminant information in the label distributions. This leads to classification performance degeneration in real applications. In this work, we focus on the conditional distribution shift problem which is of great concern to current conditional invariant models. We aim to seek a kernel covariance embedding for conditional distribution which remains yet unexplored. Theoretically, we propose the Conditional Kernel Bures (CKB) metric for characterizing conditional distribution discrepancy, and derive an empirical estimation for the CKB metric without introducing the implicit kernel feature map. It provides an interpretable approach to understand the knowledge transfer mechanism. The established consistency theory of the empirical estimation provides a theoretical guarantee for convergence. A conditional distribution matching network is proposed to learn the conditional invariant and discriminative features for UDA. Extensive experiments and analysis show the superiority of our proposed model.
CVAug 25, 2020
Learning Target Domain Specific Classifier for Partial Domain AdaptationChuan-Xian Ren, Pengfei Ge, Peiyi Yang et al.
Unsupervised domain adaptation~(UDA) aims at reducing the distribution discrepancy when transferring knowledge from a labeled source domain to an unlabeled target domain. Previous UDA methods assume that the source and target domains share an identical label space, which is unrealistic in practice since the label information of the target domain is agnostic. This paper focuses on a more realistic UDA scenario, i.e. partial domain adaptation (PDA), where the target label space is subsumed to the source label space. In the PDA scenario, the source outliers that are absent in the target domain may be wrongly matched to the target domain (technically named negative transfer), leading to performance degradation of UDA methods. This paper proposes a novel Target Domain Specific Classifier Learning-based Domain Adaptation (TSCDA) method. TSCDA presents a soft-weighed maximum mean discrepancy criterion to partially align feature distributions and alleviate negative transfer. Also, it learns a target-specific classifier for the target domain with pseudo-labels and multiple auxiliary classifiers, to further address classifier shift. A module named Peers Assisted Learning is used to minimize the prediction difference between multiple target-specific classifiers, which makes the classifiers more discriminant for the target domain. Extensive experiments conducted on three PDA benchmark datasets show that TSCDA outperforms other state-of-the-art methods with a large margin, e.g. $4\%$ and $5.6\%$ averagely on Office-31 and Office-Home, respectively.
CVAug 24, 2020
Learning Kernel for Conditional Moment-Matching Discrepancy-based Image ClassificationChuan-Xian Ren, Pengfei Ge, Dao-Qing Dai et al.
Conditional Maximum Mean Discrepancy (CMMD) can capture the discrepancy between conditional distributions by drawing support from nonlinear kernel functions, thus it has been successfully used for pattern classification. However, CMMD does not work well on complex distributions, especially when the kernel function fails to correctly characterize the difference between intra-class similarity and inter-class similarity. In this paper, a new kernel learning method is proposed to improve the discrimination performance of CMMD. It can be operated with deep network features iteratively and thus denoted as KLN for abbreviation. The CMMD loss and an auto-encoder (AE) are used to learn an injective function. By considering the compound kernel, i.e., the injective function with a characteristic kernel, the effectiveness of CMMD for data category description is enhanced. KLN can simultaneously learn a more expressive kernel and label prediction distribution, thus, it can be used to improve the classification performance in both supervised and semi-supervised learning scenarios. In particular, the kernel-based similarities are iteratively learned on the deep network features, and the algorithm can be implemented in an end-to-end manner. Extensive experiments are conducted on four benchmark datasets, including MNIST, SVHN, CIFAR-10 and CIFAR-100. The results indicate that KLN achieves state-of-the-art classification performance.
CVAug 23, 2020
Dual Adversarial Auto-Encoders for ClusteringPengfei Ge, Chuan-Xian Ren, Jiashi Feng et al.
As a powerful approach for exploratory data analysis, unsupervised clustering is a fundamental task in computer vision and pattern recognition. Many clustering algorithms have been developed, but most of them perform unsatisfactorily on the data with complex structures. Recently, Adversarial Auto-Encoder (AAE) shows effectiveness on tackling such data by combining Auto-Encoder (AE) and adversarial training, but it cannot effectively extract classification information from the unlabeled data. In this work, we propose Dual Adversarial Auto-encoder (Dual-AAE) which simultaneously maximizes the likelihood function and mutual information between observed examples and a subset of latent variables. By performing variational inference on the objective function of Dual-AAE, we derive a new reconstruction loss which can be optimized by training a pair of Auto-encoders. Moreover, to avoid mode collapse, we introduce the clustering regularization term for the category variable. Experiments on four benchmarks show that Dual-AAE achieves superior performance over state-of-the-art clustering methods. Besides, by adding a reject option, the clustering accuracy of Dual-AAE can reach that of supervised CNN algorithms. Dual-AAE can also be used for disentangling style and content of images without using supervised information.
LGAug 23, 2020
Unsupervised Domain Adaptation via Discriminative Manifold PropagationYou-Wei Luo, Chuan-Xian Ren, Dao-Qing Dai et al.
Unsupervised domain adaptation is effective in leveraging rich information from a labeled source domain to an unlabeled target domain. Though deep learning and adversarial strategy made a significant breakthrough in the adaptability of features, there are two issues to be further studied. First, hard-assigned pseudo labels on the target domain are arbitrary and error-prone, and direct application of them may destroy the intrinsic data structure. Second, batch-wise training of deep learning limits the characterization of the global structure. In this paper, a Riemannian manifold learning framework is proposed to achieve transferability and discriminability simultaneously. For the first issue, this framework establishes a probabilistic discriminant criterion on the target domain via soft labels. Based on pre-built prototypes, this criterion is extended to a global approximation scheme for the second issue. Manifold metric alignment is adopted to be compatible with the embedding space. The theoretical error bounds of different alignment metrics are derived for constructive guidance. The proposed method can be used to tackle a series of variants of domain adaptation problems, including both vanilla and partial settings. Extensive experiments have been conducted to investigate the method and a comparative study shows the superiority of the discriminative manifold learning framework.
CVAug 23, 2020
Discriminative Residual Analysis for Image Set Classification with Posture and Age VariationsChuan-Xian Ren, You-Wei Luo, Xiao-Lin Xu et al.
Image set recognition has been widely applied in many practical problems like real-time video retrieval and image caption tasks. Due to its superior performance, it has grown into a significant topic in recent years. However, images with complicated variations, e.g., postures and human ages, are difficult to address, as these variations are continuous and gradual with respect to image appearance. Consequently, the crucial point of image set recognition is to mine the intrinsic connection or structural information from the image batches with variations. In this work, a Discriminant Residual Analysis (DRA) method is proposed to improve the classification performance by discovering discriminant features in related and unrelated groups. Specifically, DRA attempts to obtain a powerful projection which casts the residual representations into a discriminant subspace. Such a projection subspace is expected to magnify the useful information of the input space as much as possible, then the relation between the training set and the test set described by the given metric or distance will be more precise in the discriminant subspace. We also propose a nonfeasance strategy by defining another approach to construct the unrelated groups, which help to reduce furthermore the cost of sampling errors. Two regularization approaches are used to deal with the probable small sample size problem. Extensive experiments are conducted on benchmark databases, and the results show superiority and efficiency of the new methods.
CVJun 14, 2020
Domain Adaptation and Image Classification via Deep Conditional Adaptation NetworkPengfei Ge, Chuan-Xian Ren, Dao-Qing Dai et al.
Unsupervised domain adaptation aims to generalize the supervised model trained on a source domain to an unlabeled target domain. Marginal distribution alignment of feature spaces is widely used to reduce the domain discrepancy between the source and target domains. However, it assumes that the source and target domains share the same label distribution, which limits their application scope. In this paper, we consider a more general application scenario where the label distributions of the source and target domains are not the same. In this scenario, marginal distribution alignment-based methods will be vulnerable to negative transfer. To address this issue, we propose a novel unsupervised domain adaptation method, Deep Conditional Adaptation Network (DCAN), based on conditional distribution alignment of feature spaces. To be specific, we reduce the domain discrepancy by minimizing the Conditional Maximum Mean Discrepancy between the conditional distributions of deep features on the source and target domains, and extract the discriminant information from target domain by maximizing the mutual information between samples and the prediction labels. In addition, DCAN can be used to address a special scenario, Partial unsupervised domain adaptation, where the target domain category is a subset of the source domain category. Experiments on both unsupervised domain adaptation and Partial unsupervised domain adaptation show that DCAN achieves superior classification performance over state-of-the-art methods.
LGFeb 20, 2020
Unsupervised Domain Adaptation via Discriminative Manifold Embedding and AlignmentYou-Wei Luo, Chuan-Xian Ren, Pengfei Ge et al.
Unsupervised domain adaptation is effective in leveraging the rich information from the source domain to the unsupervised target domain. Though deep learning and adversarial strategy make an important breakthrough in the adaptability of features, there are two issues to be further explored. First, the hard-assigned pseudo labels on the target domain are risky to the intrinsic data structure. Second, the batch-wise training manner in deep learning limits the description of the global structure. In this paper, a Riemannian manifold learning framework is proposed to achieve transferability and discriminability consistently. As to the first problem, this method establishes a probabilistic discriminant criterion on the target domain via soft labels. Further, this criterion is extended to a global approximation scheme for the second issue; such approximation is also memory-saving. The manifold metric alignment is exploited to be compatible with the embedding space. A theoretical error bound is derived to facilitate the alignment. Extensive experiments have been conducted to investigate the proposal and results of the comparison study manifest the superiority of consistent manifold learning framework.
CVMay 14, 2019
Domain Adaptive Person Re-Identification via Camera Style Generation and Label PropagationChuan-Xian Ren, Bo-Hua Liang, Zhen Lei
Unsupervised domain adaptation in person re-identification resorts to labeled source data to promote the model training on target domain, facing the dilemmas caused by large domain shift and large camera variations. The non-overlapping labels challenge that source domain and target domain have entirely different persons further increases the re-identification difficulty. In this paper, we propose a novel algorithm to narrow such domain gaps. We derive a camera style adaptation framework to learn the style-based mappings between different camera views, from the target domain to the source domain, and then we can transfer the identity-based distribution from the source domain to the target domain on the camera level. To overcome the non-overlapping labels challenge and guide the person re-identification model to narrow the gap further, an efficient and effective soft-labeling method is proposed to mine the intrinsic local structure of the target domain through building the connection between GAN-translated source domain and the target domain. Experiment results conducted on real benchmark datasets indicate that our method gets state-of-the-art results.