CVMay 29, 2022Code
Image Super-resolution with An Enhanced Group Convolutional Neural NetworkChunwei Tian, Yixuan Yuan, Shichao Zhang et al.
CNNs with strong learning abilities are widely chosen to resolve super-resolution problem. However, CNNs depend on deeper network architectures to improve performance of image super-resolution, which may increase computational cost in general. In this paper, we present an enhanced super-resolution group CNN (ESRGCNN) with a shallow architecture by fully fusing deep and wide channel features to extract more accurate low-frequency information in terms of correlations of different channels in single image super-resolution (SISR). Also, a signal enhancement operation in the ESRGCNN is useful to inherit more long-distance contextual information for resolving long-term dependency. An adaptive up-sampling operation is gathered into a CNN to obtain an image super-resolution model with low-resolution images of different sizes. Extensive experiments report that our ESRGCNN surpasses the state-of-the-arts in terms of SISR performance, complexity, execution speed, image quality evaluation and visual effect in SISR. Code is found at https://github.com/hellloxiaotian/ESRGCNN.
IVOct 16, 2023Code
A cross Transformer for image denoisingChunwei Tian, Menghua Zheng, Wangmeng Zuo et al.
Deep convolutional neural networks (CNNs) depend on feedforward and feedback ways to obtain good performance in image denoising. However, how to obtain effective structural information via CNNs to efficiently represent given noisy images is key for complex scenes. In this paper, we propose a cross Transformer denoising CNN (CTNet) with a serial block (SB), a parallel block (PB), and a residual block (RB) to obtain clean images for complex scenes. A SB uses an enhanced residual architecture to deeply search structural information for image denoising. To avoid loss of key information, PB uses three heterogeneous networks to implement multiple interactions of multi-level features to broadly search for extra information for improving the adaptability of an obtained denoiser for complex scenes. Also, to improve denoising performance, Transformer mechanisms are embedded into the SB and PB to extract complementary salient features for effectively removing noise in terms of pixel relations. Finally, a RB is applied to acquire clean images. Experiments illustrate that our CTNet is superior to some popular denoising methods in terms of real and synthetic image denoising. It is suitable to mobile digital devices, i.e., phones. Codes can be obtained at https://github.com/hellloxiaotian/CTNet.
QUANT-PHJul 14, 2022
QSAN: A Near-term Achievable Quantum Self-Attention NetworkJinjing Shi, Ren-Xin Zhao, Wenxuan Wang et al.
Self-Attention Mechanism (SAM) is good at capturing the internal connections of features and greatly improves the performance of machine learning models, espeacially requiring efficient characterization and feature extraction of high-dimensional data. A novel Quantum Self-Attention Network (QSAN) is proposed for image classification tasks on near-term quantum devices. First, a Quantum Self-Attention Mechanism (QSAM) including Quantum Logic Similarity (QLS) and Quantum Bit Self-Attention Score Matrix (QBSASM) is explored as the theoretical basis of QSAN to enhance the data representation of SAM. QLS is employed to prevent measurements from obtaining inner products to allow QSAN to be fully implemented on quantum computers, and QBSASM as a result of the evolution of QSAN to produce a density matrix that effectively reflects the attention distribution of the output. Then, the framework for one-step realization and quantum circuits of QSAN are designed for fully considering the compression of the measurement times to acquire QBSASM in the intermediate process, in which a quantum coordinate prototype is introduced as well in the quantum circuit for describing the mathematical relation between the output and control bits to facilitate programming. Ultimately, the method comparision and binary classification experiments on MNIST with the pennylane platform demonstrate that QSAN converges about 1.7x and 2.3x faster than hardware-efficient ansatz and QAOA ansatz respevtively with similar parameter configurations and 100% prediction accuracy, which indicates it has a better learning capability. QSAN is quite suitable for fast and in-depth analysis of the primary and secondary relationships of image and other data, which has great potential for applications of quantum computer vision from the perspective of enhancing the information extraction ability of models.
21.5CVMay 6Code
A cross-modal network for facial expression recognitionChunwei Tian, Jingyuan Xie, Qi Zhang et al.
Deep neural networks enriched with structural information have been widely employed for facial expression recognition tasks. However, these methods often depend on hierarchical information rather than face property to finish expression recognition. In this paper, we propose a cross-modal network with strong biological and structural information for facial expression recognition (CMNet). CMNet can respectively learn expression information via face symmetry on a whole face, left and right half faces to extract complementary facial features. To prevent negative effect of biological and structural information fusion, a salient facial information refinement module can obtain salient facial expression information to improve stability of an obtained facial expression classifier. To reduce reliance on unilateral facial features, a half-face alignment optimization mechanism is designed to align obtained expression information of learned left and right half faces. Our experimental results demonstrate that CMNet outperforms several novel methods, i.e., SCN and LAENet-SA for facial expression recognition. Codes can be obtained at https://github.com/hellloxiaotian/CMNet.
IRSep 30, 2024
Mitigating Propensity Bias of Large Language Models for Recommender SystemsGuixian Zhang, Guan Yuan, Debo Cheng et al.
The rapid development of Large Language Models (LLMs) creates new opportunities for recommender systems, especially by exploiting the side information (e.g., descriptions and analyses of items) generated by these models. However, aligning this side information with collaborative information from historical interactions poses significant challenges. The inherent biases within LLMs can skew recommendations, resulting in distorted and potentially unfair user experiences. On the other hand, propensity bias causes side information to be aligned in such a way that it often tends to represent all inputs in a low-dimensional subspace, leading to a phenomenon known as dimensional collapse, which severely restricts the recommender system's ability to capture user preferences and behaviours. To address these issues, we introduce a novel framework named Counterfactual LLM Recommendation (CLLMR). Specifically, we propose a spectrum-based side information encoder that implicitly embeds structural information from historical interactions into the side information representation, thereby circumventing the risk of dimension collapse. Furthermore, our CLLMR approach explores the causal relationships inherent in LLM-based recommender systems. By leveraging counterfactual inference, we counteract the biases introduced by LLMs. Extensive experiments demonstrate that our CLLMR approach consistently enhances the performance of various recommender models.
IRAug 19, 2024
Interaction-Data-guided Conditional Instrumental Variables for Debiasing Recommender SystemsZhirong Huang, Debo Cheng, Jiuyong Li et al.
It is often challenging to identify a valid instrumental variable (IV), although the IV methods have been regarded as effective tools of addressing the confounding bias introduced by latent variables. To deal with this issue, an Interaction-Data-guided Conditional IV (IDCIV) debiasing method is proposed for Recommender Systems, called IDCIV-RS. The IDCIV-RS automatically generates the representations of valid CIVs and their corresponding conditioning sets directly from interaction data, significantly reducing the complexity of IV selection while effectively mitigating the confounding bias caused by latent variables in recommender systems. Specifically, the IDCIV-RS leverages a variational autoencoder (VAE) to learn both the CIV representations and their conditioning sets from interaction data, followed by the application of least squares to derive causal representations for click prediction. Extensive experiments on two real-world datasets, Movielens-10M and Douban-Movie, demonstrate that IDCIV-RS successfully learns the representations of valid CIVs, effectively reduces bias, and consequently improves recommendation accuracy.
CVMar 4, 2024Code
Perceptive self-supervised learning network for noisy image watermark removalChunwei Tian, Menghua Zheng, Bo Li et al.
Popular methods usually use a degradation model in a supervised way to learn a watermark removal model. However, it is true that reference images are difficult to obtain in the real world, as well as collected images by cameras suffer from noise. To overcome these drawbacks, we propose a perceptive self-supervised learning network for noisy image watermark removal (PSLNet) in this paper. PSLNet depends on a parallel network to remove noise and watermarks. The upper network uses task decomposition ideas to remove noise and watermarks in sequence. The lower network utilizes the degradation model idea to simultaneously remove noise and watermarks. Specifically, mentioned paired watermark images are obtained in a self supervised way, and paired noisy images (i.e., noisy and reference images) are obtained in a supervised way. To enhance the clarity of obtained images, interacting two sub-networks and fusing obtained clean images are used to improve the effects of image watermark removal in terms of structural information and pixel enhancement. Taking into texture information account, a mixed loss uses obtained images and features to achieve a robust model of noisy image watermark removal. Comprehensive experiments show that our proposed method is very effective in comparison with popular convolutional neural networks (CNNs) for noisy image watermark removal. Codes can be obtained at https://github.com/hellloxiaotian/PSLNet.
LGAug 19, 2024
Community-Centric Graph UnlearningYi Li, Shichao Zhang, Guixian Zhang et al.
Graph unlearning technology has become increasingly important since the advent of the `right to be forgotten' and the growing concerns about the privacy and security of artificial intelligence. Graph unlearning aims to quickly eliminate the effects of specific data on graph neural networks (GNNs). However, most existing deterministic graph unlearning frameworks follow a balanced partition-submodel training-aggregation paradigm, resulting in a lack of structural information between subgraph neighborhoods and redundant unlearning parameter calculations. To address this issue, we propose a novel Graph Structure Mapping Unlearning paradigm (GSMU) and a novel method based on it named Community-centric Graph Eraser (CGE). CGE maps community subgraphs to nodes, thereby enabling the reconstruction of a node-level unlearning operation within a reduced mapped graph. CGE makes the exponential reduction of both the amount of training data and the number of unlearning parameters. Extensive experiments conducted on five real-world datasets and three widely used GNN backbones have verified the high performance and efficiency of our CGE method, highlighting its potential in the field of graph unlearning.
LGNov 6, 2025
DeNoise: Learning Robust Graph Representations for Unsupervised Graph-Level Anomaly DetectionQingfeng Chen, Haojin Zeng, Jingyi Jie et al.
With the rapid growth of graph-structured data in critical domains, unsupervised graph-level anomaly detection (UGAD) has become a pivotal task. UGAD seeks to identify entire graphs that deviate from normal behavioral patterns. However, most Graph Neural Network (GNN) approaches implicitly assume that the training set is clean, containing only normal graphs, which is rarely true in practice. Even modest contamination by anomalous graphs can distort learned representations and sharply degrade performance. To address this challenge, we propose DeNoise, a robust UGAD framework explicitly designed for contaminated training data. It jointly optimizes a graph-level encoder, an attribute decoder, and a structure decoder via an adversarial objective to learn noise-resistant embeddings. Further, DeNoise introduces an encoder anchor-alignment denoising mechanism that fuses high-information node embeddings from normal graphs into all graph embeddings, improving representation quality while suppressing anomaly interference. A contrastive learning component then compacts normal graph embeddings and repels anomalous ones in the latent space. Extensive experiments on eight real-world datasets demonstrate that DeNoise consistently learns reliable graph-level representations under varying noise intensities and significantly outperforms state-of-the-art UGAD baselines.
IRAug 19, 2024
Debiased Contrastive Representation Learning for Mitigating Dual Biases in Recommender SystemsZhirong Huang, Shichao Zhang, Debo Cheng et al.
In recommender systems, popularity and conformity biases undermine recommender effectiveness by disproportionately favouring popular items, leading to their over-representation in recommendation lists and causing an unbalanced distribution of user-item historical data. We construct a causal graph to address both biases and describe the abstract data generation mechanism. Then, we use it as a guide to develop a novel Debiased Contrastive Learning framework for Mitigating Dual Biases, called DCLMDB. In DCLMDB, both popularity bias and conformity bias are handled in the model training process by contrastive learning to ensure that user choices and recommended items are not unduly influenced by conformity and popularity. Extensive experiments on two real-world datasets, Movielens-10M and Netflix, show that DCLMDB can effectively reduce the dual biases, as well as significantly enhance the accuracy and diversity of recommendations.
LGJun 6, 2022
Hashing Learning with Hyper-Class RepresentationShichao Zhang, Jiaye Li
Existing unsupervised hash learning is a kind of attribute-centered calculation. It may not accurately preserve the similarity between data. This leads to low down the performance of hash function learning. In this paper, a hash algorithm is proposed with a hyper-class representation. It is a two-steps approach. The first step finds potential decision features and establish hyper-class. The second step constructs hash learning based on the hyper-class information in the first step, so that the hash codes of the data within the hyper-class are as similar as possible, as well as the hash codes of the data between the hyper-classes are as different as possible. To evaluate the efficiency, a series of experiments are conducted on four public datasets. The experimental results show that the proposed hash algorithm is more efficient than the compared algorithms, in terms of mean average precision (MAP), average precision (AP) and Hamming radius 2 (HAM2)
LGJun 16, 2025Code
Uncertainty-Aware Graph Neural Networks: A Multi-Hop Evidence Fusion ApproachQingfeng Chen, Shiyuan Li, Yixin Liu et al.
Graph neural networks (GNNs) excel in graph representation learning by integrating graph structure and node features. Existing GNNs, unfortunately, fail to account for the uncertainty of class probabilities that vary with the depth of the model, leading to unreliable and risky predictions in real-world scenarios. To bridge the gap, in this paper, we propose a novel Evidence Fusing Graph Neural Network (EFGNN for short) to achieve trustworthy prediction, enhance node classification accuracy, and make explicit the risk of wrong predictions. In particular, we integrate the evidence theory with multi-hop propagation-based GNN architecture to quantify the prediction uncertainty of each node with the consideration of multiple receptive fields. Moreover, a parameter-free cumulative belief fusion (CBF) mechanism is developed to leverage the changes in prediction uncertainty and fuse the evidence to improve the trustworthiness of the final prediction. To effectively optimize the EFGNN model, we carefully design a joint learning objective composed of evidence cross-entropy, dissonance coefficient, and false confident penalty. The experimental results on various datasets and theoretical analyses demonstrate the effectiveness of the proposed model in terms of accuracy and trustworthiness, as well as its robustness to potential attacks. The source code of EFGNN is available at https://github.com/Shiy-Li/EFGNN.
LGJan 23
kNN-Graph: An adaptive graph model for $k$-nearest neighborsJiaye Li, Gang Chen, Hang Xu et al.
The k-nearest neighbors (kNN) algorithm is a cornerstone of non-parametric classification in artificial intelligence, yet its deployment in large-scale applications is persistently constrained by the computational trade-off between inference speed and accuracy. Existing approximate nearest neighbor solutions accelerate retrieval but often degrade classification precision and lack adaptability in selecting the optimal neighborhood size (k). Here, we present an adaptive graph model that decouples inference latency from computational complexity. By integrating a Hierarchical Navigable Small World (HNSW) graph with a pre-computed voting mechanism, our framework completely transfers the computational burden of neighbor selection and weighting to the training phase. Within this topological structure, higher graph layers enable rapid navigation, while lower layers encode precise, node-specific decision boundaries with adaptive neighbor counts. Benchmarking against eight state-of-the-art baselines across six diverse datasets, we demonstrate that this architecture significantly accelerates inference speeds, achieving real-time performance, without compromising classification accuracy. These findings offer a scalable, robust solution to the long-standing inference bottleneck of kNN, establishing a new structural paradigm for graph-based nonparametric learning.
AIOct 18, 2025Code
Humanoid-inspired Causal Representation Learning for Domain GeneralizationZe Tao, Jian Zhang, Haowei Li et al.
This paper proposes the Humanoid-inspired Structural Causal Model (HSCM), a novel causal framework inspired by human intelligence, designed to overcome the limitations of conventional domain generalization models. Unlike approaches that rely on statistics to capture data-label dependencies and learn distortion-invariant representations, HSCM replicates the hierarchical processing and multi-level learning of human vision systems, focusing on modeling fine-grained causal mechanisms. By disentangling and reweighting key image attributes such as color, texture, and shape, HSCM enhances generalization across diverse domains, ensuring robust performance and interpretability. Leveraging the flexibility and adaptability of human intelligence, our approach enables more effective transfer and learning in dynamic, complex environments. Through both theoretical and empirical evaluations, we demonstrate that HSCM outperforms existing domain generalization models, providing a more principled method for capturing causal relationships and improving model robustness. The code is available at https://github.com/lambett/HSCM.
CVMar 2
Dehallu3D: Hallucination-Mitigated 3D Generation from Single Image via Cyclic View Consistency RefinementXiwen Wang, Shichao Zhang, Hailun Zhang et al.
Large 3D reconstruction models have revolutionized the 3D content generation field, enabling broad applications in virtual reality and gaming. Just like other large models, large 3D reconstruction models suffer from hallucinations as well, introducing structural outliers (e.g., odd holes or protrusions) that deviate from the input data. However, unlike other large models, hallucinations in large 3D reconstruction models remain severely underexplored, leading to malformed 3D-printed objects or insufficient immersion in virtual scenes. Such hallucinations majorly originate from that existing methods reconstruct 3D content from sparsely generated multi-view images which suffer from large viewpoint gaps and discontinuities. To mitigate hallucinations by eliminating the outliers, we propose Dehallu3D for 3D mesh generation. Our key idea is to design a balanced multi-view continuity constraint to enforce smooth transitions across dense intermediate viewpoints, while avoiding over-smoothing that could erase sharp geometric features. Therefore, Dehallu3D employs a plug-and-play optimization module with two key constraints: (i) adjacent consistency to ensure geometric continuity across views, and (ii) adaptive smoothness to retain fine details.We further propose the Outlier Risk Measure (ORM) metric to quantify geometric fidelity in 3D generation from the perspective of outliers. Extensive experiments show that Dehallu3D achieves high-fidelity 3D generation by effectively preserving structural details while removing hallucinated outliers.
CVJan 19, 2025
Generative Physical AI in Vision: A SurveyDaochang Liu, Junyu Zhang, Anh-Dung Dinh et al.
Generative Artificial Intelligence (AI) has rapidly advanced the field of computer vision by enabling machines to create and interpret visual data with unprecedented sophistication. This transformation builds upon a foundation of generative models to produce realistic images, videos, and 3D/4D content. Conventional generative models primarily focus on visual fidelity while often neglecting the physical plausibility of the generated content. This gap limits their effectiveness in applications that require adherence to real-world physical laws, such as robotics, autonomous systems, and scientific simulations. As generative models evolve to increasingly integrate physical realism and dynamic simulation, their potential to function as "world simulators" expands. Therefore, the field of physics-aware generation in computer vision is rapidly growing, calling for a comprehensive survey to provide a structured analysis of current efforts. To serve this purpose, the survey presents a systematic review, categorizing methods based on how they incorporate physical knowledge, either through explicit simulation or implicit learning. It also analyzes key paradigms, discusses evaluation protocols, and identifies future research directions. By offering a comprehensive overview, this survey aims to help future developments in physically grounded generation for computer vision. The reviewed papers are summarized at https://tinyurl.com/Physics-Aware-Generation.
CVFeb 1, 2025
Efficient Adaptive Label Refinement for Label Noise LearningWenzhen Zhang, Debo Cheng, Guangquan Lu et al.
Deep neural networks are highly susceptible to overfitting noisy labels, which leads to degraded performance. Existing methods address this issue by employing manually defined criteria, aiming to achieve optimal partitioning in each iteration to avoid fitting noisy labels while thoroughly learning clean samples. However, this often results in overly complex and difficult-to-train models. To address this issue, we decouple the tasks of avoiding fitting incorrect labels and thoroughly learning clean samples and propose a simple yet highly applicable method called Adaptive Label Refinement (ALR). First, inspired by label refurbishment techniques, we update the original hard labels to soft labels using the model's predictions to reduce the risk of fitting incorrect labels. Then, by introducing the entropy loss, we gradually `harden' the high-confidence soft labels, guiding the model to better learn from clean samples. This approach is simple and efficient, requiring no prior knowledge of noise or auxiliary datasets, making it more accessible compared to existing methods. We validate ALR's effectiveness through experiments on benchmark datasets with artificial label noise (CIFAR-10/100) and real-world datasets with inherent noise (ANIMAL-10N, Clothing1M, WebVision). The results show that ALR outperforms state-of-the-art methods.
CVNov 13, 2024
UIFormer: A Unified Transformer-based Framework for Incremental Few-Shot Object Detection and Instance SegmentationChengyuan Zhang, Yilin Zhang, Lei Zhu et al.
This paper introduces a novel framework for unified incremental few-shot object detection (iFSOD) and instance segmentation (iFSIS) using the Transformer architecture. Our goal is to create an optimal solution for situations where only a few examples of novel object classes are available, with no access to training data for base or old classes, while maintaining high performance across both base and novel classes. To achieve this, We extend Mask-DINO into a two-stage incremental learning framework. Stage 1 focuses on optimizing the model using the base dataset, while Stage 2 involves fine-tuning the model on novel classes. Besides, we incorporate a classifier selection strategy that assigns appropriate classifiers to the encoder and decoder according to their distinct functions. Empirical evidence indicates that this approach effectively mitigates the over-fitting on novel classes learning. Furthermore, we implement knowledge distillation to prevent catastrophic forgetting of base classes. Comprehensive evaluations on the COCO and LVIS datasets for both iFSIS and iFSOD tasks demonstrate that our method significantly outperforms state-of-the-art approaches.
CVJun 3, 2025
Application of convolutional neural networks in image super-resolutionChunwei Tian, Mingjian Song, Wangmeng Zuo et al.
Due to strong learning abilities of convolutional neural networks (CNNs), they have become mainstream methods for image super-resolution. However, there are big differences of different deep learning methods with different types. There is little literature to summarize relations and differences of different methods in image super-resolution. Thus, summarizing these literatures are important, according to loading capacity and execution speed of devices. This paper first introduces principles of CNNs in image super-resolution, then introduces CNNs based bicubic interpolation, nearest neighbor interpolation, bilinear interpolation, transposed convolution, sub-pixel layer, meta up-sampling for image super-resolution to analyze differences and relations of different CNNs based interpolations and modules, and compare performance of these methods by experiments. Finally, this paper gives potential research points and drawbacks and summarizes the whole paper, which can facilitate developments of CNNs in image super-resolution.
LGNov 30, 2024
Toward Fair Graph Neural Networks Via Dual-Teacher Knowledge DistillationChengyu Li, Debo Cheng, Guixian Zhang et al.
Graph Neural Networks (GNNs) have demonstrated strong performance in graph representation learning across various real-world applications. However, they often produce biased predictions caused by sensitive attributes, such as religion or gender, an issue that has been largely overlooked in existing methods. Recently, numerous studies have focused on reducing biases in GNNs. However, these approaches often rely on training with partial data (e.g., using either node features or graph structure alone), which can enhance fairness but frequently compromises model utility due to the limited utilization of available graph information. To address this tradeoff, we propose an effective strategy to balance fairness and utility in knowledge distillation. Specifically, we introduce FairDTD, a novel Fair representation learning framework built on Dual-Teacher Distillation, leveraging a causal graph model to guide and optimize the design of the distillation process. Specifically, FairDTD employs two fairness-oriented teacher models: a feature teacher and a structure teacher, to facilitate dual distillation, with the student model learning fairness knowledge from the teachers while also leveraging full data to mitigate utility loss. To enhance information transfer, we incorporate graph-level distillation to provide an indirect supplement of graph information during training, as well as a node-specific temperature module to improve the comprehensive transfer of fair knowledge. Experiments on diverse benchmark datasets demonstrate that FairDTD achieves optimal fairness while preserving high model utility, showcasing its effectiveness in fair representation learning for GNNs.
LGNov 26, 2024
Leaning Time-Varying Instruments for Identifying Causal Effects in Time-Series DataDebo Cheng, Ziqi Xu, Jiuyong Li et al.
Querying causal effects from time-series data is important across various fields, including healthcare, economics, climate science, and epidemiology. However, this task becomes complex in the existence of time-varying latent confounders, which affect both treatment and outcome variables over time and can introduce bias in causal effect estimation. Traditional instrumental variable (IV) methods are limited in addressing such complexities due to the need for predefined IVs or strong assumptions that do not hold in dynamic settings. To tackle these issues, we develop a novel Time-varying Conditional Instrumental Variables (CIV) for Debiasing causal effect estimation, referred to as TDCIV. TDCIV leverages Long Short-Term Memory (LSTM) and Variational Autoencoder (VAE) models to disentangle and learn the representations of time-varying CIV and its conditioning set from proxy variables without prior knowledge. Under the assumptions of the Markov property and availability of proxy variables, we theoretically establish the validity of these learned representations for addressing the biases from time-varying latent confounders, thus enabling accurate causal effect estimation. Our proposed TDCIV is the first to effectively learn time-varying CIV and its associated conditioning set without relying on domain-specific knowledge.
SIOct 15, 2024
Towards Fair Graph Representation Learning in Social NetworksGuixian Zhang, Guan Yuan, Debo Cheng et al.
With the widespread use of Graph Neural Networks (GNNs) for representation learning from network data, the fairness of GNN models has raised great attention lately. Fair GNNs aim to ensure that node representations can be accurately classified, but not easily associated with a specific group. Existing advanced approaches essentially enhance the generalisation of node representation in combination with data augmentation strategy, and do not directly impose constraints on the fairness of GNNs. In this work, we identify that a fundamental reason for the unfairness of GNNs in social network learning is the phenomenon of social homophily, i.e., users in the same group are more inclined to congregate. The message-passing mechanism of GNNs can cause users in the same group to have similar representations due to social homophily, leading model predictions to establish spurious correlations with sensitive attributes. Inspired by this reason, we propose a method called Equity-Aware GNN (EAGNN) towards fair graph representation learning. Specifically, to ensure that model predictions are independent of sensitive attributes while maintaining prediction performance, we introduce constraints for fair representation learning based on three principles: sufficiency, independence, and separation. We theoretically demonstrate that our EAGNN method can effectively achieve group fairness. Extensive experiments on three datasets with varying levels of social homophily illustrate that our EAGNN method achieves the state-of-the-art performance across two fairness metrics and offers competitive effectiveness.
IRJan 28, 2022
Hyper-Class Representation of DataShichao Zhang, Jiaye Li, Wenzhen Zhang et al.
Data representation is usually a natural form with their attribute values. On this basis, data processing is an attribute-centered calculation. However, there are three limitations in the attribute-centered calculation, saying, inflexible calculation, preference computation, and unsatisfactory output. To attempt the issues, a new data representation, named as hyper-classes representation, is proposed for improving recommendation. First, the cross entropy, KL divergence and JS divergence of features in data are defined. And then, the hyper-classes in data can be discovered with these three parameters. Finally, a kind of recommendation algorithm is used to evaluate the proposed hyper-class representation of data, and shows that the hyper-class representation is able to provide truly useful reference information for recommendation systems and makes recommendations much better than existing algorithms, i.e., this approach is efficient and promising.
LGMar 17, 2021
Reachable Distance Function for KNN ClassificationShichao Zhang, Jiaye Li, Yangding Li
Distance function is a main metrics of measuring the affinity between two data points in machine learning. Extant distance functions often provide unreachable distance values in real applications. This can lead to incorrect measure of the affinity between data points. This paper proposes a reachable distance function for KNN classification. The reachable distance function is not a geometric direct-line distance between two data points. It gives a consideration to the class attribute of a training dataset when measuring the affinity between data points. Concretely speaking, the reachable distance between data points includes their class center distance and real distance. Its shape looks like "Z", and we also call it a Z distance function. In this way, the affinity between data points in the same class is always stronger than that in different classes. Or, the intraclass data points are always closer than those interclass data points. We evaluated the reachable distance with experiments, and demonstrated that the proposed distance function achieved better performance in KNN classification.
LGDec 9, 2020
KNN Classification with One-step ComputationShichao Zhang, Jiaye Li
KNN classification is an improvisational learning mode, in which they are carried out only when a test data is predicted that set a suitable K value and search the K nearest neighbors from the whole training sample space, referred them to the lazy part of KNN classification. This lazy part has been the bottleneck problem of applying KNN classification due to the complete search of K nearest neighbors. In this paper, a one-step computation is proposed to replace the lazy part of KNN classification. The one-step computation actually transforms the lazy part to a matrix computation as follows. Given a test data, training samples are first applied to fit the test data with the least squares loss function. And then, a relationship matrix is generated by weighting all training samples according to their influence on the test data. Finally, a group lasso is employed to perform sparse learning of the relationship matrix. In this way, setting K value and searching K nearest neighbors are both integrated to a unified computation. In addition, a new classification rule is proposed for improving the performance of one-step KNN classification. The proposed approach is experimentally evaluated, and demonstrated that the one-step KNN classification is efficient and promising
CVJun 5, 2019
PAC-GAN: An Effective Pose Augmentation Scheme for Unsupervised Cross-View Person Re-identificationChengyuan Zhang, Lei Zhu, Shichao Zhang
Person re-identification (person Re-Id) aims to retrieve the pedestrian images of a same person that captured by disjoint and non-overlapping cameras. Lots of researchers recently focuse on this hot issue and propose deep learning based methods to enhance the recognition rate in a supervised or unsupervised manner. However, two limitations that cannot be ignored: firstly, compared with other image retrieval benchmarks, the size of existing person Re-Id datasets are far from meeting the requirement, which cannot provide sufficient pedestrian samples for the training of deep model; secondly, the samples in existing datasets do not have sufficient human motions or postures coverage to provide more priori knowledges for learning. In this paper, we introduce a novel unsupervised pose augmentation cross-view person Re-Id scheme called PAC-GAN to overcome these limitations. We firstly present the formal definition of cross-view pose augmentation and then propose the framework of PAC-GAN that is a novel conditional generative adversarial network (CGAN) based approach to improve the performance of unsupervised corss-view person Re-Id. Specifically, The pose generation model in PAC-GAN called CPG-Net is to generate enough quantity of pose-rich samples from original image and skeleton samples. The pose augmentation dataset is produced by combining the synthesized pose-rich samples with the original samples, which is fed into the corss-view person Re-Id model named Cross-GAN. Besides, we use weight-sharing strategy in the CPG-Net to improve the quality of new generated samples. To the best of our knowledge, we are the first try to enhance the unsupervised cross-view person Re-Id by pose augmentation, and the results of extensive experiments show that the proposed scheme can combat the state-of-the-arts.