Jiawei Chen

IR
h-index29
29papers
2,310citations
Novelty49%
AI Score46

29 Papers

20.6ASMay 30, 2022Code
BinauralGrad: A Two-Stage Conditional Diffusion Probabilistic Model for Binaural Audio Synthesis

Yichong Leng, Zehua Chen, Junliang Guo et al. · microsoft-research

Binaural audio plays a significant role in constructing immersive augmented and virtual realities. As it is expensive to record binaural audio from the real world, synthesizing them from mono audio has attracted increasing attention. This synthesis process involves not only the basic physical warping of the mono audio, but also room reverberations and head/ear related filtrations, which, however, are difficult to accurately simulate in traditional digital signal processing. In this paper, we formulate the synthesis process from a different perspective by decomposing the binaural audio into a common part that shared by the left and right channels as well as a specific part that differs in each channel. Accordingly, we propose BinauralGrad, a novel two-stage framework equipped with diffusion models to synthesize them respectively. Specifically, in the first stage, the common information of the binaural audio is generated with a single-channel diffusion model conditioned on the mono audio, based on which the binaural audio is generated by a two-channel diffusion model in the second stage. Combining this novel perspective of two-stage synthesis with advanced generative models (i.e., the diffusion models),the proposed BinauralGrad is able to generate accurate and high-fidelity binaural audio samples. Experiment results show that on a benchmark dataset, BinauralGrad outperforms the existing baselines by a large margin in terms of both object and subject evaluation metrics (Wave L2: 0.128 vs. 0.157, MOS: 3.80 vs. 3.61). The generated audio samples (https://speechresearch.github.io/binauralgrad) and code (https://github.com/microsoft/NeuralSpeech/tree/master/BinauralGrad) are available online.

34.9CVSep 21, 2023Code
How Robust is Google's Bard to Adversarial Image Attacks?

Yinpeng Dong, Huanran Chen, Jiawei Chen et al.

Multimodal Large Language Models (MLLMs) that integrate text and other modalities (especially vision) have achieved unprecedented performance in various multimodal tasks. However, due to the unsolved adversarial robustness problem of vision models, MLLMs can have more severe safety and security risks by introducing the vision inputs. In this work, we study the adversarial robustness of Google's Bard, a competitive chatbot to ChatGPT that released its multimodal capability recently, to better understand the vulnerabilities of commercial MLLMs. By attacking white-box surrogate vision encoders or MLLMs, the generated adversarial examples can mislead Bard to output wrong image descriptions with a 22% success rate based solely on the transferability. We show that the adversarial examples can also attack other MLLMs, e.g., a 26% attack success rate against Bing Chat and a 86% attack success rate against ERNIE bot. Moreover, we identify two defense mechanisms of Bard, including face detection and toxicity detection of images. We design corresponding attacks to evade these defenses, demonstrating that the current defenses of Bard are also vulnerable. We hope this work can deepen our understanding on the robustness of MLLMs and facilitate future research on defenses. Our code is available at https://github.com/thu-ml/Attack-Bard. Update: GPT-4V is available at October 2023. We further evaluate its robustness under the same set of adversarial examples, achieving a 45% attack success rate.

28.9LGJun 15, 2022Code
A Comprehensive Survey on Deep Clustering: Taxonomy, Challenges, and Future Directions

Sheng Zhou, Hongjia Xu, Zhuonan Zheng et al.

Clustering is a fundamental machine learning task which has been widely studied in the literature. Classic clustering methods follow the assumption that data are represented as features in a vectorized form through various representation learning techniques. As the data become increasingly complicated and complex, the shallow (traditional) clustering methods can no longer handle the high-dimensional data type. With the huge success of deep learning, especially the deep unsupervised learning, many representation learning techniques with deep architectures have been proposed in the past decade. Recently, the concept of Deep Clustering, i.e., jointly optimizing the representation learning and clustering, has been proposed and hence attracted growing attention in the community. Motivated by the tremendous success of deep learning in clustering, one of the most fundamental machine learning tasks, and the large number of recent advances in this direction, in this paper we conduct a comprehensive survey on deep clustering by proposing a new taxonomy of different state-of-the-art approaches. We summarize the essential components of deep clustering and categorize existing methods by the ways they design interactions between deep representation learning and clustering. Moreover, this survey also provides the popular benchmark datasets, evaluation metrics and open-source implementations to clearly illustrate various experimental settings. Last but not least, we discuss the practical applications of deep clustering and suggest challenging topics deserving further investigations as future directions.

15.9IRFeb 9, 2023Code
Adap-$τ$: Adaptively Modulating Embedding Magnitude for Recommendation

Jiawei Chen, Junkang Wu, Jiancan Wu et al.

Recent years have witnessed the great successes of embedding-based methods in recommender systems. Despite their decent performance, we argue one potential limitation of these methods -- the embedding magnitude has not been explicitly modulated, which may aggravate popularity bias and training instability, hindering the model from making a good recommendation. It motivates us to leverage the embedding normalization in recommendation. By normalizing user/item embeddings to a specific value, we empirically observe impressive performance gains (9\% on average) on four real-world datasets. Although encouraging, we also reveal a serious limitation when applying normalization in recommendation -- the performance is highly sensitive to the choice of the temperature $τ$ which controls the scale of the normalized embeddings. To fully foster the merits of the normalization while circumvent its limitation, this work studied on how to adaptively set the proper $τ$. Towards this end, we first make a comprehensive analyses of $τ$ to fully understand its role on recommendation. We then accordingly develop an adaptive fine-grained strategy Adap-$τ$ for the temperature with satisfying four desirable properties including adaptivity, personalized, efficiency and model-agnostic. Extensive experiments have been conducted to validate the effectiveness of the proposal. The code is available at \url{https://github.com/junkangwu/Adap_tau}.

14.2IRNov 27, 2022Code
Unbiased Knowledge Distillation for Recommendation

Gang Chen, Jiawei Chen, Fuli Feng et al.

As a promising solution for model compression, knowledge distillation (KD) has been applied in recommender systems (RS) to reduce inference latency. Traditional solutions first train a full teacher model from the training data, and then transfer its knowledge (\ie \textit{soft labels}) to supervise the learning of a compact student model. However, we find such a standard distillation paradigm would incur serious bias issue -- popular items are more heavily recommended after the distillation. This effect prevents the student model from making accurate and fair recommendations, decreasing the effectiveness of RS. In this work, we identify the origin of the bias in KD -- it roots in the biased soft labels from the teacher, and is further propagated and intensified during the distillation. To rectify this, we propose a new KD method with a stratified distillation strategy. It first partitions items into multiple groups according to their popularity, and then extracts the ranking knowledge within each group to supervise the learning of the student. Our method is simple and teacher-agnostic -- it works on distillation stage without affecting the training of the teacher model. We conduct extensive theoretical and empirical studies to validate the effectiveness of our proposal. We release our code at: https://github.com/chengang95/UnKD.

23.7LGJun 17, 2023Code
OpenGSL: A Comprehensive Benchmark for Graph Structure Learning

Zhiyao Zhou, Sheng Zhou, Bochao Mao et al.

Graph Neural Networks (GNNs) have emerged as the de facto standard for representation learning on graphs, owing to their ability to effectively integrate graph topology and node attributes. However, the inherent suboptimal nature of node connections, resulting from the complex and contingent formation process of graphs, presents significant challenges in modeling them effectively. To tackle this issue, Graph Structure Learning (GSL), a family of data-centric learning approaches, has garnered substantial attention in recent years. The core concept behind GSL is to jointly optimize the graph structure and the corresponding GNN models. Despite the proposal of numerous GSL methods, the progress in this field remains unclear due to inconsistent experimental protocols, including variations in datasets, data processing techniques, and splitting strategies. In this paper, we introduce OpenGSL, the first comprehensive benchmark for GSL, aimed at addressing this gap. OpenGSL enables a fair comparison among state-of-the-art GSL methods by evaluating them across various popular datasets using uniform data processing and splitting strategies. Through extensive experiments, we observe that existing GSL methods do not consistently outperform vanilla GNN counterparts. We also find that there is no significant correlation between the homophily of the learned structure and task performance, challenging the common belief. Moreover, we observe that the learned graph structure demonstrates a strong generalization ability across different GNN models, despite the high computational and space consumption. We hope that our open-sourced library will facilitate rapid and equitable evaluation and inspire further innovative research in this field. The code of the benchmark can be found in https://github.com/OpenGSL/OpenGSL.

17.7IRFeb 7, 2023Code
On the Theories Behind Hard Negative Sampling for Recommendation

Wentao Shi, Jiawei Chen, Fuli Feng et al.

Negative sampling has been heavily used to train recommender models on large-scale data, wherein sampling hard examples usually not only accelerates the convergence but also improves the model accuracy. Nevertheless, the reasons for the effectiveness of Hard Negative Sampling (HNS) have not been revealed yet. In this work, we fill the research gap by conducting thorough theoretical analyses on HNS. Firstly, we prove that employing HNS on the Bayesian Personalized Ranking (BPR) learner is equivalent to optimizing One-way Partial AUC (OPAUC). Concretely, the BPR equipped with Dynamic Negative Sampling (DNS) is an exact estimator, while with softmax-based sampling is a soft estimator. Secondly, we prove that OPAUC has a stronger connection with Top-K evaluation metrics than AUC and verify it with simulation experiments. These analyses establish the theoretical foundation of HNS in optimizing Top-K recommendation performance for the first time. On these bases, we offer two insightful guidelines for effective usage of HNS: 1) the sampling hardness should be controllable, e.g., via pre-defined hyper-parameters, to adapt to different Top-K metrics and datasets; 2) the smaller the $K$ we emphasize in Top-K evaluation metrics, the harder the negative samples we should draw. Extensive experiments on three real-world benchmarks verify the two guidelines.

3.8LGFeb 7, 2023
FFHR: Fully and Flexible Hyperbolic Representation for Knowledge Graph Completion

Wentao Shi, Junkang Wu, Xuezhi Cao et al.

Learning hyperbolic embeddings for knowledge graph (KG) has gained increasing attention due to its superiority in capturing hierarchies. However, some important operations in hyperbolic space still lack good definitions, making existing methods unable to fully leverage the merits of hyperbolic space. Specifically, they suffer from two main limitations: 1) existing Graph Convolutional Network (GCN) methods in hyperbolic space rely on tangent space approximation, which would incur approximation error in representation learning, and 2) due to the lack of inner product operation definition in hyperbolic space, existing methods can only measure the plausibility of facts (links) with hyperbolic distance, which is difficult to capture complex data patterns. In this work, we contribute: 1) a Full Poincaré Multi-relational GCN that achieves graph information propagation in hyperbolic space without requiring any approximation, and 2) a hyperbolic generalization of Euclidean inner product that is beneficial to capture both hierarchical and complex patterns. On this basis, we further develop a \textbf{F}ully and \textbf{F}lexible \textbf{H}yperbolic \textbf{R}epresentation framework (\textbf{FFHR}) that is able to transfer recent Euclidean-based advances to hyperbolic space. We demonstrate it by instantiating FFHR with four representative KGC methods. Extensive experiments on benchmark datasets validate the superiority of our FFHRs over their Euclidean counterparts as well as state-of-the-art hyperbolic embedding methods.

13.7LGAug 10, 2023Code
Homophily-enhanced Structure Learning for Graph Clustering

Ming Gu, Gaoming Yang, Sheng Zhou et al.

Graph clustering is a fundamental task in graph analysis, and recent advances in utilizing graph neural networks (GNNs) have shown impressive results. Despite the success of existing GNN-based graph clustering methods, they often overlook the quality of graph structure, which is inherent in real-world graphs due to their sparse and multifarious nature, leading to subpar performance. Graph structure learning allows refining the input graph by adding missing links and removing spurious connections. However, previous endeavors in graph structure learning have predominantly centered around supervised settings, and cannot be directly applied to our specific clustering tasks due to the absence of ground-truth labels. To bridge the gap, we propose a novel method called \textbf{ho}mophily-enhanced structure \textbf{le}arning for graph clustering (HoLe). Our motivation stems from the observation that subtly enhancing the degree of homophily within the graph structure can significantly improve GNNs and clustering outcomes. To realize this objective, we develop two clustering-oriented structure learning modules, i.e., hierarchical correlation estimation and cluster-aware sparsification. The former module enables a more accurate estimation of pairwise node relationships by leveraging guidance from latent and clustering spaces, while the latter one generates a sparsified structure based on the similarity matrix and clustering assignments. Additionally, we devise a joint optimization approach alternating between training the homophily-enhanced structure learning and GNN-based clustering, thereby enforcing their reciprocal effects. Extensive experiments on seven benchmark datasets of various types and scales, across a range of clustering metrics, demonstrate the superiority of HoLe against state-of-the-art baselines.

2.5AIDec 28, 2022
Robust Sequence Networked Submodular Maximization

Qihao Shi, Bingyang Fu, Can Wang et al.

In this paper, we study the \underline{R}obust \underline{o}ptimization for \underline{se}quence \underline{Net}worked \underline{s}ubmodular maximization (RoseNets) problem. We interweave the robust optimization with the sequence networked submodular maximization. The elements are connected by a directed acyclic graph and the objective function is not submodular on the elements but on the edges in the graph. Under such networked submodular scenario, the impact of removing an element from a sequence depends both on its position in the sequence and in the network. This makes the existing robust algorithms inapplicable. In this paper, we take the first step to study the RoseNets problem. We design a robust greedy algorithm, which is robust against the removal of an arbitrary subset of the selected elements. The approximation ratio of the algorithm depends both on the number of the removed elements and the network topology. We further conduct experiments on real applications of recommendation and link prediction. The experimental results demonstrate the effectiveness of the proposed algorithm.

8.8LGDec 20, 2023Code
BSL: Understanding and Improving Softmax Loss for Recommendation

Junkang Wu, Jiawei Chen, Jiancan Wu et al.

Loss functions steer the optimization direction of recommendation models and are critical to model performance, but have received relatively little attention in recent recommendation research. Among various losses, we find Softmax loss (SL) stands out for not only achieving remarkable accuracy but also better robustness and fairness. Nevertheless, the current literature lacks a comprehensive explanation for the efficacy of SL. Toward addressing this research gap, we conduct theoretical analyses on SL and uncover three insights: 1) Optimizing SL is equivalent to performing Distributionally Robust Optimization (DRO) on the negative data, thereby learning against perturbations on the negative distribution and yielding robustness to noisy negatives. 2) Comparing with other loss functions, SL implicitly penalizes the prediction variance, resulting in a smaller gap between predicted values and and thus producing fairer results. Building on these insights, we further propose a novel loss function Bilateral SoftMax Loss (BSL) that extends the advantage of SL to both positive and negative sides. BSL augments SL by applying the same Log-Expectation-Exp structure to positive examples as is used for negatives, making the model robust to the noisy positives as well. Remarkably, BSL is simple and easy-to-implement -- requiring just one additional line of code compared to SL. Experiments on four real-world datasets and three representative backbones demonstrate the effectiveness of our proposal. The code is available at https://github.com/junkangwu/BSL

12.5LGOct 31, 2024Code
PSL: Rethinking and Improving Softmax Loss from Pairwise Perspective for Recommendation

Weiqin Yang, Jiawei Chen, Xin Xin et al.

Softmax Loss (SL) is widely applied in recommender systems (RS) and has demonstrated effectiveness. This work analyzes SL from a pairwise perspective, revealing two significant limitations: 1) the relationship between SL and conventional ranking metrics like DCG is not sufficiently tight; 2) SL is highly sensitive to false negative instances. Our analysis indicates that these limitations are primarily due to the use of the exponential function. To address these issues, this work extends SL to a new family of loss functions, termed Pairwise Softmax Loss (PSL), which replaces the exponential function in SL with other appropriate activation functions. While the revision is minimal, we highlight three merits of PSL: 1) it serves as a tighter surrogate for DCG with suitable activation functions; 2) it better balances data contributions; and 3) it acts as a specific BPR loss enhanced by Distributionally Robust Optimization (DRO). We further validate the effectiveness and robustness of PSL through empirical experiments. The code is available at https://github.com/Tiny-Snow/IR-Benchmark.

11.8IRAug 4, 2025Code
Breaking the Top-$K$ Barrier: Advancing Top-$K$ Ranking Metrics Optimization in Recommender Systems

Weiqin Yang, Jiawei Chen, Shengjia Zhang et al.

In the realm of recommender systems (RS), Top-$K$ ranking metrics such as NDCG@$K$ are the gold standard for evaluating recommendation performance. However, during the training of recommendation models, optimizing NDCG@$K$ poses significant challenges due to its inherent discontinuous nature and the intricate Top-$K$ truncation. Recent efforts to optimize NDCG@$K$ have either overlooked the Top-$K$ truncation or suffered from high computational costs and training instability. To overcome these limitations, we propose SoftmaxLoss@$K$ (SL@$K$), a novel recommendation loss tailored for NDCG@$K$ optimization. Specifically, we integrate the quantile technique to handle Top-$K$ truncation and derive a smooth upper bound for optimizing NDCG@$K$ to address discontinuity. The resulting SL@$K$ loss has several desirable properties, including theoretical guarantees, ease of implementation, computational efficiency, gradient stability, and noise robustness. Extensive experiments on four real-world datasets and three recommendation backbones demonstrate that SL@$K$ outperforms existing losses with a notable average improvement of 6.03%. The code is available at https://github.com/Tiny-Snow/IR-Benchmark.

6.3IRMay 27, 2025Code
Bridging the Gap: Self-Optimized Fine-Tuning for LLM-based Recommender Systems

Heng Tang, Feng Liu, Xinbo Chen et al.

Recent years have witnessed extensive exploration of Large Language Models (LLMs) on the field of Recommender Systems (RS). There are currently two commonly used strategies to enable LLMs to have recommendation capabilities: 1) The "Guidance-Only" strategy uses in-context learning to exploit and amplify the inherent semantic understanding and item recommendation capabilities of LLMs; 2) The "Tuning-Only" strategy uses supervised fine-tuning (SFT) to fine-tune LLMs with the aim of fitting them to real recommendation data. However, neither of these strategies can effectively bridge the gap between the knowledge space of LLMs and recommendation, and their performance do not meet our expectations. To better enable LLMs to learn recommendation knowledge, we combine the advantages of the above two strategies and proposed a novel "Guidance+Tuning" method called Self-Optimized Fine-Tuning (SOFT), which adopts the idea of curriculum learning. It first employs self-distillation to construct an auxiliary easy-to-learn but meaningful dataset from a fine-tuned LLM. Then it further utilizes a self-adaptive curriculum scheduler to enable LLMs to gradually learn from simpler data (self-distilled data) to more challenging data (real RS data). Extensive experiments demonstrate that SOFT significantly enhances the recommendation accuracy (37.59\% on average) of LLM-based methods. The code is available via https://anonymous.4open.science/r/Self-Optimized-Fine-Tuning-264E

15.1CVAug 12, 2021Code
Distilling Holistic Knowledge with Graph Neural Networks

Sheng Zhou, Yucheng Wang, Defang Chen et al.

Knowledge Distillation (KD) aims at transferring knowledge from a larger well-optimized teacher network to a smaller learnable student network.Existing KD methods have mainly considered two types of knowledge, namely the individual knowledge and the relational knowledge. However, these two types of knowledge are usually modeled independently while the inherent correlations between them are largely ignored. It is critical for sufficient student network learning to integrate both individual knowledge and relational knowledge while reserving their inherent correlation. In this paper, we propose to distill the novel holistic knowledge based on an attributed graph constructed among instances. The holistic knowledge is represented as a unified graph-based embedding by aggregating individual knowledge from relational neighborhood samples with graph neural networks, the student network is learned by distilling the holistic knowledge in a contrastive manner. Extensive experiments and ablation studies are conducted on benchmark datasets, the results demonstrate the effectiveness of the proposed method. The code has been published in https://github.com/wyc-ruiker/HKD

21.9LGOct 23, 2020Code
On the Equivalence of Decoupled Graph Convolution Network and Label Propagation

Hande Dong, Jiawei Chen, Fuli Feng et al.

The original design of Graph Convolution Network (GCN) couples feature transformation and neighborhood aggregation for node representation learning. Recently, some work shows that coupling is inferior to decoupling, which supports deep graph propagation better and has become the latest paradigm of GCN (e.g., APPNP and SGCN). Despite effectiveness, the working mechanisms of the decoupled GCN are not well understood. In this paper, we explore the decoupled GCN for semi-supervised node classification from a novel and fundamental perspective -- label propagation. We conduct thorough theoretical analyses, proving that the decoupled GCN is essentially the same as the two-step label propagation: first, propagating the known labels along the graph to generate pseudo-labels for the unlabeled nodes, and second, training normal neural network classifiers on the augmented pseudo-labeled data. More interestingly, we reveal the effectiveness of decoupled GCN: going beyond the conventional label propagation, it could automatically assign structure- and model- aware weights to the pseudo-label data. This explains why the decoupled GCN is relatively robust to the structure noise and over-smoothing, but sensitive to the label noise and model initialization. Based on this insight, we propose a new label propagation method named Propagation then Training Adaptively (PTA), which overcomes the flaws of the decoupled GCN with a dynamic and adaptive weighting strategy. Our PTA is simple yet more effective and robust than decoupled GCN. We empirically validate our findings on four benchmark datasets, demonstrating the advantages of our method. The code is available at https://github.com/DongHande/PT_propagation_then_training.

41.2IROct 7, 2020Code
Bias and Debias in Recommender System: A Survey and Future Directions

Jiawei Chen, Hande Dong, Xiang Wang et al.

While recent years have witnessed a rapid growth of research papers on recommender system (RS), most of the papers focus on inventing machine learning models to better fit user behavior data. However, user behavior data is observational rather than experimental. This makes various biases widely exist in the data, including but not limited to selection bias, position bias, exposure bias, and popularity bias. Blindly fitting the data without considering the inherent biases will result in many serious issues, e.g., the discrepancy between offline evaluation and online metrics, hurting user satisfaction and trust on the recommendation service, etc. To transform the large volume of research models into practical improvements, it is highly urgent to explore the impacts of the biases and perform debiasing when necessary. When reviewing the papers that consider biases in RS, we find that, to our surprise, the studies are rather fragmented and lack a systematic organization. The terminology ``bias'' is widely used in the literature, but its definition is usually vague and even inconsistent across papers. This motivates us to provide a systematic survey of existing work on RS biases. In this paper, we first summarize seven types of biases in recommendation, along with their definitions and characteristics. We then provide a taxonomy to position and organize the existing work on recommendation debiasing. Finally, we identify some open challenges and envision some future directions, with the hope of inspiring more research work on this important yet less investigated topic. The summary of debiasing methods reviewed in this survey can be found at \url{https://github.com/jiawei-chen/RecDebiasing}.

9.2IRFeb 23, 2024Code
EasyRL4Rec: An Easy-to-use Library for Reinforcement Learning Based Recommender Systems

Yuanqing Yu, Chongming Gao, Jiawei Chen et al.

Reinforcement Learning (RL)-Based Recommender Systems (RSs) have gained rising attention for their potential to enhance long-term user engagement. However, research in this field faces challenges, including the lack of user-friendly frameworks, inconsistent evaluation metrics, and difficulties in reproducing existing studies. To tackle these issues, we introduce EasyRL4Rec, an easy-to-use code library designed specifically for RL-based RSs. This library provides lightweight and diverse RL environments based on five public datasets and includes core modules with rich options, simplifying model development. It provides unified evaluation standards focusing on long-term outcomes and offers tailored designs for state modeling and action representation for recommendation scenarios. Furthermore, we share our findings from insightful experiments with current methods. EasyRL4Rec seeks to facilitate the model development and experimental process in the domain of RL-based RSs. The library is available for public use.

3.7CVNov 14, 2024
Prompting the Unseen: Detecting Hidden Backdoors in Black-Box Models

Zi-Xuan Huang, Jia-Wei Chen, Zhi-Peng Zhang et al.

Visual prompting (VP) is a new technique that adapts well-trained frozen models for source domain tasks to target domain tasks. This study examines VP's benefits for black-box model-level backdoor detection. The visual prompt in VP maps class subspaces between source and target domains. We identify a misalignment, termed class subspace inconsistency, between clean and poisoned datasets. Based on this, we introduce \textsc{BProm}, a black-box model-level detection method to identify backdoors in suspicious models, if any. \textsc{BProm} leverages the low classification accuracy of prompted models when backdoors are present. Extensive experiments confirm \textsc{BProm}'s effectiveness.

4.6LGJun 13, 2024Code
Motif-driven Subgraph Structure Learning for Graph Classification

Zhiyao Zhou, Sheng Zhou, Bochao Mao et al.

To mitigate the suboptimal nature of graph structure, Graph Structure Learning (GSL) has emerged as a promising approach to improve graph structure and boost performance in downstream tasks. Despite the proposal of numerous GSL methods, the progresses in this field mostly concentrated on node-level tasks, while graph-level tasks (e.g., graph classification) remain largely unexplored. Notably, applying node-level GSL to graph classification is non-trivial due to the lack of find-grained guidance for intricate structure learning. Inspired by the vital role of subgraph in graph classification, in this paper we explore the potential of subgraph structure learning for graph classification by tackling the challenges of key subgraph selection and structure optimization. We propose a novel Motif-driven Subgraph Structure Learning method for Graph Classification (MOSGSL). Specifically, MOSGSL incorporates a subgraph structure learning module which can adaptively select important subgraphs. A motif-driven structure guidance module is further introduced to capture key subgraph-level structural patterns (motifs) and facilitate personalized structure learning. Extensive experiments demonstrate a significant and consistent improvement over baselines, as well as its flexibility and generalizability for various backbones and learning procedures.

10.2AIJan 17, 2022
Neighboring Backdoor Attacks on Graph Convolutional Network

Liang Chen, Qibiao Peng, Jintang Li et al.

Backdoor attacks have been widely studied to hide the misclassification rules in the normal models, which are only activated when the model is aware of the specific inputs (i.e., the trigger). However, despite their success in the conventional Euclidean space, there are few studies of backdoor attacks on graph structured data. In this paper, we propose a new type of backdoor which is specific to graph data, called neighboring backdoor. Considering the discreteness of graph data, how to effectively design the triggers while retaining the model accuracy on the original task is the major challenge. To address such a challenge, we set the trigger as a single node, and the backdoor is activated when the trigger node is connected to the target node. To preserve the model accuracy, the model parameters are not allowed to be modified. Thus, when the trigger node is not connected, the model performs normally. Under these settings, in this work, we focus on generating the features of the trigger node. Two types of backdoors are proposed: (1) Linear Graph Convolution Backdoor which finds an approximation solution for the feature generation (can be viewed as an integer programming problem) by looking at the linear part of GCNs. (2) Variants of existing graph attacks. We extend current gradient-based attack methods to our backdoor attack scenario. Extensive experiments on two social networks and two citation networks datasets demonstrate that all proposed backdoors can achieve an almost 100\% attack success rate while having no impact on predictive accuracy.

19.2IRSep 16, 2021
Popularity Bias Is Not Always Evil: Disentangling Benign and Harmful Bias for Recommendation

Zihao Zhao, Jiawei Chen, Sheng Zhou et al.

Recommender system usually suffers from severe popularity bias -- the collected interaction data usually exhibits quite imbalanced or even long-tailed distribution over items. Such skewed distribution may result from the users' conformity to the group, which deviates from reflecting users' true preference. Existing efforts for tackling this issue mainly focus on completely eliminating popularity bias. However, we argue that not all popularity bias is evil. Popularity bias not only results from conformity but also item quality, which is usually ignored by existing methods. Some items exhibit higher popularity as they have intrinsic better property. Blindly removing the popularity bias would lose such important signal, and further deteriorate model performance. To sufficiently exploit such important information for recommendation, it is essential to disentangle the benign popularity bias caused by item quality from the harmful popularity bias caused by conformity. Although important, it is quite challenging as we lack an explicit signal to differentiate the two factors of popularity bias. In this paper, we propose to leverage temporal information as the two factors exhibit quite different patterns along the time: item quality revealing item inherent property is stable and static while conformity that depends on items' recent clicks is highly time-sensitive. Correspondingly, we further propose a novel Time-aware DisEntangled framework (TIDE), where a click is generated from three components namely the static item quality, the dynamic conformity effect, as well as the user-item matching score returned by any recommendation model. Lastly, we conduct interventional inference such that the recommendation can benefit from the benign popularity bias while circumvent the harmful one. Extensive experiments on three real-world datasets demonstrated the effectiveness of TIDE.

13.9IRAug 5, 2021
Time-aware Path Reasoning on Knowledge Graph for Recommendation

Yuyue Zhao, Xiang Wang, Jiawei Chen et al.

Reasoning on knowledge graph (KG) has been studied for explainable recommendation due to it's ability of providing explicit explanations. However, current KG-based explainable recommendation methods unfortunately ignore the temporal information (such as purchase time, recommend time, etc.), which may result in unsuitable explanations. In this work, we propose a novel Time-aware Path reasoning for Recommendation (TPRec for short) method, which leverages the potential of temporal information to offer better recommendation with plausible explanations. First, we present an efficient time-aware interaction relation extraction component to construct collaborative knowledge graph with time-aware interactions (TCKG for short), and then introduce a novel time-aware path reasoning method for recommendation. We conduct extensive experiments on three real-world datasets. The results demonstrate that the proposed TPRec could successfully employ TCKG to achieve substantial gains and improve the quality of explainable recommendation.

6.3IRMay 28, 2021
CausCF: Causal Collaborative Filtering for RecommendationEffect Estimation

Xu Xie, Zhaoyang Liu, Shiwen Wu et al.

To improve user experience and profits of corporations, modern industrial recommender systems usually aim to select the items that are most likely to be interacted with (e.g., clicks and purchases). However, they overlook the fact that users may purchase the items even without recommendations. To select these effective items, it is essential to estimate the causal effect of recommendations. The real effective items are the ones which can contribute to purchase probability uplift. Nevertheless, it is difficult to obtain the real causal effect since we can only recommend or not recommend an item to a user at one time. Furthermore, previous works usually rely on the randomized controlled trial~(RCT) experiment to evaluate their performance. However, it is usually not practicable in the recommendation scenario due to its unavailable time consuming. To tackle these problems, in this paper, we propose a causal collaborative filtering~(CausCF) method inspired by the widely adopted collaborative filtering~(CF) technique. It is based on the idea that similar users not only have a similar taste on items, but also have similar treatment effect under recommendations. CausCF extends the classical matrix factorization to the tensor factorization with three dimensions -- user, item, and treatment. Furthermore, we also employs regression discontinuity design (RDD) to evaluate the precision of the estimated causal effects from different models. With the testable assumptions, RDD analysis can provide an unbiased causal conclusion without RCT experiments. Through dedicated experiments on both the public datasets and the industrial application, we demonstrate the effectiveness of our proposed CausCF on the causal effect estimation and ranking performance improvement.

6.4IRNov 16, 2020
CoSam: An Efficient Collaborative Adaptive Sampler for Recommendation

Jiawei Chen, Chengquan Jiang, Can Wang et al.

Sampling strategies have been widely applied in many recommendation systems to accelerate model learning from implicit feedback data. A typical strategy is to draw negative instances with uniform distribution, which however will severely affect model's convergency, stability, and even recommendation accuracy. A promising solution for this problem is to over-sample the ``difficult'' (a.k.a informative) instances that contribute more on training. But this will increase the risk of biasing the model and leading to non-optimal results. Moreover, existing samplers are either heuristic, which require domain knowledge and often fail to capture real ``difficult'' instances; or rely on a sampler model that suffers from low efficiency. To deal with these problems, we propose an efficient and effective collaborative sampling method CoSam, which consists of: (1) a collaborative sampler model that explicitly leverages user-item interaction information in sampling probability and exhibits good properties of normalization, adaption, interaction information awareness, and sampling efficiency; and (2) an integrated sampler-recommender framework, leveraging the sampler model in prediction to offset the bias caused by uneven sampling. Correspondingly, we derive a fast reinforced training algorithm of our framework to boost the sampler performance and sampler-recommender collaboration. Extensive experiments on four real-world datasets demonstrate the superiority of the proposed collaborative sampler model and integrated sampler-recommender framework.

6.4IRNov 16, 2020Code
SamWalker++: recommendation with informative sampling strategy

Can Wang, Jiawei Chen, Sheng Zhou et al.

Recommendation from implicit feedback is a highly challenging task due to the lack of reliable negative feedback data. Existing methods address this challenge by treating all the un-observed data as negative (dislike) but downweight the confidence of these data. However, this treatment causes two problems: (1) Confidence weights of the unobserved data are usually assigned manually, which lack flexibility and may create empirical bias on evaluating user's preference. (2) To handle massive volume of the unobserved feedback data, most of the existing methods rely on stochastic inference and data sampling strategies. However, since a user is only aware of a very small fraction of items in a large dataset, it is difficult for existing samplers to select informative training instances in which the user really dislikes the item rather than does not know it. To address the above two problems, we propose two novel recommendation methods SamWalker and SamWalker++ that support both adaptive confidence assignment and efficient model learning. SamWalker models data confidence with a social network-aware function, which can adaptively specify different weights to different data according to users' social contexts. However, the social network information may not be available in many recommender systems, which hinders application of SamWalker. Thus, we further propose SamWalker++, which does not require any side information and models data confidence with a constructed pseudo-social network. We also develop fast random-walk-based sampling strategies for our SamWalker and SamWalker++ to adaptively draw informative training instances, which can speed up gradient estimation and reduce sampling variance. Extensive experiments on five real-world datasets demonstrate the superiority of the proposed SamWalker and SamWalker++.

20.6ASSep 3, 2020Code
HiFiSinger: Towards High-Fidelity Neural Singing Voice Synthesis

Jiawei Chen, Xu Tan, Jian Luan et al.

High-fidelity singing voices usually require higher sampling rate (e.g., 48kHz) to convey expression and emotion. However, higher sampling rate causes the wider frequency band and longer waveform sequences and throws challenges for singing voice synthesis (SVS) in both frequency and time domains. Conventional SVS systems that adopt small sampling rate cannot well address the above challenges. In this paper, we develop HiFiSinger, an SVS system towards high-fidelity singing voice. HiFiSinger consists of a FastSpeech based acoustic model and a Parallel WaveGAN based vocoder to ensure fast training and inference and also high voice quality. To tackle the difficulty of singing modeling caused by high sampling rate (wider frequency band and longer waveform), we introduce multi-scale adversarial training in both the acoustic model and vocoder to improve singing modeling. Specifically, 1) To handle the larger range of frequencies caused by higher sampling rate, we propose a novel sub-frequency GAN (SF-GAN) on mel-spectrogram generation, which splits the full 80-dimensional mel-frequency into multiple sub-bands and models each sub-band with a separate discriminator. 2) To model longer waveform sequences caused by higher sampling rate, we propose a multi-length GAN (ML-GAN) for waveform generation to model different lengths of waveform sequences with separate discriminators. 3) We also introduce several additional designs and findings in HiFiSinger that are crucial for high-fidelity voices, such as adding F0 (pitch) and V/UV (voiced/unvoiced flag) as acoustic features, choosing an appropriate window/hop size for mel-spectrogram, and increasing the receptive field in vocoder for long vowel modeling. Experiment results show that HiFiSinger synthesizes high-fidelity singing voices with much higher quality: 0.32/0.44 MOS gain over 48kHz/24kHz baseline and 0.83 MOS gain over previous SVS systems.

16.1IRMar 4, 2020
Fast Adaptively Weighted Matrix Factorization for Recommendation with Implicit Feedback

Jiawei Chen, Can Wang, Sheng Zhou et al.

Recommendation from implicit feedback is a highly challenging task due to the lack of the reliable observed negative data. A popular and effective approach for implicit recommendation is to treat unobserved data as negative but downweight their confidence. Naturally, how to assign confidence weights and how to handle the large number of the unobserved data are two key problems for implicit recommendation models. However, existing methods either pursuit fast learning by manually assigning simple confidence weights, which lacks flexibility and may create empirical bias in evaluating user's preference; or adaptively infer personalized confidence weights but suffer from low efficiency. To achieve both adaptive weights assignment and efficient model learning, we propose a fast adaptively weighted matrix factorization (FAWMF) based on variational auto-encoder. The personalized data confidence weights are adaptively assigned with a parameterized neural network (function) and the network can be inferred from the data. Further, to support fast and stable learning of FAWMF, a new specific batch-based learning algorithm fBGD has been developed, which trains on all feedback data but its complexity is linear to the number of observed data. Extensive experiments on real-world datasets demonstrate the superiority of the proposed FAWMF and its learning algorithm fBGD.

7.3SIJan 31, 2019Code
HAHE: Hierarchical Attentive Heterogeneous Information Network Embedding

Sheng Zhou, Jiajun Bu, Xin Wang et al.

Heterogeneous information network (HIN) embedding has recently attracted much attention due to its effectiveness in dealing with the complex heterogeneous data. Meta path, which connects different object types with various semantic meanings, is widely used by existing HIN embedding works. However, several challenges have not been addressed so far. First, different meta paths convey different semantic meanings, while existing works assume that all nodes share same weights for meta paths and ignore the personalized preferences of different nodes on different meta paths. Second, given a meta path, nodes in HIN are connected by path instances while existing works fail to fully explore the differences between path instances that reflect nodes' preferences in the semantic space. rTo tackle the above challenges, we propose aHierarchical Attentive Heterogeneous information network Embedding (HAHE) model to capture the personalized preferences on meta paths and path instances in each semantic space. As path instances are based on a particular meta path, a hierarchical attention mechanism is naturally utilized to model the personalized preference on meta paths and path instances. Extensive experiments on several real-world datasets show that our proposed \model model significantly outperforms the state-of-the-art methods in terms of various data mining tasks.