Zhengwei Wu

LG
h-index19
7papers
96citations
Novelty55%
AI Score32

7 Papers

IRSep 20, 2023Code
Long-tail Augmented Graph Contrastive Learning for Recommendation

Qian Zhao, Zhengwei Wu, Zhiqiang Zhang et al.

Graph Convolutional Networks (GCNs) has demonstrated promising results for recommender systems, as they can effectively leverage high-order relationship. However, these methods usually encounter data sparsity issue in real-world scenarios. To address this issue, GCN-based recommendation methods employ contrastive learning to introduce self-supervised signals. Despite their effectiveness, these methods lack consideration of the significant degree disparity between head and tail nodes. This can lead to non-uniform representation distribution, which is a crucial factor for the performance of contrastive learning methods. To tackle the above issue, we propose a novel Long-tail Augmented Graph Contrastive Learning (LAGCL) method for recommendation. Specifically, we introduce a learnable long-tail augmentation approach to enhance tail nodes by supplementing predicted neighbor information, and generate contrastive views based on the resulting augmented graph. To make the data augmentation schema learnable, we design an auto drop module to generate pseudo-tail nodes from head nodes and a knowledge transfer module to reconstruct the head nodes from pseudo-tail nodes. Additionally, we employ generative adversarial networks to ensure that the distribution of the generated tail/head nodes matches that of the original tail/head nodes. Extensive experiments conducted on three benchmark datasets demonstrate the significant improvement in performance of our model over the state-of-the-arts. Further analyses demonstrate the uniformity of learned representations and the superiority of LAGCL on long-tail performance. Code is publicly available at https://github.com/im0qianqian/LAGCL

IRMar 22, 2024
Leave No One Behind: Online Self-Supervised Self-Distillation for Sequential Recommendation

Shaowei Wei, Zhengwei Wu, Xin Li et al.

Sequential recommendation methods play a pivotal role in modern recommendation systems. A key challenge lies in accurately modeling user preferences in the face of data sparsity. To tackle this challenge, recent methods leverage contrastive learning (CL) to derive self-supervision signals by maximizing the mutual information of two augmented views of the original user behavior sequence. Despite their effectiveness, CL-based methods encounter a limitation in fully exploiting self-supervision signals for users with limited behavior data, as users with extensive behaviors naturally offer more information. To address this problem, we introduce a novel learning paradigm, named Online Self-Supervised Self-distillation for Sequential Recommendation ($S^4$Rec), effectively bridging the gap between self-supervised learning and self-distillation methods. Specifically, we employ online clustering to proficiently group users by their distinct latent intents. Additionally, an adversarial learning strategy is utilized to ensure that the clustering procedure is not affected by the behavior length factor. Subsequently, we employ self-distillation to facilitate the transfer of knowledge from users with extensive behaviors (teachers) to users with limited behaviors (students). Experiments conducted on four real-world datasets validate the effectiveness of the proposed method.

LGDec 5, 2024
Graph Disentangle Causal Model: Enhancing Causal Inference in Networked Observational Data

Binbin Hu, Zhicheng An, Zhengwei Wu et al.

Estimating individual treatment effects (ITE) from observational data is a critical task across various domains. However, many existing works on ITE estimation overlook the influence of hidden confounders, which remain unobserved at the individual unit level. To address this limitation, researchers have utilized graph neural networks to aggregate neighbors' features to capture the hidden confounders and mitigate confounding bias by minimizing the discrepancy of confounder representations between the treated and control groups. Despite the success of these approaches, practical scenarios often treat all features as confounders and involve substantial differences in feature distributions between the treated and control groups. Confusing the adjustment and confounder and enforcing strict balance on the confounder representations could potentially undermine the effectiveness of outcome prediction. To mitigate this issue, we propose a novel framework called the \textit{Graph Disentangle Causal model} (GDC) to conduct ITE estimation in the network setting. GDC utilizes a causal disentangle module to separate unit features into adjustment and confounder representations. Then we design a graph aggregation module consisting of three distinct graph aggregators to obtain adjustment, confounder, and counterfactual confounder representations. Finally, a causal constraint module is employed to enforce the disentangled representations as true causal factors. The effectiveness of our proposed method is demonstrated by conducting comprehensive experiments on two networked datasets.

LGJun 10, 2020
Bandit Samplers for Training Graph Neural Networks

Ziqi Liu, Zhengwei Wu, Zhiqiang Zhang et al.

Several sampling algorithms with variance reduction have been proposed for accelerating the training of Graph Convolution Networks (GCNs). However, due to the intractable computation of optimal sampling distribution, these sampling algorithms are suboptimal for GCNs and are not applicable to more general graph neural networks (GNNs) where the message aggregator contains learned weights rather than fixed weights, such as Graph Attention Networks (GAT). The fundamental reason is that the embeddings of the neighbors or learned weights involved in the optimal sampling distribution are changing during the training and not known a priori, but only partially observed when sampled, thus making the derivation of an optimal variance reduced samplers non-trivial. In this paper, we formulate the optimization of the sampling variance as an adversary bandit problem, where the rewards are related to the node embeddings and learned weights, and can vary constantly. Thus a good sampler needs to acquire variance information about more neighbors (exploration) while at the same time optimizing the immediate sampling variance (exploit). We theoretically show that our algorithm asymptotically approaches the optimal variance within a factor of 3. We show the efficiency and effectiveness of our approach on multiple datasets.

AIFeb 2, 2019
Belief dynamics extraction

Arun Kumar, Zhengwei Wu, Xaq Pitkow et al.

Animal behavior is not driven simply by its current observations, but is strongly influenced by internal states. Estimating the structure of these internal states is crucial for understanding the neural basis of behavior. In principle, internal states can be estimated by inverting behavior models, as in inverse model-based Reinforcement Learning. However, this requires careful parameterization and risks model-mismatch to the animal. Here we take a data-driven approach to infer latent states directly from observations of behavior, using a partially observable switching semi-Markov process. This process has two elements critical for capturing animal behavior: it captures non-exponential distribution of times between observations, and transitions between latent states depend on the animal's actions, features that require more complex non-markovian models to represent. To demonstrate the utility of our approach, we apply it to the observations of a simulated optimal agent performing a foraging task, and find that latent dynamics extracted by the model has correspondences with the belief dynamics of the agent. Finally, we apply our model to identify latent states in the behaviors of monkey performing a foraging task, and find clusters of latent states that identify periods of time consistent with expectant waiting. This data-driven behavioral model will be valuable for inferring latent cognitive states, and thereby for measuring neural representations of those states.

LGMay 24, 2018
Inverse Rational Control: Inferring What You Think from How You Forage

Zhengwei Wu, Paul Schrater, Xaq Pitkow

Complex behaviors are often driven by an internal model, which integrates sensory information over time and facilitates long-term planning. Inferring an agent's internal model is a crucial ingredient in social interactions (theory of mind), for imitation learning, and for interpreting neural activities of behaving agents. Here we describe a generic method to model an agent's behavior under an environment with uncertainty, and infer the agent's internal model, reward function, and dynamic beliefs. We apply our method to a simulated agent performing a naturalistic foraging task. We assume the agent behaves rationally --- that is, they take actions that optimize their subjective utility according to their understanding of the task and its relevant causal variables. We model this rational solution as a Partially Observable Markov Decision Process (POMDP) where the agent may make wrong assumptions about the task parameters. Given the agent's sensory observations and actions, we learn its internal model and reward function by maximum likelihood estimation over a set of task-relevant parameters. The Markov property of the POMDP enables us to characterize the transition probabilities between internal belief states and iteratively estimate the agent's policy using a constrained Expectation-Maximization (EM) algorithm. We validate our method on simulated agents performing suboptimally on a foraging task currently used in many neuroscience experiments, and successfully recover their internal model and reward function. Our work lays a critical foundation to discover how the brain represents and computes with dynamic beliefs.

AIJun 12, 2016
Store Location Selection via Mining Search Query Logs of Baidu Maps

Mengwen Xu, Tianyi Wang, Zhengwei Wu et al.

Choosing a good location when opening a new store is crucial for the future success of a business. Traditional methods include offline manual survey, which is very time consuming, and analytic models based on census data, which are un- able to adapt to the dynamic market. The rapid increase of the availability of big data from various types of mobile devices, such as online query data and offline positioning data, provides us with the possibility to develop automatic and accurate data-driven prediction models for business store placement. In this paper, we propose a Demand Distribution Driven Store Placement (D3SP) framework for business store placement by mining search query data from Baidu Maps. D3SP first detects the spatial-temporal distributions of customer demands on different business services via query data from Baidu Maps, the largest online map search engine in China, and detects the gaps between demand and sup- ply. Then we determine candidate locations via clustering such gaps. In the final stage, we solve the location optimization problem by predicting and ranking the number of customers. We not only deploy supervised regression models to predict the number of customers, but also learn to rank models to directly rank the locations. We evaluate our framework on various types of businesses in real-world cases, and the experiments results demonstrate the effectiveness of our methods. D3SP as the core function for store placement has already been implemented as a core component of our business analytics platform and could be potentially used by chain store merchants on Baidu Nuomi.