Depeng Jin

IR
h-index63
46papers
3,968citations
Novelty52%
AI Score54

46 Papers

IRNov 14, 2023Code
Mixed Attention Network for Cross-domain Sequential Recommendation

Guanyu Lin, Chen Gao, Yu Zheng et al.

In modern recommender systems, sequential recommendation leverages chronological user behaviors to make effective next-item suggestions, which suffers from data sparsity issues, especially for new users. One promising line of work is the cross-domain recommendation, which trains models with data across multiple domains to improve the performance in data-scarce domains. Recent proposed cross-domain sequential recommendation models such as PiNet and DASL have a common drawback relying heavily on overlapped users in different domains, which limits their usage in practical recommender systems. In this paper, we propose a Mixed Attention Network (MAN) with local and global attention modules to extract the domain-specific and cross-domain information. Firstly, we propose a local/global encoding layer to capture the domain-specific/cross-domain sequential pattern. Then we propose a mixed attention layer with item similarity attention, sequence-fusion attention, and group-prototype attention to capture the local/global item similarity, fuse the local/global item sequence, and extract the user groups across different domains, respectively. Finally, we propose a local/global prediction layer to further evolve and combine the domain-specific and cross-domain interests. Experimental results on two real-world datasets (each with two domains) demonstrate the superiority of our proposed model. Further study also illustrates that our proposed method and components are model-agnostic and effective, respectively. The code and data are available at https://github.com/Guanyu-Lin/MAN.

LGFeb 9, 2023Code
Learning to Simulate Daily Activities via Modeling Dynamic Human Needs

Yuan Yuan, Huandong Wang, Jingtao Ding et al.

Daily activity data that records individuals' various types of activities in daily life are widely used in many applications such as activity scheduling, activity recommendation, and policymaking. Though with high value, its accessibility is limited due to high collection costs and potential privacy issues. Therefore, simulating human activities to produce massive high-quality data is of great importance to benefit practical applications. However, existing solutions, including rule-based methods with simplified assumptions of human behavior and data-driven methods directly fitting real-world data, both cannot fully qualify for matching reality. In this paper, motivated by the classic psychological theory, Maslow's need theory describing human motivation, we propose a knowledge-driven simulation framework based on generative adversarial imitation learning. To enhance the fidelity and utility of the generated activity data, our core idea is to model the evolution of human needs as the underlying mechanism that drives activity generation in the simulation model. Specifically, this is achieved by a hierarchical model structure that disentangles different need levels, and the use of neural stochastic differential equations that successfully captures piecewise-continuous characteristics of need dynamics. Extensive experiments demonstrate that our framework outperforms the state-of-the-art baselines in terms of data fidelity and utility. Besides, we present the insightful interpretability of the need modeling. The code is available at https://github.com/tsinghua-fib-lab/SAND.

AISep 19, 2023Code
Towards Generative Modeling of Urban Flow through Knowledge-enhanced Denoising Diffusion

Zhilun Zhou, Jingtao Ding, Yu Liu et al.

Although generative AI has been successful in many areas, its ability to model geospatial data is still underexplored. Urban flow, a typical kind of geospatial data, is critical for a wide range of urban applications. Existing studies mostly focus on predictive modeling of urban flow that predicts the future flow based on historical flow data, which may be unavailable in data-sparse areas or newly planned regions. Some other studies aim to predict OD flow among regions but they fail to model dynamic changes of urban flow over time. In this work, we study a new problem of urban flow generation that generates dynamic urban flow for regions without historical flow data. To capture the effect of multiple factors on urban flow, such as region features and urban environment, we employ diffusion model to generate urban flow for regions under different conditions. We first construct an urban knowledge graph (UKG) to model the urban environment and relationships between regions, based on which we design a knowledge-enhanced spatio-temporal diffusion model (KSTDiff) to generate urban flow for each region. Specifically, to accurately generate urban flow for regions with different flow volumes, we design a novel diffusion process guided by a volume estimator, which is learnable and customized for each region. Moreover, we propose a knowledge-enhanced denoising network to capture the spatio-temporal dependencies of urban flow as well as the impact of urban environment in the denoising process. Extensive experiments on four real-world datasets validate the superiority of our model over state-of-the-art baselines in urban flow generation. Further in-depth studies demonstrate the utility of generated urban flow data and the ability of our model for long-term flow generation and urban flow prediction. Our code is released at: https://github.com/tsinghua-fib-lab/KSTDiff-Urban-flow-generation.

LGAug 14, 2022
DisenHCN: Disentangled Hypergraph Convolutional Networks for Spatiotemporal Activity Prediction

Yinfeng Li, Chen Gao, Quanming Yao et al. · tsinghua

Spatiotemporal activity prediction, aiming to predict user activities at a specific location and time, is crucial for applications like urban planning and mobile advertising. Existing solutions based on tensor decomposition or graph embedding suffer from the following two major limitations: 1) ignoring the fine-grained similarities of user preferences; 2) user's modeling is entangled. In this work, we propose a hypergraph neural network model called DisenHCN to bridge the above gaps. In particular, we first unify the fine-grained user similarity and the complex matching between user preferences and spatiotemporal activity into a heterogeneous hypergraph. We then disentangle the user representations into different aspects (location-aware, time-aware, and activity-aware) and aggregate corresponding aspect's features on the constructed hypergraph, capturing high-order relations from different aspects and disentangles the impact of each aspect for final prediction. Extensive experiments show that our DisenHCN outperforms the state-of-the-art methods by 14.23% to 18.10% on four real-world datasets. Further studies also convincingly verify the rationality of each component in our DisenHCN.

CLOct 16, 2023
Stance Detection with Collaborative Role-Infused LLM-Based Agents

Xiaochong Lan, Chen Gao, Depeng Jin et al.

Stance detection automatically detects the stance in a text towards a target, vital for content analysis in web and social media research. Despite their promising capabilities, LLMs encounter challenges when directly applied to stance detection. First, stance detection demands multi-aspect knowledge, from deciphering event-related terminologies to understanding the expression styles in social media platforms. Second, stance detection requires advanced reasoning to infer authors' implicit viewpoints, as stance are often subtly embedded rather than overtly stated in the text. To address these challenges, we design a three-stage framework COLA (short for Collaborative rOle-infused LLM-based Agents) in which LLMs are designated distinct roles, creating a collaborative system where each role contributes uniquely. Initially, in the multidimensional text analysis stage, we configure the LLMs to act as a linguistic expert, a domain specialist, and a social media veteran to get a multifaceted analysis of texts, thus overcoming the first challenge. Next, in the reasoning-enhanced debating stage, for each potential stance, we designate a specific LLM-based agent to advocate for it, guiding the LLM to detect logical connections between text features and stance, tackling the second challenge. Finally, in the stance conclusion stage, a final decision maker agent consolidates prior insights to determine the stance. Our approach avoids extra annotated data and model training and is highly usable. We achieve state-of-the-art performance across multiple datasets. Ablation studies validate the effectiveness of each design role in handling stance detection. Further experiments have demonstrated the explainability and the versatility of our approach. Our approach excels in usability, accuracy, effectiveness, explainability and versatility, highlighting its value.

LGSep 3, 2024Code
Large-scale Urban Facility Location Selection with Knowledge-informed Reinforcement Learning

Hongyuan Su, Yu Zheng, Jingtao Ding et al.

The facility location problem (FLP) is a classical combinatorial optimization challenge aimed at strategically laying out facilities to maximize their accessibility. In this paper, we propose a reinforcement learning method tailored to solve large-scale urban FLP, capable of producing near-optimal solutions at superfast inference speed. We distill the essential swap operation from local search, and simulate it by intelligently selecting edges on a graph of urban regions, guided by a knowledge-informed graph neural network, thus sidestepping the need for heavy computation of local search. Extensive experiments on four US cities with different geospatial conditions demonstrate that our approach can achieve comparable performance to commercial solvers with less than 5\% accessibility loss, while displaying up to 1000 times speedup. We deploy our model as an online geospatial application at https://huggingface.co/spaces/randommmm/MFLP.

AIFeb 22, 2023
Advancements in Federated Learning: Models, Methods, and Privacy

Huiming Chen, Huandong Wang, Qingyue Long et al.

Federated learning (FL) is a promising technique for addressing the rising privacy and security issues. Its main ingredient is to cooperatively learn the model among the distributed clients without uploading any sensitive data. In this paper, we conducted a thorough review of the related works, following the development context and deeply mining the key technologies behind FL from both theoretical and practical perspectives. Specifically, we first classify the existing works in FL architecture based on the network topology of FL systems with detailed analysis and summarization. Next, we abstract the current application problems, summarize the general techniques and frame the application problems into the general paradigm of FL base models. Moreover, we provide our proposed solutions for model training via FL. We have summarized and analyzed the existing FedOpt algorithms, and deeply revealed the algorithmic development principles of many first-order algorithms in depth, proposing a more generalized algorithm design framework. Based on these frameworks, we have instantiated FedOpt algorithms. As privacy and security is the fundamental requirement in FL, we provide the existing attack scenarios and the defense methods. To the best of our knowledge, we are among the first tier to review the theoretical methodology and propose our strategies since there are very few works surveying the theoretical approaches. Our survey targets motivating the development of high-performance, privacy-preserving, and secure methods to integrate FL into real-world applications.

LGJul 23, 2024
PateGail: A Privacy-Preserving Mobility Trajectory Generator with Imitation Learning

Huandong Wang, Changzheng Gao, Yuchen Wu et al.

Generating human mobility trajectories is of great importance to solve the lack of large-scale trajectory data in numerous applications, which is caused by privacy concerns. However, existing mobility trajectory generation methods still require real-world human trajectories centrally collected as the training data, where there exists an inescapable risk of privacy leakage. To overcome this limitation, in this paper, we propose PateGail, a privacy-preserving imitation learning model to generate mobility trajectories, which utilizes the powerful generative adversary imitation learning model to simulate the decision-making process of humans. Further, in order to protect user privacy, we train this model collectively based on decentralized mobility data stored in user devices, where personal discriminators are trained locally to distinguish and reward the real and generated human trajectories. In the training process, only the generated trajectories and their rewards obtained based on personal discriminators are shared between the server and devices, whose privacy is further preserved by our proposed perturbation mechanisms with theoretical proof to satisfy differential privacy. Further, to better model the human decision-making process, we propose a novel aggregation mechanism of the rewards obtained from personal discriminators. We theoretically prove that under the reward obtained based on the aggregation mechanism, our proposed model maximizes the lower bound of the discounted total rewards of users. Extensive experiments show that the trajectories generated by our model are able to resemble real-world trajectories in terms of five key statistical metrics, outperforming state-of-the-art algorithms by over 48.03%. Furthermore, we demonstrate that the synthetic trajectories are able to efficiently support practical applications, including mobility prediction and location recommendation.

SYJun 17, 2023
Multi-Scale Simulation of Complex Systems: A Perspective of Integrating Knowledge and Data

Huandong Wang, Huan Yan, Can Rong et al.

Complex system simulation has been playing an irreplaceable role in understanding, predicting, and controlling diverse complex systems. In the past few decades, the multi-scale simulation technique has drawn increasing attention for its remarkable ability to overcome the challenges of complex system simulation with unknown mechanisms and expensive computational costs. In this survey, we will systematically review the literature on multi-scale simulation of complex systems from the perspective of knowledge and data. Firstly, we will present background knowledge about simulating complex system simulation and the scales in complex systems. Then, we divide the main objectives of multi-scale modeling and simulation into five categories by considering scenarios with clear scale and scenarios with unclear scale, respectively. After summarizing the general methods for multi-scale simulation based on the clues of knowledge and data, we introduce the adopted methods to achieve different objectives. Finally, we introduce the applications of multi-scale simulation in typical matter systems and social systems.

LGJul 19, 2023
Detecting Vulnerable Nodes in Urban Infrastructure Interdependent Network

Jinzhu Mao, Liu Cao, Chen Gao et al.

Understanding and characterizing the vulnerability of urban infrastructures, which refers to the engineering facilities essential for the regular running of cities and that exist naturally in the form of networks, is of great value to us. Potential applications include protecting fragile facilities and designing robust topologies, etc. Due to the strong correlation between different topological characteristics and infrastructure vulnerability and their complicated evolution mechanisms, some heuristic and machine-assisted analysis fall short in addressing such a scenario. In this paper, we model the interdependent network as a heterogeneous graph and propose a system based on graph neural network with reinforcement learning, which can be trained on real-world data, to characterize the vulnerability of the city system accurately. The presented system leverages deep learning techniques to understand and analyze the heterogeneous graph, which enables us to capture the risk of cascade failure and discover vulnerable infrastructures of cities. Extensive experiments with various requests demonstrate not only the expressive power of our system but also transferring ability and necessity of the specific components.

LGFeb 19, 2024Code
UniST: A Prompt-Empowered Universal Model for Urban Spatio-Temporal Prediction

Yuan Yuan, Jingtao Ding, Jie Feng et al.

Urban spatio-temporal prediction is crucial for informed decision-making, such as traffic management, resource optimization, and emergence response. Despite remarkable breakthroughs in pretrained natural language models that enable one model to handle diverse tasks, a universal solution for spatio-temporal prediction remains challenging Existing prediction approaches are typically tailored for specific spatio-temporal scenarios, requiring task-specific model designs and extensive domain-specific training data. In this study, we introduce UniST, a universal model designed for general urban spatio-temporal prediction across a wide range of scenarios. Inspired by large language models, UniST achieves success through: (i) utilizing diverse spatio-temporal data from different scenarios, (ii) effective pre-training to capture complex spatio-temporal dynamics, (iii) knowledge-guided prompts to enhance generalization capabilities. These designs together unlock the potential of building a universal model for various scenarios Extensive experiments on more than 20 spatio-temporal scenarios demonstrate UniST's efficacy in advancing state-of-the-art performance, especially in few-shot and zero-shot prediction. The datasets and code implementation are released on https://github.com/tsinghua-fib-lab/UniST.

LGFeb 19, 2024Code
Spatio-Temporal Few-Shot Learning via Diffusive Neural Network Generation

Yuan Yuan, Chenyang Shao, Jingtao Ding et al.

Spatio-temporal modeling is foundational for smart city applications, yet it is often hindered by data scarcity in many cities and regions. To bridge this gap, we propose a novel generative pre-training framework, GPD, for spatio-temporal few-shot learning with urban knowledge transfer. Unlike conventional approaches that heavily rely on common feature extraction or intricate few-shot learning designs, our solution takes a novel approach by performing generative pre-training on a collection of neural network parameters optimized with data from source cities. We recast spatio-temporal few-shot learning as pre-training a generative diffusion model, which generates tailored neural networks guided by prompts, allowing for adaptability to diverse data distributions and city-specific characteristics. GPD employs a Transformer-based denoising diffusion model, which is model-agnostic to integrate with powerful spatio-temporal neural networks. By addressing challenges arising from data gaps and the complexity of generalizing knowledge across cities, our framework consistently outperforms state-of-the-art baselines on multiple real-world datasets for tasks such as traffic speed prediction and crowd flow prediction. The implementation of our approach is available: https://github.com/tsinghua-fib-lab/GPD.

LGNov 19, 2024Code
Diffusion Transformers as Open-World Spatiotemporal Foundation Models

Yuan Yuan, Chonghua Han, Jingtao Ding et al.

The urban environment is characterized by complex spatio-temporal dynamics arising from diverse human activities and interactions. Effectively modeling these dynamics is essential for understanding and optimizing urban systems. In this work, we introduce UrbanDiT, a foundation model for open-world urban spatio-temporal learning that successfully scales up diffusion transformers in this field. UrbanDiT pioneers a unified model that integrates diverse data sources and types while learning universal spatio-temporal patterns across different cities and scenarios. This allows the model to unify both multi-data and multi-task learning, and effectively support a wide range of spatio-temporal applications. Its key innovation lies in the elaborated prompt learning framework, which adaptively generates both data-driven and task-specific prompts, guiding the model to deliver superior performance across various urban applications. UrbanDiT offers three advantages: 1) It unifies diverse data types, such as grid-based and graph-based data, into a sequential format; 2) With task-specific prompts, it supports a wide range of tasks, including bi-directional spatio-temporal prediction, temporal interpolation, spatial extrapolation, and spatio-temporal imputation; and 3) It generalizes effectively to open-world scenarios, with its powerful zero-shot capabilities outperforming nearly all baselines with training data. UrbanDiT sets up a new benchmark for foundation models in the urban spatio-temporal domain. Code and datasets are publicly available at https://github.com/tsinghua-fib-lab/UrbanDiT.

LGNov 6, 2024Code
Symbolic regression via MDLformer-guided search: from minimizing prediction error to minimizing description length

Zihan Yu, Jingtao Ding, Yong Li et al. · tsinghua

Symbolic regression, a task discovering the formula best fitting the given data, is typically based on the heuristical search. These methods usually update candidate formulas to obtain new ones with lower prediction errors iteratively. However, since formulas with similar function shapes may have completely different symbolic forms, the prediction error does not decrease monotonously as the search approaches the target formula, causing the low recovery rate of existing methods. To solve this problem, we propose a novel search objective based on the minimum description length, which reflects the distance from the target and decreases monotonically as the search approaches the correct form of the target formula. To estimate the minimum description length of any input data, we design a neural network, MDLformer, which enables robust and scalable estimation through large-scale training. With the MDLformer's output as the search objective, we implement a symbolic regression method, SR4MDL, that can effectively recover the correct mathematical form of the formula. Extensive experiments illustrate its excellent performance in recovering formulas from data. Our method successfully recovers around 50 formulas across two benchmark datasets comprising 133 problems, outperforming state-of-the-art methods by 43.92%. Experiments on 122 unseen black-box problems further demonstrate its generalization performance. We release our code at https://github.com/tsinghua-fib-lab/SR4MDL .

LGNov 20, 2024Code
UniFlow: A Foundation Model for Unified Urban Spatio-Temporal Flow Prediction

Yuan Yuan, Jingtao Ding, Chonghua Han et al.

Urban spatio-temporal flow prediction, encompassing traffic flows and crowd flows, is crucial for optimizing city infrastructure and managing traffic and emergency responses. Traditional approaches have relied on separate models tailored to either grid-based data, representing cities as uniform cells, or graph-based data, modeling cities as networks of nodes and edges. In this paper, we build UniFlow, a foundational model for general urban flow prediction that unifies both grid-based and graphbased data. We first design a multi-view spatio-temporal patching mechanism to standardize different data into a consistent sequential format and then introduce a spatio-temporal transformer architecture to capture complex correlations and dynamics. To leverage shared spatio-temporal patterns across different data types and facilitate effective cross-learning, we propose SpatioTemporal Memory Retrieval Augmentation (ST-MRA). By creating structured memory modules to store shared spatio-temporal patterns, ST-MRA enhances predictions through adaptive memory retrieval. Extensive experiments demonstrate that UniFlow outperforms existing models in both grid-based and graph-based flow prediction, excelling particularly in scenarios with limited data availability, showcasing its superior performance and broad applicability. The datasets and code implementation have been released on https://github.com/YuanYuan98/UniFlow.

CLApr 23, 2025Code
UrbanPlanBench: A Comprehensive Urban Planning Benchmark for Evaluating Large Language Models

Yu Zheng, Longyi Liu, Yuming Lin et al.

The advent of Large Language Models (LLMs) holds promise for revolutionizing various fields traditionally dominated by human expertise. Urban planning, a professional discipline that fundamentally shapes our daily surroundings, is one such field heavily relying on multifaceted domain knowledge and experience of human experts. The extent to which LLMs can assist human practitioners in urban planning remains largely unexplored. In this paper, we introduce a comprehensive benchmark, UrbanPlanBench, tailored to evaluate the efficacy of LLMs in urban planning, which encompasses fundamental principles, professional knowledge, and management and regulations, aligning closely with the qualifications expected of human planners. Through extensive evaluation, we reveal a significant imbalance in the acquisition of planning knowledge among LLMs, with even the most proficient models falling short of meeting professional standards. For instance, we observe that 70% of LLMs achieve subpar performance in understanding planning regulations compared to other aspects. Besides the benchmark, we present the largest-ever supervised fine-tuning (SFT) dataset, UrbanPlanText, comprising over 30,000 instruction pairs sourced from urban planning exams and textbooks. Our findings demonstrate that fine-tuned models exhibit enhanced performance in memorization tests and comprehension of urban planning knowledge, while there exists significant room for improvement, particularly in tasks requiring domain-specific terminology and reasoning. By making our benchmark, dataset, and associated evaluation and fine-tuning toolsets publicly available at https://github.com/tsinghua-fib-lab/PlanBench, we aim to catalyze the integration of LLMs into practical urban planning, fostering a symbiotic collaboration between human expertise and machine intelligence.

AIJan 8
ResMAS: Resilience Optimization in LLM-based Multi-agent Systems

Zhilun Zhou, Zihan Liu, Jiahe Liu et al.

Large Language Model-based Multi-Agent Systems (LLM-based MAS), where multiple LLM agents collaborate to solve complex tasks, have shown impressive performance in many areas. However, MAS are typically distributed across different devices or environments, making them vulnerable to perturbations such as agent failures. While existing works have studied the adversarial attacks and corresponding defense strategies, they mainly focus on reactively detecting and mitigating attacks after they occur rather than proactively designing inherently resilient systems. In this work, we study the resilience of LLM-based MAS under perturbations and find that both the communication topology and prompt design significantly influence system resilience. Motivated by these findings, we propose ResMAS: a two-stage framework for enhancing MAS resilience. First, we train a reward model to predict the MAS's resilience, based on which we train a topology generator to automatically design resilient topology for specific tasks through reinforcement learning. Second, we introduce a topology-aware prompt optimization method that refines each agent's prompt based on its connections and interactions with other agents. Extensive experiments across a range of tasks show that our approach substantially improves MAS resilience under various constraints. Moreover, our framework demonstrates strong generalization ability to new tasks and models, highlighting its potential for building resilient MASs.

AIJun 15, 2024Code
A GPU-accelerated Large-scale Simulator for Transportation System Optimization Benchmarking

Jun Zhang, Wenxuan Ao, Junbo Yan et al.

With the development of artificial intelligence techniques, transportation system optimization is evolving from traditional methods relying on expert experience to simulation and learning-based decision and optimization methods. Learning-based optimization methods require extensive interactions with highly realistic microscopic traffic simulators. However, existing microscopic traffic simulators are inefficient in large-scale scenarios and thus fail to support the adoption of these methods in large-scale transportation system optimization scenarios. In addition, the optimization scenarios supported by existing simulators are limited, mainly focusing on the traffic signal control. To address these challenges, we propose the first open-source GPU-accelerated large-scale microscopic simulator for transportation system simulation and optimization. The simulator can iterate at 84.09Hz, which achieves 88.92 times computational acceleration in the large-scale scenario with 2,464,950 vehicles compared to the best baseline CityFlow. Besides, it achieves a more realistic average road speeds simulated on real datasets by adopting the IDM model as the car-following model and the randomized MOBIL model as the lane-changing model. Based on it, we implement a set of microscopic and macroscopic controllable objects and metrics provided by Python API to support typical transportation system optimization scenarios. We choose five representative scenarios and benchmark classical rule-based algorithms, reinforcement learning algorithms, and black-box optimization algorithms in four cities. These experiments effectively demonstrate the usability of the simulator for large-scale traffic system optimization. The code of the simulator is available at https://github.com/tsinghua-fib-lab/moss. We build an open-registration web platform available at https://moss.fiblab.net to support no-code trials.

AIMay 22, 2023Code
Road Planning for Slums via Deep Reinforcement Learning

Yu Zheng, Hongyuan Su, Jingtao Ding et al.

Millions of slum dwellers suffer from poor accessibility to urban services due to inadequate road infrastructure within slums, and road planning for slums is critical to the sustainable development of cities. Existing re-blocking or heuristic methods are either time-consuming which cannot generalize to different slums, or yield sub-optimal road plans in terms of accessibility and construction costs. In this paper, we present a deep reinforcement learning based approach to automatically layout roads for slums. We propose a generic graph model to capture the topological structure of a slum, and devise a novel graph neural network to select locations for the planned roads. Through masked policy optimization, our model can generate road plans that connect places in a slum at minimal construction costs. Extensive experiments on real-world slums in different countries verify the effectiveness of our model, which can significantly improve accessibility by 14.3% against existing baseline methods. Further investigations on transferring across different tasks demonstrate that our model can master road planning skills in simple scenarios and adapt them to much more complicated ones, indicating the potential of applying our model in real-world slum upgrading. The code and data are available at https://github.com/tsinghua-fib-lab/road-planning-for-slums.

LGMay 21, 2023Code
Spatio-temporal Diffusion Point Processes

Yuan Yuan, Jingtao Ding, Chenyang Shao et al.

Spatio-temporal point process (STPP) is a stochastic collection of events accompanied with time and space. Due to computational complexities, existing solutions for STPPs compromise with conditional independence between time and space, which consider the temporal and spatial distributions separately. The failure to model the joint distribution leads to limited capacities in characterizing the spatio-temporal entangled interactions given past events. In this work, we propose a novel parameterization framework for STPPs, which leverages diffusion models to learn complex spatio-temporal joint distributions. We decompose the learning of the target joint distribution into multiple steps, where each step can be faithfully described by a Gaussian distribution. To enhance the learning of each step, an elaborated spatio-temporal co-attention module is proposed to capture the interdependence between the event time and space adaptively. For the first time, we break the restrictions on spatio-temporal dependencies in existing solutions, and enable a flexible and accurate modeling paradigm for STPPs. Extensive experiments from a wide range of fields, such as epidemiology, seismology, crime, and urban mobility, demonstrate that our framework outperforms the state-of-the-art baselines remarkably, with an average improvement of over 50%. Further in-depth analyses validate its ability to capture spatio-temporal interactions, which can learn adaptively for different scenarios. The datasets and source code are available online: https://github.com/tsinghua-fib-lab/Spatio-temporal-Diffusion-Point-Processes.

IRFeb 26, 2022Code
Disentangling Long and Short-Term Interests for Recommendation

Yu Zheng, Chen Gao, Jianxin Chang et al.

Modeling user's long-term and short-term interests is crucial for accurate recommendation. However, since there is no manually annotated label for user interests, existing approaches always follow the paradigm of entangling these two aspects, which may lead to inferior recommendation accuracy and interpretability. In this paper, to address it, we propose a Contrastive learning framework to disentangle Long and Short-term interests for Recommendation (CLSR) with self-supervision. Specifically, we first propose two separate encoders to independently capture user interests of different time scales. We then extract long-term and short-term interests proxies from the interaction sequences, which serve as pseudo labels for user interests. Then pairwise contrastive tasks are designed to supervise the similarity between interest representations and their corresponding interest proxies. Finally, since the importance of long-term and short-term interests is dynamically changing, we propose to adaptively aggregate them through an attention-based network for prediction. We conduct experiments on two large-scale real-world datasets for e-commerce and short-video recommendation. Empirical results show that our CLSR consistently outperforms all state-of-the-art models with significant improvements: GAUC is improved by over 0.01, and NDCG is improved by over 4%. Further counterfactual evaluations demonstrate that stronger disentanglement of long and short-term interests is successfully achieved by CLSR. The code and data are available at https://github.com/tsinghua-fib-lab/CLSR.

IRNov 5, 2021Code
Inhomogeneous Social Recommendation with Hypergraph Convolutional Networks

Zirui Zhu, Chen Gao, Xu Chen et al.

Incorporating social relations into the recommendation system, i.e. social recommendation, has been widely studied in academic and industrial communities. While many promising results have been achieved, existing methods mostly assume that the social relations can be homogeneously applied to all the items, which is not practical for users' actually diverse preferences. In this paper, we argue that the effect of the social relations should be inhomogeneous, that is, two socially-related users may only share the same preference on some specific items, while for the other products, their preferences can be inconsistent or even contradictory. Inspired by this idea, we build a novel social recommendation model, where the traditional pair-wise "user-user'' relation is extended to the triple relation of "user-item-user''. To well handle such high-order relations, we base our framework on the hypergraph. More specifically, each hyperedge connects a user-user-item triplet, representing that the two users share similar preferences on the item. We develop a Social HyperGraph Convolutional Network (short for SHGCN) to learn from the complex triplet social relations. With the hypergraph convolutional networks, the social relations can be modeled in a more fine-grained manner, which more accurately depicts real users' preferences, and benefits the recommendation performance. Extensive experiments on two real-world datasets demonstrate our model's effectiveness. Studies on data sparsity and hyper-parameter studies further validate our model's rationality. Our codes and dataset are available at https://github.com/ziruizhu/SHGCN.

IRSep 27, 2021Code
A Survey of Graph Neural Networks for Recommender Systems: Challenges, Methods, and Directions

Chen Gao, Yu Zheng, Nian Li et al.

Recommender system is one of the most important information services on today's Internet. Recently, graph neural networks have become the new state-of-the-art approach to recommender systems. In this survey, we conduct a comprehensive review of the literature on graph neural network-based recommender systems. We first introduce the background and the history of the development of both recommender systems and graph neural networks. For recommender systems, in general, there are four aspects for categorizing existing works: stage, scenario, objective, and application. For graph neural networks, the existing methods consist of two categories, spectral models and spatial ones. We then discuss the motivation of applying graph neural networks into recommender systems, mainly consisting of the high-order connectivity, the structural property of data, and the enhanced supervision signal. We then systematically analyze the challenges in graph construction, embedding propagation/aggregation, model optimization, and computation efficiency. Afterward and primarily, we provide a comprehensive overview of a multitude of existing works of graph neural network-based recommender systems, following the taxonomy above. Finally, we raise discussions on the open problems and promising future directions in this area. We summarize the representative papers along with their code repositories in \url{https://github.com/tsinghua-fib-lab/GNN-Recommender-Systems}.

LGJan 19, 2021Code
Learnable Embedding Sizes for Recommender Systems

Siyi Liu, Chen Gao, Yihong Chen et al.

The embedding-based representation learning is commonly used in deep learning recommendation models to map the raw sparse features to dense vectors. The traditional embedding manner that assigns a uniform size to all features has two issues. First, the numerous features inevitably lead to a gigantic embedding table that causes a high memory usage cost. Second, it is likely to cause the over-fitting problem for those features that do not require too large representation capacity. Existing works that try to address the problem always cause a significant drop in recommendation performance or suffers from the limitation of unaffordable training time cost. In this paper, we proposed a novel approach, named PEP (short for Plug-in Embedding Pruning), to reduce the size of the embedding table while avoiding the drop of recommendation accuracy. PEP prunes embedding parameter where the pruning threshold(s) can be adaptively learned from data. Therefore we can automatically obtain a mixed-dimension embedding-scheme by pruning redundant parameters for each feature. PEP is a general framework that can plug in various base recommendation models. Extensive experiments demonstrate it can efficiently cut down embedding parameters and boost the base model's performance. Specifically, it achieves strong recommendation performance while reducing 97-99% parameters. As for the computation cost, PEP only brings an additional 20-30% time cost compared with base models. Codes are available at https://github.com/ssui-liu/learnable-embed-sizes-for-RecSys.

IROct 14, 2020Code
Group-Buying Recommendation for Social E-Commerce

Jun Zhang, Chen Gao, Depeng Jin et al.

Group buying, as an emerging form of purchase in social e-commerce websites, such as Pinduoduo, has recently achieved great success. In this new business model, users, initiator, can launch a group and share products to their social networks, and when there are enough friends, participants, join it, the deal is clinched. Group-buying recommendation for social e-commerce, which recommends an item list when users want to launch a group, plays an important role in the group success ratio and sales. However, designing a personalized recommendation model for group buying is an entirely new problem that is seldom explored. In this work, we take the first step to approach the problem of group-buying recommendation for social e-commerce and develop a GBGCN method (short for Group-Buying Graph Convolutional Network). Considering there are multiple types of behaviors (launch and join) and structured social network data, we first propose to construct directed heterogeneous graphs to represent behavioral data and social networks. We then develop a graph convolutional network model with multi-view embedding propagation, which can extract the complicated high-order graph structure to learn the embeddings. Last, since a failed group-buying implies rich preferences of the initiator and participants, we design a double-pairwise loss function to distill such preference signals. We collect a real-world dataset of group-buying and conduct experiments to evaluate the performance. Empirical results demonstrate that our proposed GBGCN can significantly outperform baseline methods by 2.69%-7.36%. The codes and the dataset are released at https://github.com/Sweetnow/group-buying-recommendation.

AIFeb 27, 2024
Large Language Model for Participatory Urban Planning

Zhilun Zhou, Yuming Lin, Depeng Jin et al.

Participatory urban planning is the mainstream of modern urban planning that involves the active engagement of residents. However, the traditional participatory paradigm requires experienced planning experts and is often time-consuming and costly. Fortunately, the emerging Large Language Models (LLMs) have shown considerable ability to simulate human-like agents, which can be used to emulate the participatory process easily. In this work, we introduce an LLM-based multi-agent collaboration framework for participatory urban planning, which can generate land-use plans for urban regions considering the diverse needs of residents. Specifically, we construct LLM agents to simulate a planner and thousands of residents with diverse profiles and backgrounds. We first ask the planner to carry out an initial land-use plan. To deal with the different facilities needs of residents, we initiate a discussion among the residents in each community about the plan, where residents provide feedback based on their profiles. Furthermore, to improve the efficiency of discussion, we adopt a fishbowl discussion mechanism, where part of the residents discuss and the rest of them act as listeners in each round. Finally, we let the planner modify the plan based on residents' feedback. We deploy our method on two real-world regions in Beijing. Experiments show that our method achieves state-of-the-art performance in residents satisfaction and inclusion metrics, and also outperforms human experts in terms of service accessibility and ecology metrics.

IRNov 4, 2024
Enhancing ID-based Recommendation with Large Language Models

Lei Chen, Chen Gao, Xiaoyi Du et al.

Large Language Models (LLMs) have recently garnered significant attention in various domains, including recommendation systems. Recent research leverages the capabilities of LLMs to improve the performance and user modeling aspects of recommender systems. These studies primarily focus on utilizing LLMs to interpret textual data in recommendation tasks. However, it's worth noting that in ID-based recommendations, textual data is absent, and only ID data is available. The untapped potential of LLMs for ID data within the ID-based recommendation paradigm remains relatively unexplored. To this end, we introduce a pioneering approach called "LLM for ID-based Recommendation" (LLM4IDRec). This innovative approach integrates the capabilities of LLMs while exclusively relying on ID data, thus diverging from the previous reliance on textual data. The basic idea of LLM4IDRec is that by employing LLM to augment ID data, if augmented ID data can improve recommendation performance, it demonstrates the ability of LLM to interpret ID data effectively, exploring an innovative way for the integration of LLM in ID-based recommendation. We evaluate the effectiveness of our LLM4IDRec approach using three widely-used datasets. Our results demonstrate a notable improvement in recommendation performance, with our approach consistently outperforming existing methods in ID-based recommendation by solely augmenting input data.

CLOct 29, 2024
Synergizing LLM Agents and Knowledge Graph for Socioeconomic Prediction in LBSN

Zhilun Zhou, Jingyang Fan, Yu Liu et al.

The fast development of location-based social networks (LBSNs) has led to significant changes in society, resulting in popular studies of using LBSN data for socioeconomic prediction, e.g., regional population and commercial activity estimation. Existing studies design various graphs to model heterogeneous LBSN data, and further apply graph representation learning methods for socioeconomic prediction. However, these approaches heavily rely on heuristic ideas and expertise to extract task-relevant knowledge from diverse data, which may not be optimal for specific tasks. Additionally, they tend to overlook the inherent relationships between different indicators, limiting the prediction accuracy. Motivated by the remarkable abilities of large language models (LLMs) in commonsense reasoning, embedding, and multi-agent collaboration, in this work, we synergize LLM agents and knowledge graph for socioeconomic prediction. We first construct a location-based knowledge graph (LBKG) to integrate multi-sourced LBSN data. Then we leverage the reasoning power of LLM agent to identify relevant meta-paths in the LBKG for each type of socioeconomic prediction task, and design a semantic-guided attention module for knowledge fusion with meta-paths. Moreover, we introduce a cross-task communication mechanism to further enhance performance by enabling knowledge sharing across tasks at both LLM agent and KG levels. On the one hand, the LLM agents for different tasks collaborate to generate more diverse and comprehensive meta-paths. On the other hand, the embeddings from different tasks are adaptively merged for better socioeconomic prediction. Experiments on two datasets demonstrate the effectiveness of the synergistic design between LLM and KG, providing insights for information sharing across socioeconomic prediction tasks.

CVNov 22, 2025
RoadBench: Benchmarking MLLMs on Fine-Grained Spatial Understanding and Reasoning under Urban Road Scenarios

Jun Zhang, Jie Feng, Long Chen et al.

Multimodal large language models (MLLMs) have demonstrated powerful capabilities in general spatial understanding and reasoning. However, their fine-grained spatial understanding and reasoning capabilities in complex urban scenarios have not received significant attention in the fields of both research and industry. To fill this gap, we focus primarily on road markings as a typical example of fine-grained spatial elements under urban scenarios, given the essential role of the integrated road traffic network they form within cities. Around road markings and urban traffic systems, we propose RoadBench, a systematic benchmark that comprehensively evaluates MLLMs' fine-grained spatial understanding and reasoning capabilities using BEV and FPV image inputs. This benchmark comprises six tasks consisting of 9,121 strictly manually verified test cases. These tasks form a systematic evaluation framework that bridges understanding at local spatial scopes to global reasoning. They not only test MLLMs' capabilities in recognition, joint understanding, and reasoning but also assess their ability to integrate image information with domain knowledge. After evaluating 14 mainstream MLLMs, we confirm that RoadBench is a challenging benchmark for MLLMs while revealing significant shortcomings in existing MLLMs' fine-grained spatial understanding and reasoning capabilities within urban scenarios. In certain tasks, their performance even falls short of simple rule-based or random selection baselines. These findings, along with RoadBench itself, will contribute to the comprehensive advancement of spatial understanding capabilities for MLLMs. The benchmark code, example datasets, and raw evaluation results are available in the supplementary material.

SOC-PHJul 4, 2025
Advancing network resilience theories with symbolized reinforcement learning

Yu Zheng, Jingtao Ding, Depeng Jin et al.

Many complex networks display remarkable resilience under external perturbations, internal failures and environmental changes, yet they can swiftly deteriorate into dysfunction upon the removal of a few keystone nodes. Discovering theories that measure network resilience offers the potential to prevent catastrophic collapses--from species extinctions to financial crise--with profound implications for real-world systems. Current resilience theories address the problem from a single perspective of topology, neglecting the crucial role of system dynamics, due to the intrinsic complexity of the coupling between topology and dynamics which exceeds the capabilities of human analytical methods. Here, we report an automatic method for resilience theory discovery, which learns from how AI solves a complicated network dismantling problem and symbolizes its network attack strategies into theoretical formulas. This proposed self-inductive approach discovers the first resilience theory that accounts for both topology and dynamics, highlighting how the correlation between node degree and state shapes overall network resilience, and offering insights for designing early warning signals of systematic collapses. Additionally, our approach discovers formulas that refine existing well-established resilience theories with over 37.5% improvement in accuracy, significantly advancing human understanding of complex networks with AI.

DCDec 15, 2021
LoSAC: An Efficient Local Stochastic Average Control Method for Federated Optimization

Huiming Chen, Huandong Wang, Quanming Yao et al.

Federated optimization (FedOpt), which targets at collaboratively training a learning model across a large number of distributed clients, is vital for federated learning. The primary concerns in FedOpt can be attributed to the model divergence and communication efficiency, which significantly affect the performance. In this paper, we propose a new method, i.e., LoSAC, to learn from heterogeneous distributed data more efficiently. Its key algorithmic insight is to locally update the estimate for the global full gradient after {each} regular local model update. Thus, LoSAC can keep clients' information refreshed in a more compact way. In particular, we have studied the convergence result for LoSAC. Besides, the bonus of LoSAC is the ability to defend the information leakage from the recent technique Deep Leakage Gradients (DLG). Finally, experiments have verified the superiority of LoSAC comparing with state-of-the-art FedOpt algorithms. Specifically, LoSAC significantly improves communication efficiency by more than $100\%$ on average, mitigates the model divergence problem and equips with the defense ability against DLG.

IRNov 1, 2021
Improving Location Recommendation with Urban Knowledge Graph

Chang Liu, Chen Gao, Depeng Jin et al.

Location recommendation is defined as to recommend locations (POIs) to users in location-based services. The existing data-driving approaches of location recommendation suffer from the limitation of the implicit modeling of the geographical factor, which may lead to sub-optimal recommendation results. In this work, we address this problem by introducing knowledge-driven solutions. Specifically, we first construct the Urban Knowledge Graph (UrbanKG) with geographical information and functional information of POIs. On the other side, there exist a fact that the geographical factor not only characterizes POIs but also affects user-POI interactions. To address it, we propose a novel method named UKGC. We first conduct information propagation on two sub-graphs to learn the representations of POIs and users. We then fuse two parts of representations by counterfactual learning for the final prediction. Extensive experiments on two real-world datasets verify that our method can outperform the state-of-the-art methods.

SINov 1, 2021
Spatio-Temporal Urban Knowledge Graph Enabled Mobility Prediction

Huandong Wang, Qiaohong Yu, Yu Liu et al.

With the rapid development of the mobile communication technology, mobile trajectories of humans are massively collected by Internet service providers (ISPs) and application service providers (ASPs). On the other hand, the rising paradigm of knowledge graph (KG) provides us a promising solution to extract structured "knowledge" from massive trajectory data. In this paper, we focus on modeling users' spatio-temporal mobility patterns based on knowledge graph techniques, and predicting users' future movement based on the "knowledge'' extracted from multiple sources in a cohesive manner. Specifically, we propose a new type of knowledge graph, i.e., spatio-temporal urban knowledge graph (STKG), where mobility trajectories, category information of venues, and temporal information are jointly modeled by the facts with different relation types in STKG. The mobility prediction problem is converted to the knowledge graph completion problem in STKG. Further, a complex embedding model with elaborately designed scoring functions is proposed to measure the plausibility of facts in STKG to solve the knowledge graph completion problem, which considers temporal dynamics of the mobility patterns and utilizes PoI categories as the auxiliary information and background knowledge. Extensive evaluations confirm the high accuracy of our model in predicting users' mobility, i.e., improving the accuracy by 5.04% compared with the state-of-the-art algorithms. In addition, PoI categories as the background knowledge and auxiliary information are confirmed to be helpful by improving the performance by 3.85% in terms of accuracy. Additionally, experiments show that our proposed method is time-efficient by reducing the computational time by over 43.12% compared with existing methods.

IRAug 16, 2021
DGCN: Diversified Recommendation with Graph Convolutional Networks

Yu Zheng, Chen Gao, Liang Chen et al.

These years much effort has been devoted to improving the accuracy or relevance of the recommendation system. Diversity, a crucial factor which measures the dissimilarity among the recommended items, received rather little scrutiny. Directly related to user satisfaction, diversification is usually taken into consideration after generating the candidate items. However, this decoupled design of diversification and candidate generation makes the whole system suboptimal. In this paper, we aim at pushing the diversification to the upstream candidate generation stage, with the help of Graph Convolutional Networks (GCN). Although GCN based recommendation algorithms have shown great power in modeling complex collaborative filtering effect to improve the accuracy of recommendation, how diversity changes is ignored in those advanced works. We propose to perform rebalanced neighbor discovering, category-boosted negative sampling and adversarial learning on top of GCN. We conduct extensive experiments on real-world datasets. Experimental results verify the effectiveness of our proposed method on diversification. Further ablation studies validate that our proposed method significantly alleviates the accuracy-diversity dilemma.

LGAug 10, 2021
Deep Reinforcement Learning for Demand Driven Services in Logistics and Transportation Systems: A Survey

Zefang Zong, Jingwei Wang, Tao Feng et al.

Recent technology development brings the boom of numerous new Demand-Driven Services (DDS) into urban lives, including ridesharing, on-demand delivery, express systems and warehousing. In DDS, a service loop is an elemental structure, including its service worker, the service providers and corresponding service targets. The service workers should transport either people or parcels from the providers to the target locations. Various planning tasks within DDS can thus be classified into two individual stages: 1) Dispatching, which is to form service loops from demand/supply distributions, and 2) Routing, which is to decide specific serving orders within the constructed loops. Generating high-quality strategies in both stages is important to develop DDS but faces several challenges. Meanwhile, deep reinforcement learning (DRL) has been developed rapidly in recent years. It is a powerful tool to solve these problems since DRL can learn a parametric model without relying on too many problem-based assumptions and optimize long-term effects by learning sequential decisions. In this survey, we first define DDS, then highlight common applications and important decision/control problems within. For each problem, we comprehensively introduce the existing DRL solutions. We also introduce open simulation environments for development and evaluation of DDS applications. Finally, we analyze remaining challenges and discuss further research opportunities in DRL solutions for DDS.

IRJun 27, 2021
Sequential Recommendation with Graph Neural Networks

Jianxin Chang, Chen Gao, Yu Zheng et al.

Sequential recommendation aims to leverage users' historical behaviors to predict their next interaction. Existing works have not yet addressed two main challenges in sequential recommendation. First, user behaviors in their rich historical sequences are often implicit and noisy preference signals, they cannot sufficiently reflect users' actual preferences. In addition, users' dynamic preferences often change rapidly over time, and hence it is difficult to capture user patterns in their historical sequences. In this work, we propose a graph neural network model called SURGE (short for SeqUential Recommendation with Graph neural nEtworks) to address these two issues. Specifically, SURGE integrates different types of preferences in long-term user behaviors into clusters in the graph by re-constructing loose item sequences into tight item-item interest graphs based on metric learning. This helps explicitly distinguish users' core interests, by forming dense clusters in the interest graph. Then, we perform cluster-aware and query-aware graph convolutional propagation and graph pooling on the constructed graph. It dynamically fuses and extracts users' current activated core interests from noisy user behavior sequences. We conduct extensive experiments on both public and proprietary industrial datasets. Experimental results demonstrate significant performance gains of our proposed method compared to state-of-the-art methods. Further studies on sequence length confirm that our method can model long behavioral sequences effectively and efficiently.

IRJun 14, 2021
Efficient Data-specific Model Search for Collaborative Filtering

Chen Gao, Quanming Yao, Depeng Jin et al.

Collaborative filtering (CF), as a fundamental approach for recommender systems, is usually built on the latent factor model with learnable parameters to predict users' preferences towards items. However, designing a proper CF model for a given data is not easy, since the properties of datasets are highly diverse. In this paper, motivated by the recent advances in automated machine learning (AutoML), we propose to design a data-specific CF model by AutoML techniques. The key here is a new framework that unifies state-of-the-art (SOTA) CF methods and splits them into disjoint stages of input encoding, embedding function, interaction function, and prediction function. We further develop an easy-to-use, robust, and efficient search strategy, which utilizes random search and a performance predictor for efficient searching within the above framework. In this way, we can combinatorially generalize data-specific CF models, which have not been visited in the literature, from SOTA ones. Extensive experiments on five real-world datasets demonstrate that our method can consistently outperform SOTA ones for various CF tasks. Further experiments verify the rationality of the proposed framework and the efficiency of the search strategy. The searched CF models can also provide insights for exploring more effective methods in the future

LGApr 30, 2021
Dynamic Graph Convolutional Recurrent Network for Traffic Prediction: Benchmark and Solution

Fuxian Li, Jie Feng, Huan Yan et al.

Traffic prediction is the cornerstone of an intelligent transportation system. Accurate traffic forecasting is essential for the applications of smart cities, i.e., intelligent traffic management and urban planning. Although various methods are proposed for spatio-temporal modeling, they ignore the dynamic characteristics of correlations among locations on road networks. Meanwhile, most Recurrent Neural Network (RNN) based works are not efficient enough due to their recurrent operations. Additionally, there is a severe lack of fair comparison among different methods on the same datasets. To address the above challenges, in this paper, we propose a novel traffic prediction framework, named Dynamic Graph Convolutional Recurrent Network (DGCRN). In DGCRN, hyper-networks are designed to leverage and extract dynamic characteristics from node attributes, while the parameters of dynamic filters are generated at each time step. We filter the node embeddings and then use them to generate a dynamic graph, which is integrated with a pre-defined static graph. As far as we know, we are the first to employ a generation method to model fine topology of dynamic graph at each time step. Further, to enhance efficiency and performance, we employ a training strategy for DGCRN by restricting the iteration number of decoder during forward and backward propagation. Finally, a reproducible standardized benchmark and a brand new representative traffic dataset are opened for fair comparison and further research. Extensive experiments on three datasets demonstrate that our model outperforms 15 baselines consistently.

LGSep 7, 2020
Simplify and Robustify Negative Sampling for Implicit Collaborative Filtering

Jingtao Ding, Yuhan Quan, Quanming Yao et al.

Negative sampling approaches are prevalent in implicit collaborative filtering for obtaining negative labels from massive unlabeled data. As two major concerns in negative sampling, efficiency and effectiveness are still not fully achieved by recent works that use complicate structures and overlook risk of false negative instances. In this paper, we first provide a novel understanding of negative instances by empirically observing that only a few instances are potentially important for model learning, and false negatives tend to have stable predictions over many training iterations. Above findings motivate us to simplify the model by sampling from designed memory that only stores a few important candidates and, more importantly, tackle the untouched false negative problem by favouring high-variance samples stored in memory, which achieves efficient sampling of true negatives with high-quality. Empirical results on two synthetic datasets and three real-world datasets demonstrate both robustness and superiorities of our negative sampling method.

IRJun 19, 2020
Disentangling User Interest and Conformity for Recommendation with Causal Embedding

Yu Zheng, Chen Gao, Xiang Li et al.

Recommendation models are usually trained on observational interaction data. However, observational interaction data could result from users' conformity towards popular items, which entangles users' real interest. Existing methods tracks this problem as eliminating popularity bias, e.g., by re-weighting training samples or leveraging a small fraction of unbiased data. However, the variety of user conformity is ignored by these approaches, and different causes of an interaction are bundled together as unified representations, hence robustness and interpretability are not guaranteed when underlying causes are changing. In this paper, we present DICE, a general framework that learns representations where interest and conformity are structurally disentangled, and various backbone recommendation models could be smoothly integrated. We assign users and items with separate embeddings for interest and conformity, and make each embedding capture only one cause by training with cause-specific data which is obtained according to the colliding effect of causal inference. Our proposed methodology outperforms state-of-the-art baselines with remarkable improvements on two real-world datasets on top of various backbone models. We further demonstrate that the learned embeddings successfully capture the desired causes, and show that DICE guarantees the robustness and interpretability of recommendation.

IRMay 7, 2020
Bundle Recommendation with Graph Convolutional Networks

Jianxin Chang, Chen Gao, Xiangnan He et al.

Bundle recommendation aims to recommend a bundle of items for a user to consume as a whole. Existing solutions integrate user-item interaction modeling into bundle recommendation by sharing model parameters or learning in a multi-task manner, which cannot explicitly model the affiliation between items and bundles, and fail to explore the decision-making when a user chooses bundles. In this work, we propose a graph neural network model named BGCN (short for \textit{\textBF{B}undle \textBF{G}raph \textBF{C}onvolutional \textBF{N}etwork}) for bundle recommendation. BGCN unifies user-item interaction, user-bundle interaction and bundle-item affiliation into a heterogeneous graph. With item nodes as the bridge, graph convolutional propagation between user and bundle nodes makes the learned representations capture the item level semantics. Through training based on hard-negative sampler, the user's fine-grained preferences for similar bundles are further distinguished. Empirical results on two real-world datasets demonstrate the strong performance gains of BGCN, which outperforms the state-of-the-art baselines by 10.77\% to 23.18\%.

IRMar 9, 2020
Price-aware Recommendation with Graph Convolutional Networks

Yu Zheng, Chen Gao, Xiangnan He et al.

In recent years, much research effort on recommendation has been devoted to mining user behaviors, i.e., collaborative filtering, along with the general information which describes users or items, e.g., textual attributes, categorical demographics, product images, and so on. Price, an important factor in marketing --- which determines whether a user will make the final purchase decision on an item --- surprisingly, has received relatively little scrutiny. In this work, we aim at developing an effective method to predict user purchase intention with the focus on the price factor in recommender systems. The main difficulties are two-fold: 1) the preference and sensitivity of a user on item price are unknown, which are only implicitly reflected in the items that the user has purchased, and 2) how the item price affects a user's intention depends largely on the product category, that is, the perception and affordability of a user on item price could vary significantly across categories. Towards the first difficulty, we propose to model the transitive relationship between user-to-item and item-to-price, taking the inspiration from the recently developed Graph Convolution Networks (GCN). The key idea is to propagate the influence of price on users with items as the bridge, so as to make the learned user representations be price-aware. For the second difficulty, we further integrate item categories into the propagation progress and model the possible pairwise interactions for predicting user-item interactions. We conduct extensive experiments on two real-world datasets, demonstrating the effectiveness of our GCN-based method in learning the price-aware preference of users. Further analysis reveals that modeling the price awareness is particularly useful for predicting user preference on items of unexplored categories.

SINov 3, 2019
UrbanRhythm: Revealing Urban Dynamics Hidden in Mobility Data

Sirui Song, Tong Xia, Depeng Jin et al.

Understanding urban dynamics, i.e., how the types and intensity of urban residents' activities in the city change along with time, is of urgent demand for building an efficient and livable city. Nonetheless, this is challenging due to the expanding urban population and the complicated spatial distribution of residents. In this paper, to reveal urban dynamics, we propose a novel system UrbanRhythm to reveal the urban dynamics hidden in human mobility data. UrbanRhythm addresses three questions: 1) What mobility feature should be used to present residents' high-dimensional activities in the city? 2) What are basic components of urban dynamics? 3) What are the long-term periodicity and short-term regularity of urban dynamics? In UrbanRhythm, we extract staying, leaving, arriving three attributes of mobility and use a image processing method Saak transform to calculate the mobility distribution feature. For the second question, several city states are identified by hierarchy clustering as the basic components of urban dynamics, such as sleeping states and working states. We further characterize the urban dynamics as the transform of city states along time axis. For the third question, we directly observe the long-term periodicity of urban dynamics from visualization. Then for the short-term regularity, we design a novel motif analysis method to discovery motifs as well as their hierarchy relationships. We evaluate our proposed system on two real-life datesets and validate the results according to App usage records. This study sheds light on urban dynamics hidden in human mobility and can further pave the way for more complicated mobility behavior modeling and deeper urban understanding.

IRSep 21, 2018
Sampler Design for Bayesian Personalized Ranking by Leveraging View Data

Jingtao Ding, Guanghui Yu, Xiangnan He et al.

Bayesian Personalized Ranking (BPR) is a representative pairwise learning method for optimizing recommendation models. It is widely known that the performance of BPR depends largely on the quality of negative sampler. In this paper, we make two contributions with respect to BPR. First, we find that sampling negative items from the whole space is unnecessary and may even degrade the performance. Second, focusing on the purchase feedback of E-commerce, we propose an effective sampler for BPR by leveraging the additional view data. In our proposed sampler, users' viewed interactions are considered as an intermediate feedback between those purchased and unobserved interactions. The pairwise rankings of user preference among these three types of interactions are jointly learned, and a user-oriented weighting strategy is considered during learning process, which is more effective and flexible. Compared to the vanilla BPR that applies a uniform sampler on all candidates, our view-enhanced sampler enhances BPR with a relative improvement over 37.03% and 16.40% on two real-world datasets. Our study demonstrates the importance of considering users' additional feedback when modeling their preference on different items, which avoids sampling negative items indiscriminately and inefficiently.

IRSep 21, 2018
Learning to Recommend with Multiple Cascading Behaviors

Chen Gao, Xiangnan He, Dahua Gan et al.

Most existing recommender systems leverage user behavior data of one type only, such as the purchase behavior in E-commerce that is directly related to the business KPI (Key Performance Indicator) of conversion rate. Besides the key behavioral data, we argue that other forms of user behaviors also provide valuable signal, such as views, clicks, adding a product to shop carts and so on. They should be taken into account properly to provide quality recommendation for users. In this work, we contribute a new solution named NMTR (short for Neural Multi-Task Recommendation) for learning recommender systems from user multi-behavior data. We develop a neural network model to capture the complicated and multi-type interactions between users and items. In particular, our model accounts for the cascading relationship among different types of behaviors (e.g., a user must click on a product before purchasing it). To fully exploit the signal in the data of multiple types of behaviors, we perform a joint optimization based on the multi-task learning framework, where the optimization on a behavior is treated as a task. Extensive experiments on two real-world datasets demonstrate that NMTR significantly outperforms state-of-the-art recommender systems that are designed to learn from both single-behavior data and multi-behavior data. Further analysis shows that modeling multiple behaviors is particularly useful for providing recommendation for sparse users that have very few interactions.

CYFeb 21, 2017
Trajectory Recovery From Ash: User Privacy Is NOT Preserved in Aggregated Mobility Data

Fengli Xu, Zhen Tu, Yong Li et al.

Human mobility data has been ubiquitously collected through cellular networks and mobile applications, and publicly released for academic research and commercial purposes for the last decade. Since releasing individual's mobility records usually gives rise to privacy issues, datasets owners tend to only publish aggregated mobility data, such as the number of users covered by a cellular tower at a specific timestamp, which is believed to be sufficient for preserving users' privacy. However, in this paper, we argue and prove that even publishing aggregated mobility data could lead to privacy breach in individuals' trajectories. We develop an attack system that is able to exploit the uniqueness and regularity of human mobility to recover individual's trajectories from the aggregated mobility data without any prior knowledge. By conducting experiments on two real-world datasets collected from both mobile application and cellular network, we reveal that the attack system is able to recover users' trajectories with accuracy about 73%~91% at the scale of tens of thousands to hundreds of thousands users, which indicates severe privacy leakage in such datasets. Through the investigation on aggregated mobility data, our work recognizes a novel privacy problem in publishing statistic data, which appeals for immediate attentions from both academy and industry.