AIJan 7, 2021

Reinforced Imitative Graph Representation Learning for Mobile User Profiling: An Adversarial Training Perspective

Dongjie Wang, Pengyang Wang, Kunpeng Liu, Yuanchun Zhou, Charles Hughes, Yanjie Fu

arXiv:2101.02634v118.331 citations

Originality Incremental advance

AI Analysis

This work aims to improve the accuracy and stability of mobile user profiling for human mobility modeling, which is an incremental improvement for researchers and practitioners in urban computing and location-based services.

This paper addresses mobile user profiling, a key component in human mobility modeling, by proposing an imitation-based framework using reinforcement learning. The framework trains an agent to imitate user mobility patterns to derive optimal user profiles, addressing challenges of unstable profiles due to exploration-exploitation trade-off and temporal integration of user characteristics.

In this paper, we study the problem of mobile user profiling, which is a critical component for quantifying users' characteristics in the human mobility modeling pipeline. Human mobility is a sequential decision-making process dependent on the users' dynamic interests. With accurate user profiles, the predictive model can perfectly reproduce users' mobility trajectories. In the reverse direction, once the predictive model can imitate users' mobility patterns, the learned user profiles are also optimal. Such intuition motivates us to propose an imitation-based mobile user profiling framework by exploiting reinforcement learning, in which the agent is trained to precisely imitate users' mobility patterns for optimal user profiles. Specifically, the proposed framework includes two modules: (1) representation module, which produces state combining user profiles and spatio-temporal context in real-time; (2) imitation module, where Deep Q-network (DQN) imitates the user behavior (action) based on the state that is produced by the representation module. However, there are two challenges in running the framework effectively. First, epsilon-greedy strategy in DQN makes use of the exploration-exploitation trade-off by randomly pick actions with the epsilon probability. Such randomness feeds back to the representation module, causing the learned user profiles unstable. To solve the problem, we propose an adversarial training strategy to guarantee the robustness of the representation module. Second, the representation module updates users' profiles in an incremental manner, requiring integrating the temporal effects of user profiles. Inspired by Long-short Term Memory (LSTM), we introduce a gated mechanism to incorporate new and old user characteristics into the user profile.

View on arXiv PDF

Similar