LG, May 11, 2024

RoTHP: Rotary Position Embedding-based Transformer Hawkes Process

arXiv:2405.06985v1, 4 citations, h-index: 2
Originality: Incremental advance
AI Analysis

This work makes an incremental improvement to neural temporal point processes, which model event sequences in domains such as finance and social networks.

The paper tackles the sequence prediction issue and sensitivity to temporal changes in Transformer Hawkes Processes by proposing RoTHP, which incorporates rotary position embeddings and relative time embeddings, achieving improved generalization in tasks with timestamp translations and sequence prediction.

Temporal Point Processes (TPPs), especially the Hawkes process, are commonly used for modeling asynchronous event sequence data such as financial transactions and user behaviors in social networks. Owing to the strong fitting ability of neural networks, various neural Temporal Point Processes have been proposed, among which self-attention-based Neural Hawkes Processes such as the Transformer Hawkes Process (THP) achieve distinct performance improvements. Although THP has been studied increasingly, it still suffers from the sequence prediction issue, i.e., training on historical sequences and inferring the future, which is a prevalent paradigm in realistic sequence analysis tasks. Moreover, conventional THP and its variants simply adopt the original sinusoidal embedding from Transformers, which our empirical study shows to be sensitive to temporal changes or noise in sequence data. To address these problems, we propose a new Rotary Position Embedding-based THP (RoTHP) architecture. Notably, we show theoretically that the relative time embeddings, when coupled with the Hawkes process, give RoTHP a translation invariance property and sequence prediction flexibility. Furthermore, we demonstrate empirically that RoTHP generalizes better in sequence data scenarios with timestamp translations and in sequence prediction tasks.
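
The translation invariance mentioned in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the function names `rotary_embed` and `attention_score`, the frequency base of 10000 (borrowed from standard rotary position embeddings), and the choice to use raw event timestamps as rotation positions are all assumptions made for illustration. The sketch rotates each pair of query/key dimensions by an angle proportional to the event time, so the resulting dot product depends only on the time difference between the two events.

```python
import math

def rotary_embed(vec, t, base=10000.0):
    """Rotate consecutive pairs of `vec` by angles t * theta_i (rotary-style).

    `base` is an assumed frequency base, as in standard rotary embeddings.
    """
    d = len(vec)
    assert d % 2 == 0, "dimension must be even"
    out = []
    for i in range(d // 2):
        theta = base ** (-2.0 * i / d)   # per-pair rotation frequency
        angle = t * theta                # the timestamp acts as the position
        c, s = math.cos(angle), math.sin(angle)
        x, y = vec[2 * i], vec[2 * i + 1]
        out.extend([x * c - y * s, x * s + y * c])
    return out

def attention_score(q, k, t_q, t_k):
    """Unnormalized attention score between rotated query and key."""
    rq = rotary_embed(q, t_q)
    rk = rotary_embed(k, t_k)
    return sum(a * b for a, b in zip(rq, rk))

# Hypothetical query/key vectors and event times, chosen only for the demo.
q = [0.3, -1.2, 0.7, 0.5]
k = [1.1, 0.4, -0.6, 0.9]
t_q, t_k = 5.0, 2.5
shift = 100.0  # translate every timestamp by the same constant

score = attention_score(q, k, t_q, t_k)
shifted_score = attention_score(q, k, t_q + shift, t_k + shift)

# The two scores agree up to floating-point error, because the rotation of
# the query and the rotation of the key cancel except for the gap t_q - t_k.
print(abs(score - shifted_score) < 1e-9)
```

Changing the gap between the two timestamps, rather than shifting both, does change the score, which is what lets the attention weights carry relative timing information.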

