Encoding Agent Trajectories as Representations with Sequence Transformers
This work addresses the analysis of agent trajectories in spatiotemporal data, with applications in fields such as urban planning and logistics. Its contribution is incremental: it adapts existing Transformer methods to a new domain.
The paper tackles the challenge of representing high-dimensional spatiotemporal trajectories by proposing a Transformer-based model (STARE) that learns encodings useful for downstream tasks such as classification and similarity analysis, and it demonstrates the model's effectiveness on synthetic and real datasets.
Spatiotemporal data shares many challenges with natural language text, including the ordering of locations (words) in a sequence, long-range dependencies between locations, and locations that carry multiple meanings. In this work, we propose a novel model for representing high-dimensional spatiotemporal trajectories as sequences of discrete locations and encoding them with a Transformer-based neural network architecture. Similar to language models, our Sequence Transformer for Agent Representation Encodings (STARE) model can learn representations and structure in trajectory data through both supervised tasks (e.g., classification) and self-supervised tasks (e.g., masked modelling). We present experimental results on various synthetic and real trajectory datasets and show that our proposed model learns meaningful encodings that are useful for many downstream tasks, including discriminating between labels and indicating similarity between locations. Using these encodings, we also learn relationships between the agents and locations present in spatiotemporal data.
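
To make the analogy with language modelling concrete, the following is a minimal sketch of how a trajectory might be turned into a sequence of discrete location tokens and prepared for a masked-modelling objective. It is not the STARE implementation: the uniform grid discretization, the `cell_size` parameter, the reserved `PAD`/`MASK` ids, the `IGNORE` target value, and all function names are assumptions made for illustration, and the masking here is a simplified variant that always substitutes the mask token.

```python
import math
import random

# Hypothetical reserved token ids (assumptions, not taken from the paper).
PAD, MASK = 0, 1
# Assumed marker for positions that should not contribute to the loss.
IGNORE = -100


def discretize(points, cell_size=1.0):
    """Map (x, y) points to integer grid cells (assumed uniform grid)."""
    return [(math.floor(x / cell_size), math.floor(y / cell_size))
            for x, y in points]


def build_vocab(trajectories):
    """Assign each distinct cell an integer id, starting after reserved ids."""
    vocab = {}
    for traj in trajectories:
        for cell in traj:
            if cell not in vocab:
                vocab[cell] = len(vocab) + 2  # ids 0 and 1 are reserved
    return vocab


def encode(traj, vocab):
    """Turn a list of cells into a list of token ids."""
    return [vocab[cell] for cell in traj]


def mask_tokens(token_ids, mask_prob=0.15, rng=None):
    """Replace a random subset of tokens with MASK.

    Returns (inputs, targets): targets hold the original id at masked
    positions and IGNORE elsewhere.
    """
    rng = rng or random.Random(0)
    inputs, targets = [], []
    for tok in token_ids:
        if rng.random() < mask_prob:
            inputs.append(MASK)
            targets.append(tok)
        else:
            inputs.append(tok)
            targets.append(IGNORE)
    return inputs, targets


# Example: one short trajectory of three points on a unit grid.
cells = discretize([(0.5, 0.5), (0.7, 0.2), (1.5, 0.5)])
vocab = build_vocab([cells])
tokens = encode(cells, vocab)
inputs, targets = mask_tokens(tokens, mask_prob=0.5)
```

In a setup like this, `inputs` would be fed to a Transformer encoder and the model trained to predict the original token at each masked position, while a supervised task such as classification would instead attach a label to the whole sequence.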