Social and Scene-Aware Trajectory Prediction in Crowded Spaces
This work addresses trajectory prediction for socially compliant robots or self-driving cars, but the contribution is incremental: it builds on existing LSTM methods and adds pooling mechanisms.
The paper tackles the prediction of human trajectories in crowded spaces by incorporating social interactions, previously crossed areas, and scene semantics into an LSTM-based model, and reports improved performance over baseline LSTM models in unstructured environments.
Mimicking the human ability to forecast future positions or interpret complex interactions in urban scenarios, such as streets, shopping malls or squares, is essential for developing socially compliant robots or self-driving cars. Autonomous systems can benefit from anticipating human motion, both to avoid collisions and to behave naturally alongside people. To predict plausible trajectories, we construct an LSTM (long short-term memory)-based model that considers three fundamental factors: interactions among people, past observations in the form of previously crossed areas, and the semantics of the surrounding space. Our model uses several pooling mechanisms to combine these elements into multiple tensors, namely social, navigation and semantic tensors. The network is tested in unstructured environments where complex paths emerge from both internal motivations (intentions) and external ones (other people, inaccessible areas). As we demonstrate, modeling paths without awareness of social interactions or context information is insufficient to correctly predict future positions. Experimental results corroborate the effectiveness of the proposed framework in comparison with LSTM-based models for human path prediction.
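To make the idea of a social tensor concrete, the following is a minimal sketch of grid-based pooling of neighbours' hidden states. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not specify how the tensors are built, so the function name `social_tensor`, the square neighbourhood, the grid resolution and the sum-pooling rule are all hypothetical choices.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def social_tensor(
    positions: Sequence[Point],
    hidden: Sequence[Sequence[float]],
    index: int,
    grid_size: int = 4,          # assumed grid resolution
    neighborhood: float = 4.0,   # assumed side length of the pooled square
) -> List[List[List[float]]]:
    """Sum neighbours' hidden states into a grid centred on person `index`.

    Returns a nested list of shape grid_size x grid_size x hidden_dim.
    """
    hidden_dim = len(hidden[index])
    cell = neighborhood / grid_size
    half = neighborhood / 2.0
    tensor = [
        [[0.0] * hidden_dim for _ in range(grid_size)]
        for _ in range(grid_size)
    ]
    cx, cy = positions[index]
    for j, (x, y) in enumerate(positions):
        if j == index:
            continue  # a person is not their own neighbour
        dx, dy = x - cx, y - cy
        # Ignore neighbours outside the square neighbourhood.
        if not (-half <= dx < half and -half <= dy < half):
            continue
        # Map the relative offset to a grid cell (clamped for float safety).
        col = min(int((dx + half) // cell), grid_size - 1)
        row = min(int((dy + half) // cell), grid_size - 1)
        for k in range(hidden_dim):
            tensor[row][col][k] += hidden[j][k]
    return tensor


# Four people with 2-dimensional hidden states; the last one is far away.
positions = [(0.0, 0.0), (0.5, 0.5), (-1.5, 0.2), (10.0, 10.0)]
hidden = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0], [5.0, 5.0]]
pooled = social_tensor(positions, hidden, index=0)
```

In a full model, a tensor like `pooled` would typically be flattened, embedded, and fed to the person's LSTM cell together with their current position at each time step; that wiring is likewise an assumption here.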