CV | AI | RO | Sep 12, 2025

OnlineHOI: Towards Online Human-Object Interaction Generation and Perception

arXiv:2509.12250v1 | 1 citation | h-index: 4 | MM
Originality: Incremental advance
AI Analysis

This work addresses a critical gap for robotics, AR/VR, and behavior understanding by enabling real-time HOI processing, though the advance is incremental: it adapts existing methods to a new setting.

The paper tackles Human-Object Interaction (HOI) modeling in real-time online settings, where current offline methods perform poorly. It proposes OnlineHOI, a Mamba-based framework with a memory mechanism, and reports state-of-the-art results on Core4D and OAKINK2 online generation and on HOI4D online perception.

The perception and generation of Human-Object Interaction (HOI) are crucial for fields such as robotics, AR/VR, and human behavior understanding. However, current approaches model this task in an offline setting, where information at each time step can be drawn from the entire interaction sequence. In contrast, in real-world scenarios, the information available at each time step comes only from the current moment and historical data, i.e., an online setting. We find that offline methods perform poorly in an online context. Based on this observation, we propose two new tasks: Online HOI Generation and Online HOI Perception. To address these tasks, we introduce the OnlineHOI framework, a network architecture based on the Mamba framework that employs a memory mechanism. By leveraging Mamba's powerful modeling capabilities for streaming data and the memory mechanism's efficient integration of historical information, we achieve state-of-the-art results on the Core4D and OAKINK2 online generation tasks, as well as the online HOI4D perception task.
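
The offline/online distinction drawn in the abstract can be illustrated with a toy sketch. This is not the OnlineHOI architecture: the function names, the scalar "frames", and the exponential-moving-average memory are all hypothetical stand-ins, chosen only to show what it means for an output at step t to depend on the current and past inputs alone.

```python
from typing import Callable, List, Sequence


def offline_features(frames: Sequence[float]) -> List[float]:
    """Offline setting: every output may depend on the whole sequence."""
    mean = sum(frames) / len(frames)  # uses past AND future frames
    return [x - mean for x in frames]


def online_features(frames: Sequence[float], decay: float = 0.9) -> List[float]:
    """Online setting: the output at step t depends only on frames[0..t]."""
    memory = 0.0  # fixed-size summary of the history (hypothetical stand-in)
    outputs = []
    for t, x in enumerate(frames):
        # Update the memory using the current frame and the previous memory only.
        memory = x if t == 0 else decay * memory + (1.0 - decay) * x
        outputs.append(x - memory)
    return outputs


def is_causal(fn: Callable[[Sequence[float]], List[float]],
              frames: Sequence[float], t: int) -> bool:
    """Check that perturbing frames after step t leaves outputs up to t unchanged."""
    perturbed = list(frames)
    for i in range(t + 1, len(perturbed)):
        perturbed[i] += 100.0
    return fn(frames)[: t + 1] == fn(perturbed)[: t + 1]
```

Calling `is_causal(online_features, frames, t)` and `is_causal(offline_features, frames, t)` on the same sequence shows the difference: only the online variant is unaffected by changes to future frames, which is the constraint the two proposed tasks impose.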
