CV · Nov 24, 2025

Human-Centric Open-Future Task Discovery: Formulation, Benchmark, and Scalable Tree-Based Search

Zijian Song, Xiaoxin Lin, Tao Pu, Zhenlong Yuan, Guangrun Wang, Liang Lin

arXiv:2511.18929v3 · 2 citations

Originality: Incremental advance

AI Analysis

This work addresses a key challenge in robotics and embodied AI: advancing LMMs so that they can assist humans in dynamic, open-future environments. It is a domain-specific, incremental contribution.

The paper tackles the problem of enabling Large Multimodal Models (LMMs) to discover tasks that assist humans in open-future scenarios, proposing the HOTD-Bench benchmark and CMAST framework, which achieves the best performance on the benchmark and improves existing LMMs.

Recent progress in robotics and embodied AI is largely driven by Large Multimodal Models (LMMs). However, a key challenge remains underexplored: how can we advance LMMs to discover tasks that assist humans in open-future scenarios, where human intentions are highly concurrent and dynamic? In this work, we formalize the problem of Human-centric Open-future Task Discovery (HOTD), focusing particularly on identifying tasks that reduce human effort across plausible futures. To facilitate this study, we propose HOTD-Bench, which features over 2K real-world videos, a semi-automated annotation pipeline, and a simulation-based protocol tailored for open-set future evaluation. Additionally, we propose the Collaborative Multi-Agent Search Tree (CMAST) framework, which decomposes complex reasoning through a multi-agent system and structures the reasoning process through a scalable search tree module. In our experiments, CMAST achieves the best performance on HOTD-Bench, significantly surpassing existing LMMs. It also integrates well with existing LMMs, consistently improving their performance.

