CL · Feb 8, 2024

TimeArena: Shaping Efficient Multitasking Language Agents in a Time-Aware Simulation

arXiv:2402.05733v1 · 35 citations · h-index: 26 · ACL
Originality: Incremental advance
AI Analysis

The paper addresses the need for better temporal awareness in language agents for real-life planning scenarios. The advance is incremental because it builds on existing simulation frameworks.

The paper tackles the inadequate treatment of temporal dynamics in textual simulations for language agents by introducing TimeArena, a time-aware environment with 30 real-world tasks, and finds that even advanced models such as GPT-4 lag behind humans in multitasking efficiency.

Despite remarkable advancements in emulating human-like behavior through Large Language Models (LLMs), current textual simulations do not adequately address the notion of time. To this end, we introduce TimeArena, a novel textual simulated environment that incorporates complex temporal dynamics and constraints that better reflect real-life planning scenarios. In TimeArena, agents are asked to complete multiple tasks as soon as possible, allowing for parallel processing to save time. We implement the dependency between actions, the time duration for each action, and the occupancy of the agent and the objects in the environment. TimeArena is grounded in 30 real-world tasks in cooking, household activities, and laboratory work. We conduct extensive experiments with various state-of-the-art LLMs using TimeArena. Our findings reveal that even the most powerful models, e.g., GPT-4, still lag behind humans in effective multitasking, underscoring the need for enhanced temporal awareness in the development of language agents.
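The three mechanisms named in the abstract (action dependencies, action durations, and occupancy of the agent and objects) can be illustrated with a toy time-stepped simulation. This is a minimal sketch under stated assumptions, not TimeArena's actual implementation: the `Action` class, the `simulate` function, the tea-making task, and the greedy one-action-per-step policy are all hypothetical and chosen only to show why parallel processing saves time.

```python
from dataclasses import dataclass, field, replace


@dataclass
class Action:
    # Hypothetical action model; field names are illustrative assumptions.
    name: str
    duration: int            # time steps the action takes to finish
    occupies_agent: bool     # True if the agent is busy for the whole duration
    obj: str                 # object that is occupied while the action runs
    deps: list = field(default_factory=list)  # names of actions that must finish first


def simulate(actions):
    """Greedy simulation: the agent starts at most one ready action per time step."""
    pending = list(actions)
    running = []             # (action, end_time) pairs
    done = set()
    t = 0
    while pending or running:
        # Retire every action whose end time has been reached.
        done.update(a.name for a, end in running if end <= t)
        running = [(a, end) for a, end in running if end > t]
        if not pending and not running:
            break
        agent_busy = any(a.occupies_agent for a, _ in running)
        busy_objs = {a.obj for a, _ in running}
        if not agent_busy:
            for a in pending:
                ready = all(d in done for d in a.deps)
                if ready and a.obj not in busy_objs:
                    running.append((a, t + a.duration))
                    pending.remove(a)
                    break    # one new action per time step
        if pending and not running:
            raise ValueError("no action can ever start (unsatisfiable dependency)")
        t += 1
    return t


# Toy task: boiling water does not occupy the agent, so the cup can be washed meanwhile.
tasks = [
    Action("boil_water", 5, occupies_agent=False, obj="kettle"),
    Action("wash_cup", 2, occupies_agent=True, obj="cup"),
    Action("brew_tea", 3, occupies_agent=True, obj="cup", deps=["boil_water", "wash_cup"]),
]

parallel_time = simulate(tasks)
# Baseline: pretend every action occupies the agent, forcing strictly sequential execution.
sequential_time = simulate([replace(a, occupies_agent=True) for a in tasks])

print(parallel_time, sequential_time)  # 8 10
```

In this toy setting, overlapping the non-occupying action with other work finishes in 8 steps instead of 10. The gap between such an efficient schedule and what an agent actually achieves is the kind of multitasking efficiency the abstract says current models fall short on.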
