HC · Apr 6

Compass vs Railway Tracks: Unpacking User Mental Models for Communicating Long-Horizon Work to Humans vs. AI

Savvas Petridis, Michael Xieyang Liu, Alexander J. Fiannaca, Carrie J. Cai, Michael Terry

arXiv:2601.11848 · 81.71 citations · h-index: 14

Predicted impact: top 3% in HC · last 90 days
Originality: Incremental advance

AI Analysis

This paper addresses the challenge of human-AI collaboration on ambiguous, long-horizon tasks and highlights the need for more flexible AI systems. The advance is incremental, as it builds on existing prompting research.

The study investigated how professionals communicate specifications for long-horizon tasks to humans versus AI, finding that users provide high-level intent to humans but rigid, exhaustive instructions to AI due to perceived limitations in AI's ability to infer intent and make judgments.

As AI systems grow increasingly capable of operating for hours or days at a time, users' prompts are transforming into elaborate specifications for the AI to autonomously work on. While prompting for bounded, single-turn tasks has been extensively studied, less is known about how people communicate specifications for long-horizon tasks. We conducted a qualitative study in which 16 professionals drafted specifications for both a human colleague and an AI, revealing a core divergence: participants treated human delegation as a "compass", offering high-level intent to encourage flexible exploration. In contrast, communication with AI resembled painstakingly laying down "railway tracks": rigid, exhaustive instructions to minimize ambiguity and deviation. This reflected a perception that current AI struggles to infer intent, prioritize, and make judgments on its own. When envisioning an ideal AI collaborator, users desired a hybrid: a collaborator blending AI's efficiency and large context window with the critical thinking and agency of a human colleague. We discuss design implications for future AI systems, proposing that they align on outcomes through generated rough drafts, verify feasibility via end-to-end "test runs," and monitor execution through intelligent check-ins -- ultimately transforming AI from a passive instruction-follower into a reliable collaborator for ambiguous, long-horizon tasks.
