AI · Dec 3, 2025

PARC: An Autonomous Self-Reflective Coding Agent for Robust Execution of Long-Horizon Tasks

arXiv:2512.03549v1 · 2 citations · h-index: 8
Originality: Highly original
AI Analysis

This work addresses the challenge of enabling AI systems to carry out large-scale scientific and analytical work independently, offering a novel approach to known bottlenecks in autonomous task execution.

The paper tackles the autonomous execution of long-horizon computational tasks by introducing PARC, a coding agent built on a hierarchical multi-agent architecture with self-assessment and self-feedback, which lets it detect and correct errors without human intervention. According to the paper, PARC autonomously reproduces key results from materials science studies, coordinating dozens of parallel simulation tasks that each require roughly 43 hours of computation, and produces solutions competitive with human-engineered baselines on Kaggle-based data analysis tasks.

We introduce PARC, a coding agent for the autonomous and robust execution of long-horizon computational tasks. PARC is built on a hierarchical multi-agent architecture incorporating task planning, execution, and a mechanism that evaluates its own actions and their outcomes from an independent context and provides feedback, namely self-assessment and self-feedback. This design enables PARC to detect and correct high-level strategic errors and sustain progress without human intervention. We evaluate PARC across computational science and data science tasks. In materials science, it autonomously reproduces key results from studies on lithium-ion conduction and alloy segregation. In particular, it coordinates dozens of parallel simulation tasks, each requiring roughly 43 hours of computation, managing orchestration, monitoring, and error correction end-to-end. In Kaggle-based experiments, starting from minimal natural-language instructions, PARC conducts data analysis and implements search strategies, producing solutions competitive with human-engineered baselines. These results highlight the potential of integrating a hierarchical multi-agent system with self-assessment and self-feedback to enable AI systems capable of independent, large-scale scientific and analytical work.
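The plan, execute, assess, and feed back cycle described in the abstract can be illustrated with a minimal Python sketch. This is not PARC's implementation: the names `run_with_self_feedback`, `planner`, `worker`, and `assessor`, and the retry limit, are all hypothetical, and plain functions stand in for what would be language-model agents. The one idea it tries to capture is that the assessor receives only the task and its output (an independent context) and returns feedback that the worker sees on its next attempt.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Attempt:
    task: str
    output: str
    feedback: Optional[str] = None  # None means the assessor accepted the output


@dataclass
class RunLog:
    attempts: List[Attempt] = field(default_factory=list)


def run_with_self_feedback(
    goal: str,
    planner: Callable[[str], List[str]],
    worker: Callable[[str, Optional[str]], str],
    assessor: Callable[[str, str], Optional[str]],
    max_retries: int = 2,  # hypothetical limit, not taken from the paper
) -> RunLog:
    """Plan subtasks, execute each one, and retry when the assessor objects."""
    log = RunLog()
    for task in planner(goal):
        feedback: Optional[str] = None
        for _ in range(max_retries + 1):
            output = worker(task, feedback)
            # The assessor is given only the task and the output,
            # never the worker's internal state.
            feedback = assessor(task, output)
            log.attempts.append(Attempt(task, output, feedback))
            if feedback is None:
                break
    return log


# Toy stand-ins (assumptions for illustration only).
def toy_planner(goal: str) -> List[str]:
    return [f"{goal}: step {i}" for i in (1, 2)]


def toy_worker(task: str, feedback: Optional[str]) -> str:
    return "fixed" if feedback else "draft"


def toy_assessor(task: str, output: str) -> Optional[str]:
    return None if output == "fixed" else "revise the draft"


log = run_with_self_feedback("demo", toy_planner, toy_worker, toy_assessor)
for attempt in log.attempts:
    print(attempt.task, "|", attempt.output, "|", attempt.feedback)
```

In this toy run each subtask is rejected once and accepted on the retry. A real system would also need the orchestration and monitoring of long-running jobs that the abstract mentions, which this sketch leaves out.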

