Zesong Jiang, Yuqi Sun, Qing Zhong et al.
Designing optimal Coarse-Grained Reconfigurable Arrays (CGRAs) requires navigating a vast, interdependent hardware/software (HW/SW) design space, a process bottlenecked by costly manual iteration. We present MACO, an open-source, multi-agent LLM framework that automates CGRA HW/SW co-design. MACO decomposes the design loop into four collaborative stages (HW/SW Co-design, Error Correction, Best-Design Selection, and Evaluation & Feedback) that iteratively optimize power, performance, and area (PPA). To accelerate convergence and traverse the design space efficiently, MACO introduces an exponentially decaying exploration strategy, EDA-guided LLM self-learning, and robust rule-based error correction. Evaluated against state-of-the-art baselines, MACO reduces power consumption by 25.9%, improves performance by 20.0%, and accelerates the search process by 5x. Finally, we validate MACO's physical design through a complete 7nm ASIC design flow.
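
As an illustration of what an exponentially decaying exploration strategy can look like, the following minimal Python sketch computes an exploration probability that shrinks with each design iteration. The abstract does not give MACO's actual schedule, so the functional form `initial_rate * exp(-decay * iteration)`, the parameter values, and the names `exploration_rate` and `choose_mode` are all assumptions made for illustration only.

```python
import math
import random


def exploration_rate(iteration, initial_rate=1.0, decay=0.5, floor=0.0):
    """Hypothetical schedule: initial_rate * exp(-decay * iteration), clamped at floor."""
    return max(floor, initial_rate * math.exp(-decay * iteration))


def choose_mode(iteration, rng, **schedule):
    """Return 'explore' with probability exploration_rate(iteration), else 'exploit'."""
    if rng.random() < exploration_rate(iteration, **schedule):
        return "explore"  # e.g. propose a design far from the current best
    return "exploit"      # e.g. refine the current best design


# Assumed usage: early iterations mostly explore, later ones mostly exploit.
rng = random.Random(0)
rates = [exploration_rate(t) for t in range(6)]
modes = [choose_mode(t, rng) for t in range(6)]
```

Under these assumptions, the first iteration always explores (the rate is 1.0), and the probability of exploring falls by a constant factor each step, so the search spends progressively more of its budget refining promising designs.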