LG MA Jul 22, 2025

Multi-Agent Reinforcement Learning for Sample-Efficient Deep Neural Network Mapping

arXiv:2507.16249v1
h-index: 46
Originality: Incremental advance
AI Analysis

This paper addresses the challenge of optimizing hardware mapping for deep neural networks, which is critical for high-performance accelerator design, by providing a more sample-efficient search method. The advance appears incremental, since it builds on existing RL approaches.

The paper tackles the sample inefficiency of reinforcement learning for mapping deep neural networks to hardware by proposing a decentralized multi-agent reinforcement learning framework with agent clustering. Reported results are a 30-300x improvement in sample efficiency over standard single-agent RL, with up to 32.61x latency reduction and 16.45x energy-delay product reduction under iso-sample conditions.

Mapping deep neural networks (DNNs) to hardware is critical for optimizing latency, energy consumption, and resource utilization, making it a cornerstone of high-performance accelerator design. Due to the vast and complex mapping space, reinforcement learning (RL) has emerged as a promising approach, but its effectiveness is often limited by sample inefficiency. We present a decentralized multi-agent reinforcement learning (MARL) framework designed to overcome this challenge. By distributing the search across multiple agents, our framework accelerates exploration. To avoid inefficiencies from training multiple agents in parallel, we introduce an agent clustering algorithm that assigns similar mapping parameters to the same agents based on correlation analysis. This enables a decentralized, parallelized learning process that significantly improves sample efficiency. Experimental results show our MARL approach improves sample efficiency by 30-300x over standard single-agent RL, achieving up to 32.61x latency reduction and 16.45x energy-delay product (EDP) reduction under iso-sample conditions.
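The abstract does not spell out the clustering procedure, so the following is only a minimal sketch of one way correlation-based grouping of mapping parameters could look. It assumes a hypothetical matrix `samples` whose rows are sampled mappings and whose columns are mapping parameters, uses absolute Pearson correlation as the similarity measure, and merges groups greedily with average linkage until one group remains per agent. The function name `cluster_parameters`, the choice of similarity, and the linkage rule are all illustrative assumptions, not the paper's algorithm.

```python
import numpy as np


def cluster_parameters(samples, num_agents):
    """Group mapping parameters (columns of `samples`) into `num_agents` clusters.

    Hypothetical sketch: similarity is the absolute Pearson correlation between
    parameter columns, and clusters are merged greedily (average linkage).
    Columns with zero variance would yield NaN correlations and are assumed
    to have been filtered out beforehand.
    """
    corr = np.abs(np.corrcoef(samples, rowvar=False))
    # Start with one cluster per parameter.
    clusters = [[i] for i in range(samples.shape[1])]

    while len(clusters) > num_agents:
        best_pair, best_score = None, -1.0
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Average absolute correlation between the two groups.
                score = corr[np.ix_(clusters[a], clusters[b])].mean()
                if score > best_score:
                    best_pair, best_score = (a, b), score
        a, b = best_pair
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]

    # Each returned list holds the parameter indices one agent would control.
    return [sorted(c) for c in clusters]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    base_a = rng.normal(size=200)
    base_b = rng.normal(size=200)
    # Columns 0/1 track base_a, columns 2/3 track base_b (synthetic data).
    demo = np.column_stack([
        base_a,
        base_a + 0.1 * rng.normal(size=200),
        base_b,
        base_b + 0.1 * rng.normal(size=200),
    ])
    print(cluster_parameters(demo, num_agents=2))
```

In this sketch each resulting cluster would be handed to one agent, so strongly coupled parameters are explored jointly while weakly coupled ones are searched in parallel by separate agents.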

