AR LG Mar 24

Energy Efficient Software Hardware CoDesign for Machine Learning: From TinyML to Large Language Models

arXiv:2603.23668 | 60.51 citations | h-index: 3
AI Analysis

The paper addresses energy efficiency as a critical constraint for sustainable AI across diverse platforms, though as a review paper its contribution is incremental.

This work reviews energy-efficient software-hardware co-design methods for machine learning systems, from TinyML to large language models, highlighting common design levers, trade-offs, and gaps such as limited cross-platform generalization and inconsistent benchmarking.

The rapid deployment of machine learning across platforms from milliwatt-class TinyML devices to large language models has made energy efficiency a primary constraint for sustainable AI. Across these scales, performance and energy are increasingly limited by data movement and memory-system behavior rather than by arithmetic throughput alone. This work reviews energy-efficient software-hardware co-design methods spanning edge inference and training to datacenter-scale LLM serving, covering accelerator architectures (e.g., ASIC/FPGA dataflows, processing-/compute-in-memory designs) and system-level techniques (e.g., partitioning, quantization, scheduling, and runtime adaptation). We distill common design levers and trade-offs, and highlight recurring gaps including limited cross-platform generalization, large and costly co-design search spaces, and inconsistent benchmarking across workloads and deployment settings. Finally, we outline a hierarchical decomposition perspective that maps optimization strategies to computational roles and supports incremental adaptation, offering practical guidance for building energy- and carbon-aware ML systems.
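The abstract's claim that energy is limited by data movement rather than arithmetic can be made concrete with a back-of-the-envelope model. The sketch below is not taken from the paper: the names `EnergyCosts`, `dense_layer_traffic`, and `layer_energy_pj` are hypothetical, the per-operation costs are placeholder parameters rather than measurements, and the traffic estimate assumes a batch-size-1 dense layer with no on-chip reuse.

```python
from dataclasses import dataclass


@dataclass
class EnergyCosts:
    """Per-operation energy in picojoules (illustrative placeholders, not measured)."""
    pj_per_mac: float
    pj_per_byte_moved: float


def dense_layer_traffic(in_features: int, out_features: int, bytes_per_value: int):
    """Return (MACs, bytes moved) for one matrix-vector product.

    Assumes every weight, input, and output crosses the memory interface
    exactly once, i.e. no caching or reuse.
    """
    macs = in_features * out_features
    values = in_features * out_features + in_features + out_features
    return macs, values * bytes_per_value


def layer_energy_pj(macs: int, bytes_moved: int, costs: EnergyCosts) -> dict:
    """Split a layer's estimated energy into compute and data-movement terms."""
    compute = macs * costs.pj_per_mac
    movement = bytes_moved * costs.pj_per_byte_moved
    total = compute + movement
    share = movement / total if total else 0.0
    return {"compute_pj": compute, "movement_pj": movement, "movement_share": share}


if __name__ == "__main__":
    # Placeholder costs: moving a byte is assumed 10x as expensive as one MAC.
    costs = EnergyCosts(pj_per_mac=1.0, pj_per_byte_moved=10.0)
    for label, width in (("fp32", 4), ("int8", 1)):
        macs, moved = dense_layer_traffic(1024, 1024, width)
        print(label, layer_energy_pj(macs, moved, costs))
```

Under these assumptions, narrowing values from 4 bytes to 1 byte cuts the data-movement term by a factor of four while the MAC count is unchanged, which is one reason quantization appears among the system-level levers listed above. The sketch deliberately holds the per-MAC cost fixed across precisions; in practice lower-precision arithmetic is usually cheaper as well.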

