CV · Apr 11, 2025

Muon-Accelerated Attention Distillation for Real-Time Edge Synthesis via Optimized Latent Diffusion

arXiv:2504.08451v1 · 2 citations · h-index: 1
Originality: Highly original
AI Analysis

This work addresses the computational and memory constraints that limit real-time visual synthesis on edge devices, and the analysis rates its contribution as a strong, specific gain.

The paper tackles real-time deployment of visual synthesis models on edge devices by proposing Muon-AD, a framework that integrates the Muon optimizer with attention distillation. It reports 3.2× faster convergence, 15% lower FID, and 24 FPS real-time generation.

Recent advances in visual synthesis have leveraged diffusion models and attention mechanisms to achieve high-fidelity artistic style transfer and photorealistic text-to-image generation. However, real-time deployment on edge devices remains challenging due to computational and memory constraints. We propose Muon-AD, a co-designed framework that integrates the Muon optimizer with attention distillation for real-time edge synthesis. By eliminating gradient conflicts through orthogonal parameter updates and dynamic pruning, Muon-AD achieves 3.2× faster convergence than Stable Diffusion-TensorRT while maintaining synthesis quality (15% lower FID, 4% higher SSIM). Our framework reduces peak memory to 7 GB on Jetson Orin and enables 24 FPS real-time generation through mixed-precision quantization and curriculum learning. Extensive experiments on COCO-Stuff and ImageNet-Texture demonstrate Muon-AD's Pareto-optimal efficiency-quality trade-offs. We also show a 65% reduction in communication overhead during distributed training and real-time 10 s/image generation on edge GPUs. These advancements pave the way for democratizing high-quality visual synthesis in resource-constrained environments.
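The abstract does not spell out how the orthogonal parameter updates are computed. As a rough illustration only, the sketch below shows the Newton-Schulz orthogonalization step used by the publicly available Muon optimizer, written in NumPy. The function names, the learning rate, and the momentum coefficient are illustrative assumptions rather than values from the paper, and the sketch omits refinements such as Nesterov momentum and shape-dependent scaling. It is not the Muon-AD implementation.

```python
import numpy as np


def newton_schulz_orthogonalize(g, steps=5, eps=1e-7):
    """Approximately replace the singular values of a 2-D matrix with ~1.

    Quintic Newton-Schulz iteration with the coefficients used in the public
    Muon reference code. The result is only approximately semi-orthogonal:
    its singular values land in a band around 1 rather than exactly at 1.
    """
    a, b, c = 3.4445, -4.7750, 2.0315
    x = np.asarray(g, dtype=np.float64)

    # Iterate on the wide orientation so that x @ x.T is the smaller Gram matrix.
    transposed = x.shape[0] > x.shape[1]
    if transposed:
        x = x.T

    # Dividing by the Frobenius norm guarantees a spectral norm of at most 1.
    x = x / (np.linalg.norm(x) + eps)

    for _ in range(steps):
        s = x @ x.T
        x = a * x + (b * s + c * (s @ s)) @ x

    return x.T if transposed else x


def muon_step(weight, grad, momentum_buf, lr=0.02, beta=0.95):
    """One simplified Muon-style update for a single 2-D weight matrix.

    lr and beta are placeholder values (assumptions), not taken from the paper.
    """
    momentum_buf = beta * momentum_buf + grad
    update = newton_schulz_orthogonalize(momentum_buf)
    return weight - lr * update, momentum_buf


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w = rng.standard_normal((32, 8))
    g = rng.standard_normal((32, 8))
    w_new, buf = muon_step(w, g, np.zeros_like(w))
    # Singular values of the applied update direction should sit near 1.
    print(np.linalg.svd((w - w_new) / 0.02, compute_uv=False))
```

Because every singular value of the update is pushed toward the same magnitude, no single gradient direction dominates the step. That is one plausible reading of what the abstract means by reducing gradient conflicts, though the paper may define the term differently.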

