May 9, 2022

A Real Time Super Resolution Accelerator with Tilted Layer Fusion

arXiv:2205.03997v1 | 6 citations | h-index: 30
Originality: Incremental advance
AI Analysis

This work addresses the high computational and memory demands of superresolution on mobile devices. It is an incremental improvement in hardware efficiency.

The paper tackles the challenge of deploying deep learning-based superresolution on mobile devices by proposing a hardware accelerator with tilted layer fusion. The method reduces external DRAM bandwidth by 92% and requires only 102KB of on-chip memory. The design achieves 1920x1080@60fps throughput with a 544.3K gate count at 600MHz.

Deep learning based superresolution achieves high-quality results, but its heavy computational workload, large buffer, and high external memory bandwidth inhibit its usage in mobile devices. To solve the above issues, this paper proposes a real-time hardware accelerator with the tilted layer fusion method that reduces the external DRAM bandwidth by 92% and just needs 102KB on-chip memory. The design implemented with a 40nm CMOS process achieves 1920x1080@60fps throughput with 544.3K gate count when running at 600MHz; it has higher throughput and lower area cost than previous designs.
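
To put the quoted figures in perspective, the short Python sketch below works out what they imply. It assumes that 1920x1080@60fps refers to output frames and that the 92% reduction is relative to an unstated baseline. The function and variable names are illustrative and do not come from the paper.

```python
# Back-of-the-envelope arithmetic using only figures quoted in the abstract.
# Assumption: 1920x1080@60fps describes the output frames of the accelerator.

WIDTH, HEIGHT, FPS = 1920, 1080, 60
CLOCK_HZ = 600_000_000          # 600MHz operating frequency
DRAM_REDUCTION = 0.92           # 92% external DRAM bandwidth reduction


def pixel_rate(width: int, height: int, fps: int) -> int:
    """Output pixels that must be produced per second."""
    return width * height * fps


def cycles_per_pixel(clock_hz: int, rate: int) -> float:
    """Clock cycles available for each output pixel."""
    return clock_hz / rate


rate = pixel_rate(WIDTH, HEIGHT, FPS)
budget = cycles_per_pixel(CLOCK_HZ, rate)
# Fraction of the (unstated) baseline DRAM traffic that remains.
remaining_fraction = 1.0 - DRAM_REDUCTION

print(f"pixel rate: {rate} pixels/s")
print(f"cycle budget: {budget:.2f} cycles per output pixel")
print(f"remaining DRAM traffic: {remaining_fraction:.0%} of baseline")
```

Under these assumptions the design must sustain roughly 124 million output pixels per second, leaving fewer than five clock cycles per pixel. A budget that small suggests why keeping intermediate data on chip, rather than in external DRAM, matters for real-time operation.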
