SE · May 19

Characterizing Real-World Bugs in Tile Programs for Automated Bug Detection

arXiv:2605.19652
AI Analysis

For developers and researchers working with tile-based compilers for high-performance GPU kernels, this study addresses code generation bugs in those compilers, a largely unexplored class of bugs that is difficult to detect and fix.

This paper presents the first systematic study of code generation bugs in tile-based programming frameworks, analyzing 301 bugs from 401 reports to categorize root causes, symptoms, and fix strategies, providing foundational insights for debugging and testing tools.

Tile-based programming frameworks are increasingly adopted to write high-performance GPU kernels in domains such as deep learning and scientific computing. While these frameworks enhance productivity and hardware utilization, their multi-stage compilation pipelines introduce distinct code generation bugs that are tightly coupled to input shapes, data types, and backend targets. These bugs often manifest as silent correctness or performance issues, making them difficult to detect using existing compiler testing tools. Additionally, the unique programming conventions of tile domain-specific languages complicate root cause identification, while fixing such bugs demands specialized knowledge of tile abstractions and compilation pipelines. Despite the growing adoption of tile-based systems, their code generation bugs remain largely unexplored. This paper presents the first systematic study of tile-program code generation bugs. We curate 401 bug reports from GitHub and identify 301 tile-program codegen bugs for analysis, categorizing their root causes, symptoms, the input patterns and test oracles that trigger them, and the strategies used to fix them. Our study provides foundational insights for building debugging, testing, and repair tools tailored to tile-based compiler infrastructures.
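To make the notion of a shape-coupled, silent correctness bug concrete, the following is a minimal sketch in pure Python. It is not taken from the paper and does not use any real tile framework: `TILE`, `tiled_sum`, `handle_tail`, and `differential_test` are hypothetical names invented for illustration. The toy kernel sums a list tile by tile; when the ragged final tile is not handled (analogous to a missing boundary mask), results are wrong only for lengths that are not a multiple of the tile size, and no error is raised. A differential oracle against a trivially correct reference exposes the problem.

```python
# Illustrative sketch only: a toy "tile program" in pure Python,
# not the API of any real tile-based framework.

TILE = 4  # hypothetical tile size


def reference_sum(xs):
    """Oracle: a trivially correct reference implementation."""
    return sum(xs)


def tiled_sum(xs, tile=TILE, handle_tail=True):
    """Sum xs one tile at a time.

    With handle_tail=False the ragged last tile is silently skipped,
    mimicking a kernel that lacks a boundary mask.
    """
    n = len(xs)
    full_tiles = n // tile
    total = 0
    for t in range(full_tiles):
        total += sum(xs[t * tile:(t + 1) * tile])
    if handle_tail and n % tile:
        total += sum(xs[full_tiles * tile:])
    return total


def differential_test(kernel, sizes):
    """Return the input sizes on which kernel disagrees with the reference."""
    failures = []
    for n in sizes:
        xs = list(range(1, n + 1))
        if kernel(xs) != reference_sum(xs):
            failures.append(n)
    return failures


def buggy_kernel(xs):
    """The same kernel with tail handling disabled."""
    return tiled_sum(xs, handle_tail=False)


sizes = [4, 8, 16, 5, 7, 13]
print(differential_test(tiled_sum, sizes))     # []
print(differential_test(buggy_kernel, sizes))  # [5, 7, 13]
```

A test suite that only tried the tile-aligned sizes 4, 8, and 16 would pass the buggy kernel, which is one way such bugs can stay hidden until a particular input shape appears.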
