54.0 HC May 5
Code Semantic Zooming
Jinsheng Ba, Sverrir Thorgeirsson, Zhendong Su
Recent advances in Large Language Models (LLMs) have introduced a new paradigm for software development, where source code is generated from natural language prompts. While this paradigm significantly boosts development productivity, building complex, real-world software systems remains challenging because natural language offers limited control over the code generation process. Inspired by the historical evolution of programming languages toward higher levels of abstraction, we advocate for a high-level abstraction language that gives developers greater control over LLM-assisted code writing. To this end, we propose Code Semantic Zooming (CodeZoom), a novel approach based on pseudocode that allows developers to iteratively explore, understand, and refine code across multiple layers of semantic abstraction. In a within-subjects user study (n=26), our method matches a state-of-the-art coding agent, Claude Code, on usability while producing a large effect on code comprehension: over 90% of participants reported feeling more in control of design decisions when using CodeZoom compared to using Claude Code.
27.7 SE May 21
Finding Performance Issues in Database Systems by Exploiting Dormant Code Paths
Jinsheng Ba, Zhendong Su
Performance is a critical characteristic of fundamental systems, such as Database Management Systems (DBMSs). Both academia and industry have invested decades in exploring efficient optimization algorithms. Despite these efforts, DBMSs remain prone to performance issues that cause suboptimal performance. Finding such issues is a longstanding challenge because no ground-truth performance is available. Existing work adopts black-box methods to examine performance consistency across executions, but cannot systematically test optimizations. In this work, we propose a novel, general white-box methodology, Branch Flip Analysis (BFA), to systematically and effectively uncover performance issues. BFA flips code branches to enforce or disable an optimization; the flipped execution is expected not to perform significantly better than the original. If it does, a performance issue exists. BFA provides a new perspective on finding performance issues and testing optimization logic in a fine-grained manner. We realized BFA in a prototype system, QueryZen, and evaluated it on four widely used and mature DBMSs: PostgreSQL, MySQL, CockroachDB, and MariaDB. QueryZen found 21 previously unknown and unique performance issues using workloads from the extensively used benchmarks TPC-H and TPC-DS. The core concept of BFA is simple and broadly applicable, and can be adapted to analyze the performance of other software systems.
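The check at the heart of BFA, as the abstract describes it, can be illustrated with a small sketch. This is not QueryZen's implementation: the function name `check_branch_flip`, the `FlipResult` record, the use of median timings, and the 1.5x `threshold` are all assumptions made here for illustration, since the abstract does not say how "significantly better" is measured.

```python
from dataclasses import dataclass
from statistics import median
from typing import Sequence


@dataclass
class FlipResult:
    baseline: float   # median time with the original branch outcome
    flipped: float    # median time with the branch outcome flipped
    speedup: float    # baseline / flipped
    is_issue: bool    # True if the flipped run is significantly faster


def check_branch_flip(
    baseline_times: Sequence[float],
    flipped_times: Sequence[float],
    threshold: float = 1.5,  # hypothetical cutoff, not taken from the paper
) -> FlipResult:
    """Compare timings of one query before and after a branch flip.

    Flipping a branch enforces or disables an optimization. The flipped
    execution should not be significantly faster; if it is, the original
    decision was likely suboptimal and is flagged as a performance issue.
    """
    base = median(baseline_times)
    flip = median(flipped_times)
    if flip <= 0:
        raise ValueError("flipped timings must be positive")
    speedup = base / flip
    return FlipResult(base, flip, speedup, speedup >= threshold)


# Illustrative timings in seconds (made-up numbers, not measured results).
suspicious = check_branch_flip([10.0, 11.0, 9.0], [4.0, 5.0, 6.0])
harmless = check_branch_flip([10.0, 10.0, 10.0], [9.5, 10.0, 10.5])
```

In this sketch the first call flags an issue because the flipped run is twice as fast, while the second does not because the timings are essentially unchanged. A real setup would also need a way to flip the branch inside the DBMS and to control for measurement noise, both of which are outside the scope of this example.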