SE · Feb 11, 2022

Towards Build Verifiability for Java-based Systems

arXiv:2202.05906v1 · 15 citations · Has Code
Originality: Incremental advance
AI Analysis

This paper addresses the lack of systematic build verifiability techniques for Java-based systems. Build verifiability is crucial for trustworthiness in software development, though the contribution is incremental relative to existing work on C/C++ systems.

The study tackles build verifiability for Java-based systems with a systematic approach that combines a unified build process, dynamic control of non-determinism during the build, and post-processing of build artifacts. With this approach, 91% of the 46 previously unverified open source projects from Reproducible Central and 100% of the 13 commercially adopted open source projects were verified.

Build verifiability refers to the property that the build of a software system can be verified by independent third parties, and it is crucial for the trustworthiness of a software system. Various efforts towards build verifiability have been made for C/C++-based systems, yet the techniques for Java-based systems are not systematic and are often specific to a particular build tool (e.g., Maven). In this study, we present a systematic approach towards build verifiability for Java-based systems. Our approach consists of three parts: a unified build process, a tool that dynamically controls non-determinism during the build process, and another tool that eliminates non-equivalences by post-processing the build artifacts. We apply our approach to 46 unverified open source projects from Reproducible Central and 13 open source projects that are widely used by Huawei commercial products. As a result, 91% of the unverified Reproducible Central projects and 100% of the commercially adopted OSS projects are successfully verified with our approach. In addition, based on our experience in analyzing thousands of builds for both commercial and open source Java-based systems, we present 14 patterns that introduce non-equivalences in generated build artifacts, along with their respective mitigation strategies. Among these patterns, 11 (78%) are unique to Java-based systems, whereas the remaining 3 (22%) are common with C/C++-based systems. The approach and the findings of this paper are useful for both practitioners and researchers who are interested in build verifiability.
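
To make the post-processing idea concrete, the sketch below shows one way non-equivalences in a JAR-like archive could be removed before comparing two builds. It is not the authors' tool. The class name `ArtifactNormalizer`, its methods, and the fixed timestamp are hypothetical, and the two differences it handles (entry order and entry timestamps) are assumed here as typical examples rather than taken from the paper's 14 patterns.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical helper, not part of the paper's tooling.
final class ArtifactNormalizer {
    // Assumed fixed timestamp (2000-01-01T00:00:00Z in epoch milliseconds).
    static final long FIXED_TIME_MILLIS = 946684800000L;

    // Writes the entries in the map's iteration order, stamping each with timeMillis.
    static byte[] pack(Map<String, byte[]> entries, long timeMillis) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ZipOutputStream out = new ZipOutputStream(buffer)) {
            for (Map.Entry<String, byte[]> entry : entries.entrySet()) {
                ZipEntry zipEntry = new ZipEntry(entry.getKey());
                // ZIP stores DOS timestamps in local time, so a real setup
                // would also pin the JVM time zone.
                zipEntry.setTime(timeMillis);
                out.putNextEntry(zipEntry);
                out.write(entry.getValue());
                out.closeEntry();
            }
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return buffer.toByteArray();
    }

    // Re-packs an archive with entries sorted by name and a fixed timestamp.
    static byte[] normalize(byte[] archive) {
        Map<String, byte[]> sorted = new TreeMap<>();
        try (ZipInputStream in = new ZipInputStream(new ByteArrayInputStream(archive))) {
            ZipEntry zipEntry;
            while ((zipEntry = in.getNextEntry()) != null) {
                sorted.put(zipEntry.getName(), in.readAllBytes());
            }
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return pack(sorted, FIXED_TIME_MILLIS);
    }

    // Hex-encoded SHA-256 digest, used as the equivalence check.
    static String sha256(byte[] data) {
        try {
            StringBuilder hex = new StringBuilder();
            for (byte b : MessageDigest.getInstance("SHA-256").digest(data)) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException ex) {
            throw new IllegalStateException(ex);
        }
    }

    public static void main(String[] args) {
        byte[] alpha = "alpha".getBytes(StandardCharsets.UTF_8);
        byte[] beta = "beta".getBytes(StandardCharsets.UTF_8);

        // Two "builds" with the same content but different entry order and timestamps.
        Map<String, byte[]> buildOne = new LinkedHashMap<>();
        buildOne.put("a/A.class", alpha);
        buildOne.put("b/B.class", beta);
        Map<String, byte[]> buildTwo = new LinkedHashMap<>();
        buildTwo.put("b/B.class", beta);
        buildTwo.put("a/A.class", alpha);

        byte[] first = pack(buildOne, 1000000000000L);
        byte[] second = pack(buildTwo, 1600000000000L);

        System.out.println("raw equal: " + sha256(first).equals(sha256(second)));
        System.out.println("normalized equal: "
                + sha256(normalize(first)).equals(sha256(normalize(second))));
    }
}
```

Running `main` should report that the raw archives differ while the normalized ones match. Comparing digests after normalization, rather than demanding bit-identical raw output, mirrors the abstract's distinction between controlling non-determinism during the build and eliminating remaining non-equivalences afterwards.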

