SE · Apr 29, 2021

The Behavioral Diversity of Java JSON Libraries

Nicolas Harrand, Thomas Durieux, David Broman, Benoit Baudry

arXiv:2104.14323v26.41 citations

Originality Synthesis-oriented

AI Analysis

This work addresses the lack of software engineering perspective in JSON library comparisons, providing insights for developers to improve robustness in handling ill-formed data.

The study conducted the first systematic analysis of input/output behavior across 20 Java JSON libraries using 473 JSON files, revealing significant behavioral diversity, especially with ill-formed files and corner cases like large numbers or duplicates.

JSON is an essential file and data format in domains that span scientific computing, web APIs or configuration management. Its popularity has motivated significant software development effort to build multiple libraries to process JSON data. Previous studies focus on performance comparison among these libraries and lack a software engineering perspective. We present the first systematic analysis and comparison of the input/output behavior of 20 JSON libraries, in a single software ecosystem: Java/Maven. We assess behavior diversity by running each library against a curated set of 473 JSON files, including both well-formed and ill-formed files. The main design differences, which influence the behavior of the libraries, relate to the choice of data structure to represent JSON objects and to the encoding of numbers. We observe a remarkable behavioral diversity with ill-formed files, or corner cases such as large numbers or duplicate data. Our unique behavioral assessment of JSON libraries paves the way for a robust processing of ill-formed files, through a multi-version architecture.
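The two design choices named in the abstract, number encoding and the data structure used for JSON objects, can be illustrated with a minimal Java sketch. It uses only the standard library and is not taken from any of the 20 libraries studied: the class name `JsonCornerCases`, its method names, and the representation of object members as key/value string pairs are all assumptions made for illustration. It shows how a large integer literal loses precision when stored as a `double` but not as a `BigDecimal`, and how two plausible policies for duplicate keys yield different objects from the same input.

```java
import java.math.BigDecimal;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration; not the code of any library from the study.
final class JsonCornerCases {

    // True if the numeric literal keeps its exact value when stored as a double.
    public static boolean survivesAsDouble(String literal) {
        double d = Double.parseDouble(literal);
        if (Double.isNaN(d) || Double.isInfinite(d)) {
            return false; // e.g. "1e400" overflows the double range
        }
        // new BigDecimal(double) exposes the exact binary value held by d.
        return new BigDecimal(d).compareTo(new BigDecimal(literal)) == 0;
    }

    // Arbitrary-precision encoding keeps every digit of the literal.
    public static String asBigDecimal(String literal) {
        return new BigDecimal(literal).toString();
    }

    // Duplicate-key policy 1: a later member overwrites an earlier one.
    public static Map<String, String> lastWins(String[][] members) {
        Map<String, String> object = new LinkedHashMap<>();
        for (String[] member : members) {
            object.put(member[0], member[1]);
        }
        return object;
    }

    // Duplicate-key policy 2: the first member is kept, later ones are ignored.
    public static Map<String, String> firstWins(String[][] members) {
        Map<String, String> object = new LinkedHashMap<>();
        for (String[] member : members) {
            object.putIfAbsent(member[0], member[1]);
        }
        return object;
    }

    public static void main(String[] args) {
        String big = "12345678901234567890";
        System.out.println("exact as double:     " + survivesAsDouble(big));
        System.out.println("exact as BigDecimal: " + asBigDecimal(big).equals(big));

        // Members of the object {"a": 1, "a": 2}, already tokenized.
        String[][] duplicated = { { "a", "1" }, { "a", "2" } };
        System.out.println("last wins:  " + lastWins(duplicated));
        System.out.println("first wins: " + firstWins(duplicated));
    }
}
```

A third policy, rejecting the document on a duplicate key, is equally plausible. Two components that each pick a different policy will disagree on the same input, which is the kind of behavioral diversity the study measures.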
