SE · Jan 9, 2022

A Benchmark of JSON-compatible Binary Serialization Specifications

arXiv:2201.03051v1
AI Analysis

This work addresses gaps in the reproducibility and coverage of existing binary serialization benchmarks, which matters to developers and researchers working with these formats. It is incremental in that it builds on existing benchmarking efforts.

The authors benchmark JSON-compatible binary serialization specifications against over 400 JSON documents, introduce a tiered taxonomy of 36 categories for classifying JSON documents, and provide an online tool for automatic categorization.

We present a comprehensive benchmark of JSON-compatible binary serialization specifications using the SchemaStore open-source test suite collection of over 400 JSON documents matching their respective schemas and representative of their use across industries. We benchmark a set of schema-driven (ASN.1, Apache Avro, Microsoft Bond, Cap'n Proto, FlatBuffers, Protocol Buffers, and Apache Thrift) and schema-less (BSON, CBOR, FlexBuffers, MessagePack, Smile, and UBJSON) JSON-compatible binary serialization specifications. Existing literature on benchmarking JSON-compatible binary serialization specifications shows extensive gaps in specification coverage, in reproducibility and representativity, in the treatment of data compression, and in the choice and use of obsolete versions of binary serialization specifications. We introduce a tiered taxonomy for JSON documents consisting of 36 categories, classified as Tier 1, Tier 2 and Tier 3, as a common basis to classify JSON documents by their size, type of content, structural characteristics and redundancy criteria. We built and published a free-to-use online tool that automatically categorizes JSON documents according to our taxonomy and generates related summary statistics. In the interest of fairness and transparency, we adhere to reproducible software development standards and publicly host the benchmark software and results on GitHub.
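To make the four taxonomy criteria (size, type of content, structure, redundancy) more concrete, the following is a minimal sketch of the kind of summary statistics one could compute for a single JSON document. It is not the authors' tool: the function name `summarize_json`, the metric names, and the way each metric is defined here are illustrative assumptions, and the abstract does not state the thresholds that map such numbers to the 36 categories.

```python
import json
from collections import Counter


def summarize_json(text):
    """Compute illustrative statistics for one JSON document.

    Hypothetical metrics, chosen to mirror the four criteria named in
    the abstract; they are not the paper's definitions.
    """
    value = json.loads(text)

    # Size: byte length of the minified UTF-8 encoding.
    minified = json.dumps(value, separators=(",", ":"), ensure_ascii=False)
    size_bytes = len(minified.encode("utf-8"))

    kinds = Counter()   # content: how many scalars of each kind
    scalars = []        # redundancy: every scalar value encountered
    max_depth = 0       # structure: deepest level of container nesting

    def walk(node, depth):
        nonlocal max_depth
        if isinstance(node, (dict, list)):
            max_depth = max(max_depth, depth + 1)
            children = node.values() if isinstance(node, dict) else node
            for child in children:
                walk(child, depth + 1)
            return
        # bool is checked first because bool is a subclass of int in Python.
        if isinstance(node, bool):
            kind = "boolean"
        elif isinstance(node, (int, float)):
            kind = "numeric"
        elif isinstance(node, str):
            kind = "textual"
        else:
            kind = "null"
        kinds[kind] += 1
        scalars.append((kind, node))

    walk(value, 0)

    total = len(scalars)
    duplicates = total - len(set(scalars))
    return {
        "size_bytes": size_bytes,
        "max_depth": max_depth,
        "kinds": dict(kinds),
        # Share of scalar values that repeat an earlier value.
        "duplicate_ratio": duplicates / total if total else 0.0,
    }


if __name__ == "__main__":
    print(summarize_json('{"a": {"b": [1, 1, "x"]}, "c": true}'))
```

A classifier in the spirit of the taxonomy would then compare these numbers against fixed cut-offs, which would have to be taken from the paper itself rather than from this sketch.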
