IRFeb 20, 2015

Vectorized VByte Decoding

Jeff Plaisance, Nathan Kurz, Daniel Lemire

arXiv:1503.07387v412.232 citationsHas Code

Originality Incremental advance

AI Analysis

This work addresses a performance issue in data compression for search engines and other systems that rely on fast integer decoding, making VByte competitive again in speed-critical applications.

The paper tackled the performance bottleneck of VByte decoding, which suffers from branch mispredictions due to variable-length integers, by introducing a vectorized decoder (Masked VByte) that is 2 to 4 times faster than conventional scalar decoders.

We consider the ubiquitous technique of VByte compression, which represents each integer as a variable length sequence of bytes. The low 7 bits of each byte encode a portion of the integer, and the high bit of each byte is reserved as a continuation flag. This flag is set to 1 for all bytes except the last, and the decoding of each integer is complete when a byte with a high bit of 0 is encountered. VByte decoding can be a performance bottleneck especially when the unpredictable lengths of the encoded integers cause frequent branch mispredictions. Previous attempts to accelerate VByte decoding using SIMD vector instructions have been disappointing, prodding search engines such as Google to use more complicated but faster-to-decode formats for performance-critical code. Our decoder (Masked VByte) is 2 to 4 times faster than a conventional scalar VByte decoder, making the format once again competitive with regard to speed.

View on arXiv PDF Code

Similar