Takashi Hoshino

2 Papers

32.2 PL Apr 9
Optimization of 32-bit Unsigned Division by Constants on 64-bit Targets

Shigeo Mitsunari, Takashi Hoshino

Granlund and Montgomery proposed an optimization method for unsigned integer division by constants [3]. Their method (called the GM method in this paper) was improved in part by later works such as [1] and [7], and is now adopted by major compilers including GCC, Clang, Microsoft Compiler, and Apple Clang. However, the code these compilers generate for a division such as x/7 is designed for 32-bit CPUs and therefore does not fully exploit 64-bit capabilities. This paper proposes an optimization method for 32-bit unsigned division by constants targeting 64-bit CPUs. We implemented patches for LLVM/GCC and achieved speedups of 1.67x on Intel Xeon w9-3495X (Sapphire Rapids) and 1.98x on Apple M4 (Apple M-series SoC) in the microbenchmark described later. The LLVM patch has already been merged into llvm:main [6], demonstrating the practical applicability of the proposed method.
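To make the x/7 case concrete, the following C sketch contrasts a 32-bit-style multiply-and-fixup sequence with a single 64x64 high multiply. It is an illustration under stated assumptions, not the code sequence from the paper or from any compiler: the function names `div7_gm32`, `div7_mul64`, and `div7_selfcheck` are hypothetical, the constant is the standard 33-bit magic number ceil(2^35 / 7) = 0x124924925, and `unsigned __int128` is a GCC/Clang extension.

```c
#include <assert.h>
#include <stdint.h>

/* 32-bit style: the 33-bit magic 0x124924925 does not fit in 32 bits, so
   only its low 32 bits are multiplied and the implicit top bit is handled
   by an extra subtract, shift, and add. */
static uint32_t div7_gm32(uint32_t x) {
    uint32_t t = (uint32_t)(((uint64_t)x * 0x24924925u) >> 32); /* high half */
    return (t + ((x - t) >> 1)) >> 2; /* equals (x + t) >> 3, without overflow */
}

/* 64-bit style (assumed formulation): shift the full 33-bit magic left by
   64 - 35 = 29 bits so that the high 64 bits of one 64x64->128 multiply
   are already the quotient. */
static uint32_t div7_mul64(uint32_t x) {
    const uint64_t magic = UINT64_C(0x124924925) << 29; /* fits in 62 bits */
    return (uint32_t)(((unsigned __int128)x * magic) >> 64);
}

/* Compare both variants against the built-in division on a strided sample
   of the 32-bit range, plus the largest input. */
static void div7_selfcheck(void) {
    for (uint64_t x = 0; x <= UINT32_MAX; x += 65521) {
        uint32_t v = (uint32_t)x;
        assert(div7_gm32(v) == v / 7);
        assert(div7_mul64(v) == v / 7);
    }
    assert(div7_gm32(UINT32_MAX) == UINT32_MAX / 7);
    assert(div7_mul64(UINT32_MAX) == UINT32_MAX / 7);
}
```

Both functions compute floor(x * 0x124924925 / 2^35). The second needs no fixup because a 64-bit register can hold the whole pre-shifted constant, which is the kind of 64-bit capability the abstract says current code generation leaves unused.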

67.5 DB Mar 18
Shirakami: A Hybrid Concurrency Control Protocol for Tsurugi Relational Database System

Takayuki Tanabe, Shinichi Umegane, Suguru Arakawa et al.

Bill-of-materials and telecommunications billing applications need to process both short transactions and long read-write transactions simultaneously. Recent work rarely addresses such evolving workloads. To deal with these workloads, we propose a new concurrency control protocol, Shirakami. Shirakami is a hybrid of two protocols. The first, Shirakami-LTX, handles long read-write transactions and is based on multiversion view serializability. The second, Shirakami-OCC, handles short transactions and is based on Silo. Shirakami naturally integrates them using write preservation and epoch-based synchronization. It does not require dynamic protocol switching and provides stable performance. We implemented Shirakami as the transaction processing module of the Tsurugi system, which is a production-grade relational database system. The experimental results demonstrated that Tsurugi exhibited 19.7 times lower latency than PostgreSQL, and Shirakami-LTX exhibited 680 times higher throughput than Shirakami-OCC.