Floating-Point Arithmetic on Round-to-Nearest Representations
For hardware designers and computer architects, this work proposes a novel number representation that simplifies rounding and sign inversion, potentially improving performance in floating-point arithmetic units.
The paper introduces a floating-point representation based on RN-representations that enables constant-time rounding and sign inversion, and defines arithmetic operations that preserve rounding information at minimal cost. It details a possible implementation of a floating-point unit supporting this representation.
Recently we introduced a class of number representations denoted RN-representations, which allow unbiased rounding-to-nearest to be performed by simple truncation. In this paper we briefly review the binary fixed-point representation in an encoding that is essentially an ordinary 2's complement representation with an appended round-bit. In this encoding not only rounding but also sign inversion is a constant-time operation, whereas both are at best log-time operations on ordinary 2's complement representations. Addition, multiplication and division are defined in such a way that rounding information can be carried along in a meaningful way, at minimal cost. Based on the fixed-point encoding we here define a floating-point representation, and describe in some detail a possible implementation of a floating-point arithmetic unit employing this representation, including also the directed roundings.
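
As an illustration of the fixed-point encoding, the following minimal Python sketch models a number as a pair (a, r), where a is a 2's complement integer and r is the appended round-bit. It rests on assumptions that the abstract does not state: that the pair denotes the value (a + r) times the unit in the last place, that sign inversion complements every bit including the round-bit, and that rounding drops the low-order bits and takes the most significant dropped bit as the new round-bit. The function names rn_value, rn_negate and rn_round are hypothetical, and Python's unbounded integers stand in for a fixed-width register.

```python
def rn_value(a, r, ulp=1):
    """Value denoted by the pair (a, r), assuming x = (a + r) * ulp."""
    return (a + r) * ulp


def rn_negate(a, r):
    """Sign inversion by complementing every bit, round-bit included.

    For a 2's complement integer ~a == -a - 1, so the pair (~a, 1 - r)
    denotes -a - 1 + (1 - r) = -(a + r). No carry propagation is needed.
    """
    return ~a, 1 - r


def rn_round(a, r, k):
    """Round away the k lowest bits of a by truncation (assumed rule).

    The result has a unit in the last place 2**k times larger. The new
    round-bit is the most significant discarded bit of a; the old
    round-bit r is simply dropped when k > 0.
    """
    if k == 0:
        return a, r
    # Python's >> on negative ints is an arithmetic shift, which matches
    # truncation of a 2's complement bit string.
    return a >> k, (a >> (k - 1)) & 1


if __name__ == "__main__":
    a, r = 0b1011, 0                        # denotes 11
    a2, r2 = rn_round(a, r, 2)              # round to a multiple of 4
    print(rn_value(a, r), "->", rn_value(a2, r2, ulp=4))
    na, nr = rn_negate(a, r)
    print(rn_value(na, nr))                 # the negated value
```

Under these assumptions, rounding 11 to a multiple of 4 yields the pair (2, 1), which denotes 12, and negating (11, 0) yields (-12, 1), which denotes -11. Both operations touch each bit independently, which is what makes them constant-time in hardware, in contrast to the carry propagation required by ordinary 2's complement negation or round-to-nearest.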