COMSC-171

Numeric Data

Representations

unsigned integers
8-bit, 16-bit, 32-bit, 64-bit, others (implemented in hardware)
unlimited length (implemented in software)
signed integers (two's complement)
positive integers: high bit 0
zero: all bits 0
negative integers: high bit 1
negation: flip all bits, add 1
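
The negation rule can be sketched in a few lines. This is a minimal illustration that assumes an 8-bit word; the names BITS, MASK, negate, and to_signed are made up for this example. Python integers are unlimited length, so the mask stands in for the fixed hardware width.

```python
BITS = 8                  # assumed word size, for illustration only
MASK = (1 << BITS) - 1    # 0b11111111

def negate(x: int) -> int:
    """Two's-complement negation: flip all bits, add 1, keep BITS bits."""
    return ((x ^ MASK) + 1) & MASK

def to_signed(x: int) -> int:
    """Interpret a BITS-wide bit pattern as a signed integer."""
    return x - (1 << BITS) if x & (1 << (BITS - 1)) else x

five = 0b00000101
minus_five = negate(five)
print(format(minus_five, "08b"))   # 11111011 (high bit 1: negative)
print(to_signed(minus_five))       # -5
print(negate(0))                   # 0 (zero is its own negation)
```
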
BCD (binary coded decimal)
1 decimal digit (0000 through 1001) in 4 bits
+ (1100) or - (1101) sign in last 4 bits
variations exist
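
A sketch of one common packed layout, assuming two digits per byte with the sign in the final 4 bits and a leading 0 digit added when needed to fill whole bytes. The function name to_packed_bcd is hypothetical, and other BCD variations lay the digits out differently.

```python
def to_packed_bcd(n: int) -> bytes:
    """Encode n as packed BCD: one decimal digit per 4 bits, sign last."""
    sign = 0b1100 if n >= 0 else 0b1101
    nibbles = [int(d) for d in str(abs(n))] + [sign]
    if len(nibbles) % 2:           # pad with a leading 0 to fill whole bytes
        nibbles.insert(0, 0)
    pairs = zip(nibbles[0::2], nibbles[1::2])
    return bytes((high << 4) | low for high, low in pairs)

print(to_packed_bcd(123).hex())    # 123c
print(to_packed_bcd(-45).hex())    # 045d
```

In the hex output each decimal digit appears as itself, and the trailing c or d is the sign.
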
IEEE 754-2019 floating point
value = (-1)^sign × significand × base^exponent
binary16 (half): 1 sign bit, 5 exponent bits, 10 significand bits
binary32 (single): 1 sign bit, 8 exponent bits, 23 significand bits
binary64 (double): 1 sign bit, 11 exponent bits, 52 significand bits
binary128 (quadruple): 1 sign bit, 15 exponent bits, 112 significand bits
binary256 (octuple): 1 sign bit, 19 exponent bits, 236 significand bits
also decimal32, decimal64, decimal128
NaN (not a number): all exponent bits 1, not all significand bits 0
infinity (too large): all exponent bits 1, all significand bits 0
subnormal (too close to zero): all exponent bits 0, not all significand bits 0
+0 and -0 have different bit patterns (sign bit 0 or 1) but compare equal
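
The binary32 layout can be inspected with Python's standard struct module. This is a sketch only: fields32 and classify are hypothetical helper names, and the classification follows the exponent and significand rules listed above.

```python
import struct

def fields32(x: float) -> tuple:
    """Return (sign, exponent field, significand field) of x as binary32."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    return bits >> 31, (bits >> 23) & 0xFF, bits & 0x7FFFFF

def classify(x: float) -> str:
    """Name the kind of value from the binary32 exponent and significand."""
    _, exponent, significand = fields32(x)
    if exponent == 0xFF:                      # all exponent bits 1
        return "nan" if significand else "infinity"
    if exponent == 0:                         # all exponent bits 0
        return "subnormal" if significand else "zero"
    return "normal"

print(fields32(1.0))            # (0, 127, 0)
print(fields32(-0.0))           # (1, 0, 0)
print(classify(float("inf")))   # infinity
print(classify(float("nan")))   # nan
print(classify(1e-45))          # subnormal
print(0.0 == -0.0)              # True
```

The exponent field is stored with a bias (127 for binary32), which is why 1.0 shows an exponent field of 127 rather than 0.
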

Errors

integer overflow
excess leading bits of result are truncated with no warning
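
A minimal simulation of the truncation, assuming an 8-bit result. Python's own integers do not overflow, so the mask imitates what fixed-width hardware does; add_u8 and as_i8 are names invented for this sketch.

```python
def add_u8(a: int, b: int) -> int:
    """Add as an 8-bit unsigned adder would: keep only the low 8 bits."""
    return (a + b) & 0xFF

def as_i8(x: int) -> int:
    """Reinterpret an 8-bit pattern as a two's-complement signed value."""
    return x - 256 if x & 0x80 else x

print(add_u8(200, 100))        # 44 (300 is 1_0010_1100; the leading 1 is lost)
print(add_u8(255, 1))          # 0
print(as_i8(add_u8(127, 1)))   # -128 (signed overflow wraps to negative)
```
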
floating point roundoff and truncation
many values cannot be represented exactly with a finite number of bits
errors can accumulate in repeated calculations with no warning
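
Both effects are easy to reproduce with Python floats, which are binary64 on typical platforms. The helper repeated_sum is a hypothetical name for this sketch; math.isclose is one common way to compare with a tolerance instead of testing exact equality.

```python
import math

def repeated_sum(step: float, count: int) -> float:
    """Add step to a running total count times, rounding at every addition."""
    total = 0.0
    for _ in range(count):
        total += step
    return total

print(0.1 + 0.2 == 0.3)             # False: none of the three is exact in binary
total = repeated_sum(0.1, 10)
print(total)                        # 0.9999999999999999
print(total == 1.0)                 # False: the error accumulated silently
print(math.isclose(total, 1.0))     # True: compare with a tolerance instead
```
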