Class FP16

java.lang.Object
icyllis.arc3d.core.FP16

public final class FP16 extends Object

The FP16 class is a wrapper and a utility class to manipulate half-precision 16-bit IEEE 754 floating point data types (also called fp16 or binary16). A half-precision float can be created from or converted to single-precision floats, and is stored in a short data type.

The IEEE 754 standard specifies an fp16 as having the following format:

  • Sign bit: 1 bit
  • Exponent width: 5 bits
  • Significand: 10 bits

The format is laid out as follows:

 1   11111   1111111111
 ^   --^--   -----^----
 sign  |          |_______ significand
       |
       -- exponent
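
The layout above can be unpacked with shifts and masks. The following is a minimal sketch, not the FP16 implementation: the class name and constant values are assumptions derived from the 1/5/10 bit layout rather than read from the library.

```java
// Sketch: split a raw fp16 bit pattern into its three fields.
// Shift and mask values are derived from the 1/5/10 layout (assumed, not read from FP16).
final class Fp16FieldsSketch {
    static final int SIGN_SHIFT = 15;          // the sign is the top bit
    static final int EXPONENT_SHIFT = 10;      // the exponent sits above the 10 significand bits
    static final int EXPONENT_MASK = 0x1f;     // 5 bits, applied after shifting
    static final int SIGNIFICAND_MASK = 0x3ff; // low 10 bits

    static int sign(short h) {
        return (h & 0xffff) >>> SIGN_SHIFT;
    }

    static int exponent(short h) {
        // Still biased; subtract the exponent bias to get the real power of two.
        return ((h & 0xffff) >>> EXPONENT_SHIFT) & EXPONENT_MASK;
    }

    static int significand(short h) {
        return h & SIGNIFICAND_MASK;
    }

    public static void main(String[] args) {
        short h = (short) 0xc200; // bit pattern 1 10000 1000000000
        System.out.println(sign(h) + " " + exponent(h) + " " + significand(h));
    }
}
```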
 

Half-precision floating points can be useful to save memory and/or bandwidth at the expense of range and precision when compared to single-precision floating points (fp32).

To help you decide whether fp16 is the right storage type for your needs, please refer to the table below, which shows the available precision throughout the range of possible values. The precision column indicates the step size between two consecutive numbers in a specific part of the range.

Range start     Precision
0               1 ⁄ 16,777,216
1 ⁄ 16,384      1 ⁄ 16,777,216
1 ⁄ 8,192       1 ⁄ 8,388,608
1 ⁄ 4,096       1 ⁄ 4,194,304
1 ⁄ 2,048       1 ⁄ 2,097,152
1 ⁄ 1,024       1 ⁄ 1,048,576
1 ⁄ 512         1 ⁄ 524,288
1 ⁄ 256         1 ⁄ 262,144
1 ⁄ 128         1 ⁄ 131,072
1 ⁄ 64          1 ⁄ 65,536
1 ⁄ 32          1 ⁄ 32,768
1 ⁄ 16          1 ⁄ 16,384
1 ⁄ 8           1 ⁄ 8,192
1 ⁄ 4           1 ⁄ 4,096
1 ⁄ 2           1 ⁄ 2,048
1               1 ⁄ 1,024
2               1 ⁄ 512
4               1 ⁄ 256
8               1 ⁄ 128
16              1 ⁄ 64
32              1 ⁄ 32
64              1 ⁄ 16
128             1 ⁄ 8
256             1 ⁄ 4
512             1 ⁄ 2
1,024           1
2,048           2
4,096           4
8,192           8
16,384          16
32,768          32

This table shows that numbers higher than 1024 lose all fractional precision.
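
The step sizes in the table follow from the 10-bit significand: within a range starting at 2^e the step is 2^(e - 10), and below the smallest normal value (1 ⁄ 16,384) the step stays fixed. A minimal sketch that reproduces the table, assuming a hypothetical helper named step that is not part of FP16:

```java
// Sketch: step size between consecutive fp16 values at a given magnitude.
// Only meaningful for finite inputs within the fp16 range.
final class Fp16StepSketch {
    static double step(double x) {
        double magnitude = Math.abs(x);
        // Below 2^-14 (the smallest normal value) the step no longer shrinks.
        int exponent = Math.max(Math.getExponent(magnitude), -14);
        // 10 significand bits split each power-of-two range into 1,024 steps.
        return Math.scalb(1.0, exponent - 10);
    }

    public static void main(String[] args) {
        System.out.println(step(1.0));    // step in the range starting at 1
        System.out.println(step(1500.0)); // step in the range starting at 1,024
    }
}
```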

  • Field Summary

    Fields
    Modifier and Type
    Field
    Description
    static final int
    BYTES
    The number of bytes used to represent a half-precision float value.
    static final short
    EPSILON
    Epsilon is the difference between 1.0 and the next value representable by a half-precision floating-point.
    static final int
    EXPONENT_BIAS
    The offset of the exponent from the actual value.
    static final int
    EXPONENT_SHIFT
    The offset to shift by to obtain the exponent bits.
    static final int
    EXPONENT_SIGNIFICAND_MASK
    The bitmask to AND with to obtain exponent and significand bits.
    static final short
    LOWEST_VALUE
    Smallest negative value a half-precision float may have.
    static final int
    MAX_EXPONENT
    Maximum exponent a finite half-precision float may have.
    static final short
    MAX_VALUE
    Maximum positive finite value a half-precision float may have.
    static final int
    MIN_EXPONENT
    Minimum exponent a normalized half-precision float may have.
    static final short
    MIN_NORMAL
    Smallest positive normal value a half-precision float may have.
    static final short
    MIN_VALUE
    Smallest positive non-zero value a half-precision float may have.
    static final short
    NaN
    A Not-a-Number representation of a half-precision float.
    static final short
    NEGATIVE_INFINITY
    Negative infinity of type half-precision float.
    static final short
    NEGATIVE_ZERO
    Negative 0 of type half-precision float.
    static final short
    POSITIVE_INFINITY
    Positive infinity of type half-precision float.
    static final short
    POSITIVE_ZERO
    Positive 0 of type half-precision float.
    static final int
    SHIFTED_EXPONENT_MASK
    The bitmask to AND a number shifted by EXPONENT_SHIFT right, to obtain exponent bits.
    static final int
    SIGN_MASK
    The bitmask to AND a number with to obtain the sign bit.
    static final int
    SIGN_SHIFT
    The offset to shift by to obtain the sign bit.
    static final int
    SIGNIFICAND_MASK
    The bitmask to AND a number with to obtain significand bits.
    static final int
    SIZE
    The number of bits used to represent a half-precision float value.
  • Method Summary

    Modifier and Type
    Method
    Description
    static short
    ceil(short h)
    Returns the smallest integral half-precision float value (the one closest to negative infinity) that is greater than or equal to the specified half-precision float value.
    static int
    compare(short x, short y)
    Compares the two specified half-precision float values.
    static boolean
    equals(short x, short y)
    Returns true if the two half-precision float values are equal.
    static short
    floor(short h)
    Returns the largest integral half-precision float value (the one closest to positive infinity) that is less than or equal to the specified half-precision float value.
    static boolean
    greater(short x, short y)
    Returns true if the first half-precision float value is greater (larger toward positive infinity) than the second half-precision float value.
    static boolean
    greaterEquals(short x, short y)
    Returns true if the first half-precision float value is greater (larger toward positive infinity) than or equal to the second half-precision float value.
    static boolean
    isInfinite(short h)
    Returns true if the specified half-precision float value represents infinity, false otherwise.
    static boolean
    isNaN(short h)
    Returns true if the specified half-precision float value represents a Not-a-Number, false otherwise.
    static boolean
    isNormalized(short h)
    Returns true if the specified half-precision float value is normalized (does not have a subnormal representation).
    static boolean
    less(short x, short y)
    Returns true if the first half-precision float value is less (smaller toward negative infinity) than the second half-precision float value.
    static boolean
    lessEquals(short x, short y)
    Returns true if the first half-precision float value is less (smaller toward negative infinity) than or equal to the second half-precision float value.
    static short
    max(short x, short y)
    Returns the larger of two half-precision float values (the value closest to positive infinity).
    static short
    min(short x, short y)
    Returns the smaller of two half-precision float values (the value closest to negative infinity).
    static short
    rint(short h)
    Returns the closest integral half-precision float value to the specified half-precision float value.
    static float
    toFloat(short h)
    Converts the specified half-precision float value into a single-precision float value.
    static short
    toHalf(float f)
    Converts the specified single-precision float value into a half-precision float value.
    static String
    toHexString(short h)
    Returns a hexadecimal string representation of the specified half-precision float value.
    static short
    trunc(short h)
    Returns the truncated half-precision float value of the specified half-precision float value.

    Methods inherited from class java.lang.Object

    clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
  • Field Details

    • SIZE

      public static final int SIZE
      The number of bits used to represent a half-precision float value.
    • BYTES

      public static final int BYTES
      The number of bytes used to represent a half-precision float value.
    • EPSILON

      public static final short EPSILON
      Epsilon is the difference between 1.0 and the next value representable by a half-precision floating-point.
    • MAX_EXPONENT

      public static final int MAX_EXPONENT
      Maximum exponent a finite half-precision float may have.
    • MIN_EXPONENT

      public static final int MIN_EXPONENT
      Minimum exponent a normalized half-precision float may have.
    • LOWEST_VALUE

      public static final short LOWEST_VALUE
      Smallest negative value a half-precision float may have.
    • MAX_VALUE

      public static final short MAX_VALUE
      Maximum positive finite value a half-precision float may have.
    • MIN_NORMAL

      public static final short MIN_NORMAL
      Smallest positive normal value a half-precision float may have.
    • MIN_VALUE

      public static final short MIN_VALUE
      Smallest positive non-zero value a half-precision float may have.
    • NaN

      public static final short NaN
      A Not-a-Number representation of a half-precision float.
    • NEGATIVE_INFINITY

      public static final short NEGATIVE_INFINITY
      Negative infinity of type half-precision float.
    • NEGATIVE_ZERO

      public static final short NEGATIVE_ZERO
      Negative 0 of type half-precision float.
    • POSITIVE_INFINITY

      public static final short POSITIVE_INFINITY
      Positive infinity of type half-precision float.
    • POSITIVE_ZERO

      public static final short POSITIVE_ZERO
      Positive 0 of type half-precision float.
    • SIGN_SHIFT

      public static final int SIGN_SHIFT
      The offset to shift by to obtain the sign bit.
    • EXPONENT_SHIFT

      public static final int EXPONENT_SHIFT
      The offset to shift by to obtain the exponent bits.
    • SIGN_MASK

      public static final int SIGN_MASK
      The bitmask to AND a number with to obtain the sign bit.
    • SHIFTED_EXPONENT_MASK

      public static final int SHIFTED_EXPONENT_MASK
      The bitmask to AND a number shifted by EXPONENT_SHIFT right, to obtain exponent bits.
    • SIGNIFICAND_MASK

      public static final int SIGNIFICAND_MASK
      The bitmask to AND a number with to obtain significand bits.
    • EXPONENT_SIGNIFICAND_MASK

      public static final int EXPONENT_SIGNIFICAND_MASK
      The bitmask to AND with to obtain exponent and significand bits.
    • EXPONENT_BIAS

      public static final int EXPONENT_BIAS
      The offset of the exponent from the actual value.
  • Method Details

    • compare

      public static int compare(short x, short y)

      Compares the two specified half-precision float values. The following conditions apply during the comparison:

      • NaN is considered by this method to be equal to itself and greater than all other half-precision float values (including POSITIVE_INFINITY).
      • POSITIVE_ZERO is considered by this method to be greater than NEGATIVE_ZERO.
      Parameters:
      x - The first half-precision float value to compare.
      y - The second half-precision float value to compare
      Returns:
      The value 0 if x is numerically equal to y, a value less than 0 if x is numerically less than y, and a value greater than 0 if x is numerically greater than y
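
      The ordering described above can be sketched by mapping each bit pattern to an integer sort key. This is an illustrative assumption about one way to get that ordering, not the FP16 implementation; the class and the sortKey helper are hypothetical.

```java
// Sketch: total ordering where NaN equals itself and sorts above everything,
// and negative zero sorts below positive zero.
final class Fp16CompareSketch {
    static boolean isNaN(short h) {
        // All exponent bits set and a non-zero significand.
        return (h & 0x7fff) > 0x7c00;
    }

    static int sortKey(short h) {
        if (isNaN(h)) {
            return Integer.MAX_VALUE; // every NaN collapses to one key above +infinity
        }
        int magnitude = h & 0x7fff;
        // Negative values: larger magnitude means smaller key; -0 maps to -1, +0 to 0.
        return (h & 0x8000) != 0 ? -magnitude - 1 : magnitude;
    }

    static int compare(short x, short y) {
        return Integer.compare(sortKey(x), sortKey(y));
    }

    public static void main(String[] args) {
        short nan = (short) 0x7e00;
        short positiveInfinity = (short) 0x7c00;
        System.out.println(compare(nan, positiveInfinity) > 0);
    }
}
```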
    • rint

      public static short rint(short h)
      Returns the closest integral half-precision float value to the specified half-precision float value. Special values are handled in the following ways:
      • If the specified half-precision float is NaN, the result is NaN
      • If the specified half-precision float is infinity (negative or positive), the result is infinity (with the same sign)
      • If the specified half-precision float is zero (negative or positive), the result is zero (with the same sign)
      Parameters:
      h - A half-precision float value
      Returns:
      The value of the specified half-precision float rounded to the nearest half-precision float value
    • ceil

      public static short ceil(short h)
      Returns the smallest integral half-precision float value (the one closest to negative infinity) that is greater than or equal to the specified half-precision float value. Special values are handled in the following ways:
      • If the specified half-precision float is NaN, the result is NaN
      • If the specified half-precision float is infinity (negative or positive), the result is infinity (with the same sign)
      • If the specified half-precision float is zero (negative or positive), the result is zero (with the same sign)
      Parameters:
      h - A half-precision float value
      Returns:
      The smallest integral half-precision float value (the one closest to negative infinity) that is greater than or equal to the specified half-precision float value
    • floor

      public static short floor(short h)
      Returns the largest integral half-precision float value (the one closest to positive infinity) that is less than or equal to the specified half-precision float value. Special values are handled in the following ways:
      • If the specified half-precision float is NaN, the result is NaN
      • If the specified half-precision float is infinity (negative or positive), the result is infinity (with the same sign)
      • If the specified half-precision float is zero (negative or positive), the result is zero (with the same sign)
      Parameters:
      h - A half-precision float value
      Returns:
      The largest integral half-precision float value (the one closest to positive infinity) that is less than or equal to the specified half-precision float value
    • trunc

      public static short trunc(short h)
      Returns the truncated half-precision float value of the specified half-precision float value. Special values are handled in the following ways:
      • If the specified half-precision float is NaN, the result is NaN
      • If the specified half-precision float is infinity (negative or positive), the result is infinity (with the same sign)
      • If the specified half-precision float is zero (negative or positive), the result is zero (with the same sign)
      Parameters:
      h - A half-precision float value
      Returns:
      The truncated half-precision float value of the specified half-precision float value
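
      Rounding to an integral value can be done directly on the bit pattern by clearing the significand bits that encode the fraction. The following is a minimal sketch of trunc and floor under that approach; it is hand-written for illustration, may differ from what FP16 does internally, and uses a hypothetical class name.

```java
// Sketch: bit-level trunc and floor for raw fp16 values.
final class Fp16RoundingSketch {
    static short trunc(short h) {
        int bits = h & 0xffff;
        int abs = bits & 0x7fff;
        if (abs < 0x3c00) {
            // |h| < 1.0: only the sign survives, giving a signed zero.
            return (short) (bits & 0x8000);
        }
        if (abs < 0x6400) {
            // 1.0 <= |h| < 1024: some low significand bits hold the fraction.
            int fractionBits = 25 - (abs >> 10);
            int mask = (1 << fractionBits) - 1;
            return (short) (bits & ~mask);
        }
        // Values >= 1024, infinities and NaN are returned unchanged.
        return h;
    }

    static short floor(short h) {
        int bits = h & 0xffff;
        int abs = bits & 0x7fff;
        if (abs < 0x3c00) {
            // Negative non-zero values go down to -1.0; the rest become a signed zero.
            return (short) (bits > 0x8000 ? 0xbc00 : (bits & 0x8000));
        }
        if (abs < 0x6400) {
            int mask = (1 << (25 - (abs >> 10))) - 1;
            if (bits >= 0x8000) {
                // Negative: bump the magnitude so that clearing the fraction rounds away from zero.
                bits += mask;
            }
            return (short) (bits & ~mask);
        }
        return h;
    }

    public static void main(String[] args) {
        // 0xbe00 is -1.5: trunc yields -1.0 (0xbc00) and floor yields -2.0 (0xc000).
        System.out.println(Integer.toHexString(trunc((short) 0xbe00) & 0xffff));
        System.out.println(Integer.toHexString(floor((short) 0xbe00) & 0xffff));
    }
}
```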
    • min

      public static short min(short x, short y)
      Returns the smaller of two half-precision float values (the value closest to negative infinity). Special values are handled in the following ways:
      Parameters:
      x - The first half-precision value
      y - The second half-precision value
      Returns:
      The smaller of the two specified half-precision values
    • max

      public static short max(short x, short y)
      Returns the larger of two half-precision float values (the value closest to positive infinity). Special values are handled in the following ways:
      Parameters:
      x - The first half-precision value
      y - The second half-precision value
      Returns:
      The larger of the two specified half-precision values
    • less

      public static boolean less(short x, short y)
      Returns true if the first half-precision float value is less (smaller toward negative infinity) than the second half-precision float value. If either of the values is NaN, the result is false.
      Parameters:
      x - The first half-precision value
      y - The second half-precision value
      Returns:
      True if x is less than y, false otherwise
    • lessEquals

      public static boolean lessEquals(short x, short y)
      Returns true if the first half-precision float value is less (smaller toward negative infinity) than or equal to the second half-precision float value. If either of the values is NaN, the result is false.
      Parameters:
      x - The first half-precision value
      y - The second half-precision value
      Returns:
      True if x is less than or equal to y, false otherwise
    • greater

      public static boolean greater(short x, short y)
      Returns true if the first half-precision float value is greater (larger toward positive infinity) than the second half-precision float value. If either of the values is NaN, the result is false.
      Parameters:
      x - The first half-precision value
      y - The second half-precision value
      Returns:
      True if x is greater than y, false otherwise
    • greaterEquals

      public static boolean greaterEquals(short x, short y)
      Returns true if the first half-precision float value is greater (larger toward positive infinity) than or equal to the second half-precision float value. If either of the values is NaN, the result is false.
      Parameters:
      x - The first half-precision value
      y - The second half-precision value
      Returns:
      True if x is greater than or equal to y, false otherwise
    • equals

      public static boolean equals(short x, short y)
      Returns true if the two half-precision float values are equal. If either of the values is NaN, the result is false. POSITIVE_ZERO and NEGATIVE_ZERO are considered equal.
      Parameters:
      x - The first half-precision value
      y - The second half-precision value
      Returns:
      True if x is equal to y, false otherwise
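
      Unlike compare, the predicates above follow IEEE 754 comparison rules: any NaN operand makes the result false, and the two zeros are equal. A minimal sketch of equals and less with those rules, written against raw bit patterns; the class and the signedValue helper are hypothetical and not taken from FP16.

```java
// Sketch: NaN-aware equality and ordering on raw fp16 bit patterns.
final class Fp16PredicateSketch {
    static boolean isNaN(short h) {
        return (h & 0x7fff) > 0x7c00;
    }

    // Maps a non-NaN value to an int that orders like the number; both zeros map to 0.
    static int signedValue(short h) {
        int magnitude = h & 0x7fff;
        return (h & 0x8000) != 0 ? -magnitude : magnitude;
    }

    static boolean equals(short x, short y) {
        if (isNaN(x) || isNaN(y)) {
            return false; // NaN is not equal to anything, including itself
        }
        return signedValue(x) == signedValue(y);
    }

    static boolean less(short x, short y) {
        if (isNaN(x) || isNaN(y)) {
            return false;
        }
        return signedValue(x) < signedValue(y);
    }

    public static void main(String[] args) {
        System.out.println(equals((short) 0x0000, (short) 0x8000)); // +0 vs -0
        System.out.println(equals((short) 0x7e00, (short) 0x7e00)); // NaN vs NaN
    }
}
```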
    • isInfinite

      public static boolean isInfinite(short h)
      Returns true if the specified half-precision float value represents infinity, false otherwise.
      Parameters:
      h - A half-precision float value
      Returns:
      True if the value is positive infinity or negative infinity, false otherwise
    • isNaN

      public static boolean isNaN(short h)
      Returns true if the specified half-precision float value represents a Not-a-Number, false otherwise.
      Parameters:
      h - A half-precision float value
      Returns:
      True if the value is a NaN, false otherwise
    • isNormalized

      public static boolean isNormalized(short h)
      Returns true if the specified half-precision float value is normalized (does not have a subnormal representation). If the specified value is POSITIVE_INFINITY, NEGATIVE_INFINITY, POSITIVE_ZERO, NEGATIVE_ZERO, NaN or any subnormal number, this method returns false.
      Parameters:
      h - A half-precision float value
      Returns:
      True if the value is normalized, false otherwise
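
      The three classification checks above reduce to tests on the exponent and significand fields. A minimal sketch, assuming the standard binary16 encoding (exponent field all ones for infinity and NaN, all zeros for zero and subnormals); the class name is hypothetical and the code is not the FP16 source.

```java
// Sketch: classify a raw fp16 bit pattern by its exponent and significand fields.
final class Fp16ClassifySketch {
    static boolean isInfinite(short h) {
        // Exponent all ones, significand zero, either sign.
        return (h & 0x7fff) == 0x7c00;
    }

    static boolean isNaN(short h) {
        // Exponent all ones, significand non-zero.
        return (h & 0x7fff) > 0x7c00;
    }

    static boolean isNormalized(short h) {
        int exponentBits = h & 0x7c00;
        // All zeros means zero or subnormal; all ones means infinity or NaN.
        return exponentBits != 0 && exponentBits != 0x7c00;
    }

    public static void main(String[] args) {
        System.out.println(isNormalized((short) 0x3c00)); // 1.0
        System.out.println(isNormalized((short) 0x0001)); // smallest subnormal
    }
}
```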
    • toFloat

      public static float toFloat(short h)

      Converts the specified half-precision float value into a single-precision float value. The following special cases are handled:

      Parameters:
      h - The half-precision float value to convert to single-precision
      Returns:
      A normalized single-precision float value
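
      To illustrate what such a conversion involves, here is a hand-written sketch that widens a raw fp16 bit pattern to a float. It assumes the standard binary16 and binary32 encodings, uses a hypothetical class and method name, and is not the FP16 implementation (which may treat NaN payloads differently).

```java
// Sketch: widen a raw fp16 bit pattern to a single-precision float.
final class HalfToFloatSketch {
    static float halfToFloat(short h) {
        int bits = h & 0xffff;
        int sign = bits & 0x8000;
        int exponent = (bits >>> 10) & 0x1f;
        int significand = bits & 0x3ff;

        if (exponent == 0) {
            if (significand == 0) {
                // Signed zero: move the sign bit to the fp32 position.
                return Float.intBitsToFloat(sign << 16);
            }
            // Subnormal: value is significand * 2^-24, exactly representable in fp32.
            float value = significand * Math.scalb(1.0f, -24);
            return sign != 0 ? -value : value;
        }

        int outExponent = (exponent == 0x1f)
                ? 0xff                  // infinity or NaN keeps an all-ones exponent
                : exponent - 15 + 127;  // re-bias from fp16 (15) to fp32 (127)
        // The 10 significand bits become the top bits of the 23-bit fp32 field.
        return Float.intBitsToFloat((sign << 16) | (outExponent << 23) | (significand << 13));
    }

    public static void main(String[] args) {
        System.out.println(halfToFloat((short) 0x3c00)); // bit pattern of 1.0
        System.out.println(halfToFloat((short) 0x7bff)); // largest finite value
    }
}
```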
    • toHalf

      public static short toHalf(float f)

      Converts the specified single-precision float value into a half-precision float value. The following special cases are handled:

      Parameters:
      f - The single-precision float value to convert to half-precision
      Returns:
      A half-precision float value
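
      The narrowing direction has to deal with overflow, underflow and rounding. The sketch below is one possible approach, assuming round-to-nearest-even and the standard encodings; the class and method names are hypothetical, and the rounding mode and NaN bit pattern used by FP16 itself are not stated in this page.

```java
// Sketch: narrow a float to a raw fp16 bit pattern with round-to-nearest-even (assumed).
final class FloatToHalfSketch {
    static short floatToHalf(float f) {
        int bits = Float.floatToRawIntBits(f);
        int sign = (bits >>> 16) & 0x8000;
        int exponent = (bits >>> 23) & 0xff;
        int significand = bits & 0x7fffff;

        if (exponent == 0xff) {
            // Infinity, or NaN mapped to a quiet NaN pattern (the payload is dropped).
            return (short) (sign | 0x7c00 | (significand != 0 ? 0x200 : 0));
        }
        int e = exponent - 127 + 15; // re-bias from fp32 to fp16
        if (e >= 0x1f) {
            return (short) (sign | 0x7c00); // too large: overflow to infinity
        }
        if (e <= 0) {
            if (e < -10) {
                return (short) sign; // below half the smallest subnormal: signed zero
            }
            // Subnormal result: restore the implicit leading 1, then shift it into place.
            significand |= 0x800000;
            int shift = 14 - e;
            int half = significand >>> shift;
            int remainder = significand & ((1 << shift) - 1);
            int halfway = 1 << (shift - 1);
            if (remainder > halfway || (remainder == halfway && (half & 1) != 0)) {
                half++;
            }
            return (short) (sign | half);
        }
        int half = (e << 10) | (significand >>> 13);
        int remainder = significand & 0x1fff;
        if (remainder > 0x1000 || (remainder == 0x1000 && (half & 1) != 0)) {
            half++; // a carry into the exponent field gives the correct next value
        }
        return (short) (sign | half);
    }

    public static void main(String[] args) {
        System.out.println(Integer.toHexString(floatToHalf(1.0f) & 0xffff));
        System.out.println(Integer.toHexString(floatToHalf(0.1f) & 0xffff));
    }
}
```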
    • toHexString

      public static String toHexString(short h)

      Returns a hexadecimal string representation of the specified half-precision float value. If the value is a NaN, the result is "NaN", otherwise the result follows this format:

      • If the sign is positive, no sign character appears in the result
      • If the sign is negative, the first character is '-'
      • If the value is infinity, the string is "Infinity"
      • If the value is 0, the string is "0x0.0p0"
      • If the value has a normalized representation, the exponent and significand are represented in the string in two fields. The significand starts with "0x1." followed by its lowercase hexadecimal representation. Trailing zeroes are removed unless all digits are 0, then a single zero is used. The significand representation is followed by the exponent, represented by "p", itself followed by a decimal string of the unbiased exponent
      • If the value has a subnormal representation, the significand starts with "0x0." followed by its lowercase hexadecimal representation. Trailing zeroes are removed unless all digits are 0, then a single zero is used. The significand representation is followed by the exponent, represented by "p-14"
      Parameters:
      h - A half-precision float value
      Returns:
      A hexadecimal string representation of the specified value