Hasty Briefs


4-bit floating point FP4

  • #Low-Precision Computing
  • #Neural Networks
  • #Floating Point Precision
  • Historically, floating point computing standardized on 32-bit and 64-bit precision: C names these 'float' and 'double', while Python's 'float' is 64-bit.
  • Neural networks drive demand for lower precision floating point formats like 16-bit, 8-bit (FP8), and 4-bit (FP4) to fit more parameters in memory.
  • FP4 uses one bit for the sign and splits the remaining three bits between exponent and mantissa; in the ExMy naming, common layouts are E3M0, E2M1, E1M2, and E0M3, with E2M1 the most widely supported.
  • The value of a normal E2M1 number (e ≠ 0) is (−1)^s * 2^(e−b) * (1 + m/2), where the bias b (1 for E2M1) shifts the stored exponent so both positive and negative exponents are representable; when e = 0 the number is subnormal, with value (−1)^s * 2^(1−b) * (m/2).
  • A 16-value table for E2M1 format shows representable numbers from -6 to +6, including ±0, illustrating the distribution on log and linear scales.
  • The Pychop library emulates reduced-precision formats like FP4, with code provided to generate and display the E2M1 table of values.
  • Future discussions may cover other 4-bit formats like NF4, which better match the weight distributions in large language models (LLMs).
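The encoding rules above can be checked with a short decoder. This is an illustrative sketch, not the Pychop code the article refers to, and it assumes the usual bias of 1 for E2M1:

```python
# Hypothetical E2M1 decoder (1 sign, 2 exponent, 1 mantissa bit).
def e2m1_value(bits: int) -> float:
    s = (bits >> 3) & 0b1   # sign bit
    e = (bits >> 1) & 0b11  # 2-bit exponent field
    m = bits & 0b1          # 1-bit mantissa field
    bias = 1
    if e == 0:
        # subnormal: fixed exponent 1 - bias, no implicit leading 1
        mag = 2.0 ** (1 - bias) * (m / 2)
    else:
        # normal: (-1)^s * 2^(e - bias) * (1 + m/2)
        mag = 2.0 ** (e - bias) * (1 + m / 2)
    return -mag if s else mag

# Print all 16 bit patterns; magnitudes run 0, 0.5, 1, 1.5, 2, 3, 4, 6.
for bits in range(16):
    print(f"{bits:04b} -> {e2m1_value(bits):+.1f}")
```

Enumerating the positive half reproduces the eight magnitudes of the 16-value table described above.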
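To see how such a format would be applied to neural-network weights, here is a minimal round-to-nearest quantizer onto the E2M1 value grid. The grid constant and helper name are illustrative, not from the article or any library:

```python
# The 8 non-negative E2M1 magnitudes (bias 1); negating them gives the
# full set of representable values from -6 to +6.
E2M1_POSITIVE = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]
E2M1_GRID = sorted({v for v in E2M1_POSITIVE} | {-v for v in E2M1_POSITIVE})

def quantize_e2m1(x: float) -> float:
    """Round x to the nearest representable E2M1 value (saturates at +/-6)."""
    return min(E2M1_GRID, key=lambda v: abs(v - x))

weights = [0.07, -0.9, 1.7, 5.2, 8.0]
print([quantize_e2m1(w) for w in weights])  # -> [0.0, -1.0, 1.5, 6.0, 6.0]
```

Note how the spacing between representable values doubles past each power of two, which is the log-scale distribution the article's table illustrates.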