Hasty Briefsbeta

Bilingual

Techniques to beat Arrays.hashCode(byte[]) using Java's own means

10 months ago
  • #Hashing
  • #Java
  • #Optimization
  • Java's Arrays.hashCode(byte[]) computes a 32-bit hash value for byte arrays.
  • Default implementation in OpenJDK up to version 20 uses a simple loop, which is not highly optimized.
  • Loop unrolling and vectorization techniques (SWAR and SIMD) can significantly speed up hash computation.
  • SWAR-based implementation processes 8 bytes at once using 64-bit long integers and bit manipulation.
  • SIMD-based implementation uses Java's Vector API to process multiple bytes in parallel, outperforming intrinsic implementations.
  • Benchmark results show SWAR is up to 2.9x faster than default, and SIMD is faster than OpenJDK's intrinsic implementation.
  • Both approaches handle remaining bytes separately if the array length is not a multiple of the processing chunk size.
  • The findings suggest OpenJDK's current implementations could be improved with these techniques.