Hasty Briefs (beta)

StringWa.rs on GPUs: Databases and Bioinformatics

12 hours ago
  • #CUDA
  • #bioinformatics
  • #string-processing
  • StringZilla v4 is now CUDA-capable, making it fast on both CPUs and GPUs.
  • The release includes GPU-accelerated string similarity kernels for Levenshtein distances and bioinformatics applications.
  • New features include non-cryptographic hash functions, string PRNGs, and sorting algorithms for large string collections.
  • Performance benchmarks show significant speed improvements over existing libraries such as NLTK and cuDF.
  • The library supports dynamic dispatch for different architectures and is available for multiple programming languages.
  • MinHash signatures are now computed using 52-bit integers, optimized for both CPU and GPU performance.
  • StringZilla v4 also introduces high-throughput random string generation using AES instructions.
  • The release includes optimizations for batch operations like sorting, with significant speedups over standard libraries.
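
For readers unfamiliar with the Levenshtein distance mentioned above, here is a minimal reference sketch of the classic Wagner-Fischer dynamic program in plain Python. It is not StringZilla's kernel or API; the function name `levenshtein` is a hypothetical helper chosen for illustration, and the code assumes unit costs for insertion, deletion, and substitution.

```python
def levenshtein(a: str, b: str) -> int:
    """Unit-cost edit distance via the Wagner-Fischer dynamic program.

    Illustrative reference only; not taken from StringZilla.
    """
    # Keep the shorter string in the inner loop so each row stays small.
    if len(a) < len(b):
        a, b = b, a

    # previous[j] holds the distance between the processed prefix of `a`
    # and the first j characters of `b`. Row 0 is the cost of j insertions.
    previous = list(range(len(b) + 1))

    for i, char_a in enumerate(a, start=1):
        # Column 0 is the cost of deleting the first i characters of `a`.
        current = [i]
        for j, char_b in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,                      # deletion from `a`
                current[j - 1] + 1,                   # insertion into `a`
                previous[j - 1] + (char_a != char_b), # substitution or match
            ))
        previous = current

    return previous[-1]


if __name__ == "__main__":
    print(levenshtein("kitten", "sitting"))
```

Each cell depends only on its left, upper, and upper-left neighbours, so cells on the same anti-diagonal are independent of one another. That property is what parallel implementations commonly exploit, although this sketch runs sequentially.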