Hasty Briefsbeta

CedarDB: The fastest database you've never heard of

11 days ago
  • #database
  • #modern-hardware
  • #performance
  • CedarDB is a modern database designed from scratch to leverage contemporary hardware, unlike traditional databases like Postgres and MySQL which were built over 30 years ago.
  • Originating from the Umbra research project at TUM, CedarDB focuses on performance optimizations for modern multi-core CPUs and large memory capacities.
  • Key innovations include an advanced query optimizer capable of unnesting deeply nested SQL statements, reducing execution time from minutes to seconds.
  • CedarDB employs code generation for SQL queries, converting them into machine code to eliminate interpretation overhead, significantly speeding up query execution.
  • Morsel-driven parallelism ensures efficient CPU core utilization by dynamically distributing small data chunks ('morsels') among cores, keeping all cores busy.
  • The database features a modern buffer manager designed for multi-threaded environments, using Pointer Swizzling to avoid global locks and maximize storage bandwidth.
  • CedarDB's architecture is built to anticipate future hardware changes, with pluggable interfaces for new storage types and workloads, such as vector databases.
  • Adaptive Query Execution allows CedarDB to start executing queries immediately with a basic version while optimizing more complex versions in the background.
  • The database supports sophisticated statistics for the optimizer, enabling early data filtering and accurate estimates for operations like GROUP BY and MIN aggregations.
  • CedarDB is positioned as a 'beyond main memory' system, optimizing for in-memory speeds but gracefully degrading performance when data exceeds RAM capacity.