Hasty Briefs (beta)

DuckLake v1.0

2 days ago
  • #database
  • #lakehouse
  • #data-lake
  • DuckLake v1.0 is a production-ready lakehouse format specification that stores metadata in a database rather than in files scattered across object storage.
  • The DuckDB ducklake extension serves as the reference implementation, supporting SQLite, PostgreSQL, and DuckDB as catalogs, and is now among DuckDB's top-10 core extensions.
  • Key features in v1.0 include data inlining for small operations, sorted tables for performance, bucket partitioning, geometry and variant type support, and experimental deletion vectors.
  • Community adoption includes clients for Apache DataFusion, Apache Spark, Trino, and Pandas, with production use at dozens of companies and a hosted service from MotherDuck.
  • Plans for DuckLake v1.1 include variant inlining and Puffin files that hold multiple deletion vectors, while v2.0 may focus on Git-like branching, permission-based roles, and incremental materialized views.
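
To make the first and third points concrete, here is a minimal sketch of the general idea: table metadata lives in a SQL database, and small writes are inlined into that database instead of producing a new data file. It uses Python's built-in sqlite3 module only. The table names (snapshot, data_file, inlined_row), the INLINE_ROW_LIMIT threshold, and the write_file callback are all hypothetical and chosen for illustration; they are not DuckLake's actual schema, settings, or API.

```python
import sqlite3

# Hypothetical threshold: inserts at or below this size are stored inline.
INLINE_ROW_LIMIT = 10


def open_catalog(path=":memory:"):
    """Create a toy metadata catalog (illustrative schema, not DuckLake's)."""
    con = sqlite3.connect(path)
    con.executescript(
        """
        CREATE TABLE snapshot (snapshot_id INTEGER PRIMARY KEY, note TEXT);
        CREATE TABLE data_file (snapshot_id INTEGER, path TEXT, row_count INTEGER);
        CREATE TABLE inlined_row (snapshot_id INTEGER, payload TEXT);
        """
    )
    return con


def commit_insert(con, rows, write_file):
    """Record an insert as one new snapshot inside a single transaction.

    Small inserts go straight into the catalog; larger ones are handed to
    write_file (a caller-supplied, hypothetical function that persists the
    rows somewhere and returns the resulting path).
    """
    with con:  # commits on success, rolls back if an exception is raised
        cur = con.execute(
            "INSERT INTO snapshot (note) VALUES (?)",
            (f"insert of {len(rows)} rows",),
        )
        snapshot_id = cur.lastrowid
        if len(rows) <= INLINE_ROW_LIMIT:
            con.executemany(
                "INSERT INTO inlined_row VALUES (?, ?)",
                [(snapshot_id, row) for row in rows],
            )
        else:
            path = write_file(snapshot_id, rows)
            con.execute(
                "INSERT INTO data_file VALUES (?, ?, ?)",
                (snapshot_id, path, len(rows)),
            )
    return snapshot_id


# Demo with a stand-in writer that only records the path it would write to.
written = []


def fake_writer(snapshot_id, rows):
    path = f"data/snapshot-{snapshot_id}.parquet"
    written.append(path)
    return path


con = open_catalog()
small = commit_insert(con, ["a", "b"], fake_writer)
large = commit_insert(con, [str(i) for i in range(1000)], fake_writer)
```

In this toy version the two-row insert never touches storage, while the 1000-row insert registers one file. Because the snapshot row and its file or inlined rows are written in the same transaction, a reader of the catalog sees either the whole change or none of it, which is the kind of property a database-backed catalog can offer.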