The end of the road for kafka-delta-ingest
11 days ago
- #rust
- #delta-lake
- #data-ingestion
- kafka-delta-ingest was shut down after five years at Scribd, having achieved its goals and reduced streaming data ingestion costs by 95%.
- The project led to the creation of delta-rs, a successful open-source alternative to Apache Spark for Delta Lake table operations.
- A global team did the initial development in 2020, using Rust to lower operational costs.
- Videos shared by Christian detail the architecture and streaming systems, and highlight how significantly kafka-delta-ingest lowered costs.
- It was decommissioned because an even cheaper ingestion process was developed (the oxbow suite and a medallion architecture), which reduced costs to less than 10% of total data platform expenses.
- kafka-delta-ingest's dependence on Apache Kafka became a drawback as the other Kafka consumers at Scribd dwindled, which reduced its value proposition.
- Maintainers Kyjah Keyes and the author no longer use kafka-delta-ingest. They will continue delta-rs upgrades for API testing, but have no plans for major expansion.