Apache Spark 4.0
a year ago
- #Open Source
- #Apache Spark
- #Big Data
- Apache Spark 4.0.0 is the first release in the 4.x series, with more than 390 contributors resolving over 5,100 tickets.
- Key improvements in Spark Connect, Spark SQL, PySpark, and Structured Streaming enhance functionality and developer experience.
- New features include the VARIANT data type, SQL user-defined functions, a native plotting API for PySpark, and the Arbitrary State API v2 (the `transformWithState` operator) for Structured Streaming.
- Dependencies are updated substantially, including upgrades to Hadoop, Hive, and various Java libraries.
- The release includes numerous API enhancements, bug fixes, and performance optimizations across all modules.
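
To make two of the features listed above more concrete, here is a minimal PySpark sketch of the VARIANT type and a SQL user-defined function. It is an illustration under assumptions, not taken from the release notes: the names `DEMO_STATEMENTS`, `run_demo`, and `to_celsius` and the sample JSON payload are hypothetical, and the statements assume a running Spark 4.0 session.

```python
# Hypothetical demo: the names DEMO_STATEMENTS, run_demo, and to_celsius
# are illustrative and do not come from the Spark release notes.
DEMO_STATEMENTS = [
    # VARIANT: parse a JSON string into a VARIANT value, then extract one
    # field from it as a typed column with variant_get.
    "SELECT variant_get("
    "parse_json('{\"device\": \"sensor-1\", \"temp_f\": 212}'), "
    "'$.temp_f', 'int') AS temp_f",
    # SQL UDF: a function whose body is a plain SQL expression.
    "CREATE OR REPLACE TEMPORARY FUNCTION to_celsius(f DOUBLE) "
    "RETURNS DOUBLE RETURN (f - 32) * 5 / 9",
    # Call the SQL UDF like any built-in function.
    "SELECT to_celsius(CAST(212 AS DOUBLE)) AS temp_c",
]


def run_demo(spark):
    """Run each statement on the given SparkSession and return the collected rows."""
    return [spark.sql(statement).collect() for statement in DEMO_STATEMENTS]
```

To try it, create a session with `SparkSession.builder.getOrCreate()` from `pyspark.sql` and pass it to `run_demo`. Keeping the statements as plain SQL strings means the same text can also be pasted into the `spark-sql` shell.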