Parallelizing ClickHouse aggregation merge for fixed hash map
- #Database Optimization
- #Parallel Processing
- #ClickHouse
- Performance variation in ClickHouse queries due to different GROUP BY key types (UInt16 vs UInt64).
- For small key types (e.g. UInt16), ClickHouse aggregation uses a flat array as the hash map, since the key can index the table directly; larger keys (UInt64) fall back to a standard or two-level hash map.
- The merge step for fixed hash map aggregation can be parallelized by assigning each thread a disjoint subset of the group keys.
- An initial range-based segmentation of keys across threads was ineffective; an alternative key-distribution scheme improved performance.
- Thread-unsafe use of Arena during the aggregation merge caused memory corruption.
- DB::Arena's fast memory allocation strategy for short-lived objects in query execution.
- Trivial aggregation functions (count/sum/min/max) regressed because the min/max index optimization, which limits iteration to the occupied slot range, was disabled during the parallel merge.
- Solution: pre-extract the min/max occupied indices before the parallel merge so iteration stays limited to the occupied range.
- ClickHouse CI performance tests and differential flame graphs help identify small performance penalties.
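The array-as-hash-map idea for small keys can be illustrated with a minimal sketch. This is not ClickHouse's actual `FixedHashMap` (the type name, fields, and methods below are illustrative assumptions); it only shows why a UInt16 key needs no hashing at all: the key itself is the array index.

```cpp
#include <array>
#include <cstdint>
#include <optional>

// Illustrative sketch, not ClickHouse's real FixedHashMap: a UInt16 key
// indexes directly into a 65536-slot array, so lookup and insert are a
// single indexed access -- no hash function, no collision probing.
struct FixedMapU16
{
    std::array<uint64_t, 65536> values{}; // one aggregate state per key (here: a running sum)
    std::array<bool, 65536> occupied{};   // which slots actually hold a group

    void add(uint16_t key, uint64_t delta)
    {
        values[key] += delta; // the key IS the index
        occupied[key] = true;
    }

    std::optional<uint64_t> get(uint16_t key) const
    {
        if (!occupied[key])
            return std::nullopt;
        return values[key];
    }
};
```

The trade-off is that the table always occupies the full key-space footprint (65536 slots here), which is why this layout only pays off for small key types.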
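Combining two of the points above, a parallel merge over disjoint key subsets plus a pre-extracted occupied range might look like the following sketch. All names here are assumptions, the "value 0 means empty" convention is a simplification, and for clarity the keys are split into simple contiguous ranges, even though the post notes that a plain range-based segmentation was not the distribution that ultimately performed well.

```cpp
#include <algorithm>
#include <array>
#include <cstdint>
#include <thread>
#include <vector>

constexpr size_t MAP_SIZE = 65536;
using FlatMap = std::array<uint64_t, MAP_SIZE>; // sketch: value 0 means "slot empty"

// Merge several per-thread flat maps into dst. Each worker owns a disjoint
// slice of the key space, so no two threads ever write the same dst slot
// and no locking is required.
void parallel_merge(FlatMap & dst, const std::vector<FlatMap> & srcs, size_t num_threads)
{
    // Pre-extract the occupied [lo, hi] index range once, before spawning
    // workers, so every thread iterates only the occupied portion.
    size_t lo = MAP_SIZE, hi = 0;
    for (const auto & src : srcs)
        for (size_t i = 0; i < MAP_SIZE; ++i)
            if (src[i])
            {
                lo = std::min(lo, i);
                hi = std::max(hi, i);
            }
    if (lo > hi)
        return; // all source maps empty

    const size_t span = hi - lo + 1;
    const size_t chunk = (span + num_threads - 1) / num_threads;

    std::vector<std::thread> workers;
    for (size_t t = 0; t < num_threads; ++t)
    {
        const size_t begin = lo + t * chunk;
        const size_t end = std::min(begin + chunk, hi + 1);
        if (begin >= end)
            break;
        workers.emplace_back([&, begin, end]
        {
            for (const auto & src : srcs)
                for (size_t i = begin; i < end; ++i)
                    dst[i] += src[i]; // disjoint ranges: race-free by construction
        });
    }
    for (auto & w : workers)
        w.join();
}
```

Extracting the bounds before the merge is exactly the fix described above: without it, each worker would have to scan all 65536 slots, which is what penalized trivial aggregates like count/sum/min/max.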
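The Arena hazard can also be sketched. This is a simplified bump-pointer allocator in the spirit of `DB::Arena`, not its real implementation (no chunk growth, no alignment handling): allocation is just a pointer increment, and all memory is released at once when the arena dies, which is what makes it fast for short-lived per-query objects.

```cpp
#include <cstddef>
#include <vector>

// Simplified sketch of a bump-pointer arena (not the real DB::Arena).
class Arena
{
public:
    explicit Arena(size_t capacity) : buf(capacity), head(0) {}

    // NOT thread-safe: two threads reading and advancing `head` concurrently
    // can be handed overlapping regions -- the kind of silent memory
    // corruption described above when merge threads shared one arena.
    char * alloc(size_t size)
    {
        if (head + size > buf.size())
            return nullptr; // sketch: no new-chunk growth
        char * ptr = buf.data() + head;
        head += size;
        return ptr;
    }

private:
    std::vector<char> buf;
    size_t head; // bump pointer; unsynchronized on purpose for speed
};
```

The implied remedy in a parallel merge is to keep arena usage thread-local (one arena per worker) or otherwise serialize allocations, so variable-sized aggregate states never bump a shared head pointer concurrently.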