TreeStore: Endowing Your Data with Hierarchical Structure
14 days ago
- #data-storage
- #hierarchical-data
- #compression
- TreeStore is a feature in the blosc2 library for organizing compressed arrays hierarchically, similar to a filesystem.
- It supports storing blosc2.NDArray or blosc2.SChunk objects along with metadata, saved with a .b2z extension.
- Basic usage involves creating and populating a TreeStore with datasets and groups using path-like syntax.
- To read a TreeStore, open it in read mode and access datasets and metadata with the same path-like keys.
- Advanced features include attaching variable-length metadata (vlmeta) to groups or the root and working with subtrees.
- TreeStore supports iterating through nodes to inspect contents, distinguishing between datasets and groups.
- Benchmarks against HDF5 and Zarr show TreeStore to be efficient in both storage size and performance, especially for large datasets.
- TreeStore is currently in beta, with feedback and suggestions welcomed for further improvements.
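
To make the path-like organization concrete, here is a minimal pure-Python sketch of a path-keyed hierarchical store. It is a toy model only: it does not use blosc2, applies no compression, and says nothing about how TreeStore is implemented. The names `ToyTree`, `meta`, `groups`, `walk` and `subtree` are hypothetical, chosen for illustration, and the treatment of groups as implicit ancestors of dataset paths is an assumption of this sketch.

```python
class ToyTree:
    """Toy in-memory model of a path-keyed hierarchical store (NOT blosc2)."""

    def __init__(self):
        self._data = {}  # full path -> dataset object
        self.meta = {}   # group path -> dict of metadata (hypothetical)

    @staticmethod
    def _norm(path):
        # Collapse repeated or trailing slashes: "group1//x/" -> "/group1/x"
        parts = [p for p in path.split("/") if p]
        return "/" + "/".join(parts)

    def __setitem__(self, path, value):
        self._data[self._norm(path)] = value

    def __getitem__(self, path):
        return self._data[self._norm(path)]

    def groups(self):
        # In this sketch, every proper ancestor of a dataset is a group.
        found = set()
        for path in self._data:
            parts = path.split("/")[1:-1]
            for i in range(1, len(parts) + 1):
                found.add("/" + "/".join(parts[:i]))
        return sorted(found)

    def walk(self):
        # Yield (path, kind) pairs, where kind is "group" or "dataset".
        nodes = {g: "group" for g in self.groups()}
        nodes.update({d: "dataset" for d in self._data})
        yield from sorted(nodes.items())

    def subtree(self, prefix):
        # Return a new tree whose keys are relative to `prefix`.
        prefix = self._norm(prefix)
        root = "" if prefix == "/" else prefix
        sub = ToyTree()
        for path, value in self._data.items():
            if path.startswith(root + "/"):
                sub[path[len(root):]] = value
        return sub


tree = ToyTree()
tree["/dataset0"] = list(range(5))
tree["/group1/dataset1"] = [0.0] * 3
tree["/group1/dataset2"] = [1.0, 2.0]
tree.meta["/group1"] = {"desc": "hypothetical group metadata"}

nodes = list(tree.walk())
sub = tree.subtree("/group1")
```

In this sketch, `nodes` lists the one group and three datasets with their kinds, and `sub["/dataset1"]` returns the same object as `tree["/group1/dataset1"]`, which mirrors the idea of working with a subtree through relative paths.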