Datasetq: jq for Datasets; Polars-powered Parquet/JSON/CSV query lang/cli
4 days ago
- #data-processing
- #jq-syntax
- #Polars
- dsq (Datasetq) is a high-performance data processing tool that extends jq-like syntax to structured data formats.
- Supports multiple formats, including Parquet, Avro, CSV, JSON Lines, and Arrow, with automatic format detection.
- Built on Polars for fast data manipulation with lazy evaluation and efficient memory usage.
- Offers familiar jq-inspired filter syntax extended for tabular data operations.
- Provides correct type handling and clear error messages.
- Available for download on Linux, macOS, and Windows; can also be installed via the Rust toolchain.
- Features include format conversion, data aggregation, filtering, and transformation.
- Supports lazy evaluation for large datasets and includes an interactive REPL.
- Includes commands for inspecting data, merging datasets, and generating completions.
- Detailed documentation covers architecture, functions, formats, API, and configuration.
- Contributions are encouraged with a focus on compatibility, testing, and documentation.
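
To make the "jq-style filter with lazy evaluation" idea from the summary concrete, here is a minimal sketch in plain Python. It is not dsq's syntax or implementation and does not use Polars: Python generators stand in for a lazy query plan, and the helper names (`read_json_lines`, `select`, `project`, `run_demo`) and the sample rows are hypothetical. The pipeline is roughly what the jq filter `select(.age > 30) | {name, age}` would do over JSON Lines input.

```python
import io
import json
from typing import Callable, Iterable, Iterator, List, TextIO

Row = dict


def read_json_lines(stream: TextIO) -> Iterator[Row]:
    """Yield one parsed row at a time; nothing is read until a consumer asks."""
    for line in stream:
        line = line.strip()
        if line:
            yield json.loads(line)


def select(rows: Iterable[Row], predicate: Callable[[Row], bool]) -> Iterator[Row]:
    """Keep only rows matching the predicate (analogous to jq's select)."""
    return (row for row in rows if predicate(row))


def project(rows: Iterable[Row], fields: List[str]) -> Iterator[Row]:
    """Keep only the named fields of each row (analogous to jq's {name, age})."""
    return ({field: row.get(field) for field in fields} for row in rows)


def run_demo() -> List[Row]:
    # Hypothetical sample data standing in for a JSON Lines file.
    data = io.StringIO(
        '{"name": "Ada", "age": 36, "city": "London"}\n'
        '{"name": "Bob", "age": 25, "city": "Paris"}\n'
        '{"name": "Cy", "age": 41, "city": "Oslo"}\n'
    )
    # Building the pipeline does no work yet; rows flow only when consumed.
    pipeline = project(
        select(read_json_lines(data), lambda row: row["age"] > 30),
        ["name", "age"],
    )
    return list(pipeline)


if __name__ == "__main__":
    for row in run_demo():
        print(json.dumps(row))
```

Because each stage is a generator, only one row is held in memory at a time. A real lazy engine such as Polars goes further by optimizing the whole query plan before execution, which this sketch does not attempt.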