dsq - jq for datasets
A high-performance command-line tool for querying and transforming structured data across multiple formats
Key Features
High Performance
Built on Polars DataFrames with lazy evaluation and columnar operations for lightning-fast data processing
Format Flexibility
Supports Parquet, Avro, CSV, JSON Lines, Arrow, and more with automatic format detection
User-Friendly
Intuitive jq-inspired syntax with interactive REPL mode and clear error messages
Supported Formats
CSV/TSV, Parquet, JSON/JSON Lines, Arrow, Avro, ASCII Delimited Text
Perfect For
Data Analysts
Quick data exploration and transformation
Developers
Pipeline integration and data processing
Data Engineers
ETL workflows and format conversion
Researchers
Dataset analysis and manipulation
Get Started
Install via cargo:
$ cargo install dsq-cli
Or download pre-compiled binaries from GitHub
Transform data
# Input: employees.csv
# id,name,age,city,salary,department
# 1,Alice Johnson,28,New York,75000,Engineering
# 2,Bob Smith,34,Los Angeles,82000,Sales
$ dsq 'map(.salary += 5000) | map({name, new_salary: .salary, department})' employees.csv
[
{"department": "Engineering", "name": "Alice Johnson", "new_salary": 80000},
{"department": "Sales", "name": "Bob Smith", "new_salary": 87000},
...
]
Group and aggregate
$ dsq 'group_by(.department) | map({dept: .[0].department, count: length, avg_salary: (map(.salary) | add / length)})' employees.csv
[
{"avg_salary": 90666.67, "count": 3, "dept": "Engineering"},
{"avg_salary": 63500.0, "count": 2, "dept": "HR"},
...
]
Convert formats
# CSV to Parquet
$ dsq '.' data.csv -o output.parquet
# JSON Lines to CSV
$ dsq '.' data.jsonl -o output.csv
# Parquet to JSON
$ dsq '.' data.parquet -o output.json
Cross-platform support for Linux, macOS, and Windows
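Reading the filters
For readers less familiar with jq syntax, the "Transform data" and "Group and aggregate" filters shown earlier can be read as the plain-Python sketch below. It only illustrates what the filters compute and says nothing about how dsq runs them internally. The function names raise_and_project and summarize_by_department are hypothetical, and the data is limited to the two employees.csv rows shown in the sample input.

```python
from collections import defaultdict

# The two rows shown in the employees.csv sample; the rest of the file is not reproduced here.
EMPLOYEES = [
    {"id": 1, "name": "Alice Johnson", "age": 28, "city": "New York",
     "salary": 75000, "department": "Engineering"},
    {"id": 2, "name": "Bob Smith", "age": 34, "city": "Los Angeles",
     "salary": 82000, "department": "Sales"},
]


def raise_and_project(rows, amount=5000):
    """Rough equivalent of:
    map(.salary += 5000) | map({name, new_salary: .salary, department})
    """
    return [
        {
            "name": row["name"],
            "new_salary": row["salary"] + amount,
            "department": row["department"],
        }
        for row in rows
    ]


def summarize_by_department(rows):
    """Rough equivalent of:
    group_by(.department)
      | map({dept: .[0].department, count: length,
             avg_salary: (map(.salary) | add / length)})
    """
    salaries_by_dept = defaultdict(list)
    for row in rows:
        salaries_by_dept[row["department"]].append(row["salary"])
    # jq's group_by emits groups sorted by key, so the sketch sorts too.
    return [
        {"dept": dept, "count": len(salaries), "avg_salary": sum(salaries) / len(salaries)}
        for dept, salaries in sorted(salaries_by_dept.items())
    ]


if __name__ == "__main__":
    print(raise_and_project(EMPLOYEES))
    print(summarize_by_department(EMPLOYEES))
```

With only the two sample rows, each department holds a single employee, so the averages equal the individual salaries. The figures in the dsq output shown earlier come from the full file.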