https://github.com/vschwaberow/draxu
draxu is a cross-platform C++23 command-line utility that reshapes semi-structured data with predictable resource usage. Release 0.1.0 focuses on fast, deterministic transformations for automation pipelines that ingest JSON, NDJSON, CSV, and TSV sources and emit JSON arrays, NDJSON lines, or CSV tables.
- Host: GitHub
- URL: https://github.com/vschwaberow/draxu
- Owner: vschwaberow
- License: mit
- Created: 2025-09-19T06:12:27.000Z (7 months ago)
- Default Branch: main
- Last Pushed: 2025-09-19T06:15:24.000Z (7 months ago)
- Last Synced: 2025-09-19T08:29:00.966Z (7 months ago)
- Topics: cpp23, csv, json
- Language: C++
- Homepage:
- Size: 24.4 KB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- License: LICENSE
- Security: SECURITY.md
# draxu

## Overview
"draxu" is a cross-platform C++23 command-line utility that reshapes semi-structured data with predictable resource usage. Release 0.1.0 focuses on fast, deterministic transformations for automation pipelines that ingest JSON, NDJSON, CSV, and TSV sources and emit JSON arrays, NDJSON lines, or CSV tables.
## Table of Contents
1. [Highlights](#highlights)
2. [Architecture](#architecture)
3. [Quick Start](#quick-start)
4. [Commands and Expressions](#commands-and-expressions)
5. [Runtime Limits](#runtime-limits)
6. [Testing](#testing)
7. [Documentation](#documentation)
8. [Security](#security)
9. [Roadmap](#roadmap)
10. [Contributing](#contributing)
11. [License](#license)
## Highlights
- Deterministic command surface: `select` for dot-path projections, `convert` for format translation.
- Streaming readers for JSON, NDJSON, CSV, and TSV backed by simdjson and fast-cpp-csv-parser.
- Writers for JSON arrays, NDJSON lines, and CSV tables with consistent quoting and flushing semantics.
- All literals and diagnostics are defined once in `src/consts.hh` and emitted via UTF-8-aware utilities.
- Guardrails for max bytes, items, and JSON depth to keep workloads safe by default across Linux, macOS, and Windows.
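As an illustration of what "consistent quoting" can mean for a CSV writer, here is a minimal sketch of RFC 4180-style field quoting. The function names `quote_csv_field` and `write_csv_row` are hypothetical and not taken from draxu's sources, and the `\n` record terminator is an assumption; the actual writer may differ.

```cpp
#include <cstddef>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical helper (not from draxu): quote a field only when it contains
// the delimiter, a double quote, or a line break.
std::string quote_csv_field(std::string_view field, char delimiter = ',') {
    const bool needs_quotes =
        field.find_first_of("\"\r\n") != std::string_view::npos ||
        field.find(delimiter) != std::string_view::npos;
    if (!needs_quotes) {
        return std::string(field);
    }
    std::string out;
    out.reserve(field.size() + 2);
    out.push_back('"');
    for (char c : field) {
        if (c == '"') out.push_back('"');  // escape a quote by doubling it
        out.push_back(c);
    }
    out.push_back('"');
    return out;
}

// Quote each field as needed and join them into one record ending in '\n'
// (the terminator is an assumption of this sketch).
std::string write_csv_row(const std::vector<std::string>& fields,
                          char delimiter = ',') {
    std::string row;
    for (std::size_t i = 0; i < fields.size(); ++i) {
        if (i != 0) row.push_back(delimiter);
        row += quote_csv_field(fields[i], delimiter);
    }
    row.push_back('\n');
    return row;
}
```

Applying the same rule to every field means the same input always yields byte-identical output, which is what deterministic pipelines rely on.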
## Architecture
```
┌──────────┐   ┌────────────┐   ┌──────────┐   ┌──────────────┐
│   cli    │ → │  pipeline  │ → │  writer  │ → │ destinations │
└──────────┘   └────────────┘   └──────────┘   └──────────────┘
     │               │               ▲
     │               │               │
     ▼               ▼               │
 limits/io  readers (json, csv)      │
     │               │               │
     └──── expr engine (dot/index) ──┘
```
Modules live in `src/`, tests in `tests/`, reusable CMake configuration in `cmake/`, and workflow automation under `.github/workflows/`.
## Quick Start
```bash
cmake -S . -B build -DDRAXU_BUILD_TESTS=ON -DDRAXU_WARN_AS_ERRORS=ON -DCMAKE_BUILD_TYPE=Release
cmake --build build --config Release
./build/draxu convert --in-format csv --out-format json --input data/sample.csv --output data/sample.json
```
Use `-` for stdin/stdout streaming. Install with `cmake --install build --config Release` if you want a system-level binary.
## Commands and Expressions
- `--version`: Print tool name, version (0.1.0), and author information.
- `convert`: Pass records through while changing the container format.
- `select`: Apply an expression per record and emit the resulting scalar.
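To make "changing the container format" concrete, the sketch below re-wraps NDJSON input as a JSON array without modifying the records themselves. It is only an illustration: `ndjson_to_json_array` is a hypothetical name, the function does not validate JSON, and draxu itself is described as parsing input with simdjson rather than splicing strings.

```cpp
#include <sstream>
#include <string>

// Hypothetical illustration (not draxu's implementation): treat each
// non-blank line as one opaque record and join the records with commas.
std::string ndjson_to_json_array(const std::string& ndjson) {
    std::istringstream in(ndjson);
    std::string line;
    std::string out = "[";
    bool first = true;
    while (std::getline(in, line)) {
        // Tolerate CRLF line endings.
        if (!line.empty() && line.back() == '\r') line.pop_back();
        // Skip blank or whitespace-only lines.
        if (line.find_first_not_of(" \t") == std::string::npos) continue;
        if (!first) out += ',';
        out += line;
        first = false;
    }
    out += ']';
    return out;
}
```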
The CLI operates on UTF-8 text end-to-end; parameters and outputs are handled as UTF-8 code units (`char8_t`) where applicable.
Expression syntax:
- JSON fields: `.name`, `.user.email`.
- JSON arrays: `.[0]`, `.[2]`.
- CSV/TSV columns: `.[0]`, `.[1]`.
- Identity: omit `--expr` or pass `.`.
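The expression forms listed above can be modelled as a list of steps, each either a field name or an index. The following parser is a minimal sketch under that assumption; `PathStep` and `parse_path` are hypothetical names, and details such as whether fields and indices may be chained (for example `.items.[0]`) are guesses rather than documented draxu behaviour.

```cpp
#include <cctype>
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>
#include <variant>
#include <vector>

// One step of a path: an object key (.name) or a position (.[0]).
using PathStep = std::variant<std::string, std::size_t>;

// Hypothetical parser (not from draxu). Returns std::nullopt on malformed
// input; "" and "." both yield an empty path, meaning identity.
std::optional<std::vector<PathStep>> parse_path(std::string_view expr) {
    std::vector<PathStep> steps;
    if (expr.empty() || expr == ".") return steps;
    std::size_t i = 0;
    while (i < expr.size()) {
        if (expr[i] != '.') return std::nullopt;  // every step starts with '.'
        ++i;
        if (i >= expr.size()) return std::nullopt;  // trailing '.'
        if (expr[i] == '[') {
            ++i;
            std::size_t index = 0;
            std::size_t digits = 0;
            while (i < expr.size() &&
                   std::isdigit(static_cast<unsigned char>(expr[i]))) {
                index = index * 10 + static_cast<std::size_t>(expr[i] - '0');
                ++i;
                ++digits;
            }
            if (digits == 0 || i >= expr.size() || expr[i] != ']') {
                return std::nullopt;  // empty, non-numeric, or unclosed index
            }
            ++i;
            steps.emplace_back(index);
        } else {
            const std::size_t start = i;
            while (i < expr.size() && expr[i] != '.') ++i;
            if (i == start) return std::nullopt;  // empty key, as in "..a"
            steps.emplace_back(std::string(expr.substr(start, i - start)));
        }
    }
    return steps;
}
```

For CSV and TSV input, an index step would select a column of the current row; for JSON, it would select an array element.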
Supported formats:
- Inputs: `json`, `ndjson`, `csv`, `tsv`.
- Outputs: `json`, `ndjson`, `csv`.
## Runtime Limits
Defaults favour safety:
- `--max-bytes 8388608`
- `--max-items 1000000`
- `--max-depth 64`
Exit codes communicate status: `0` success, `2` input error, `4` limit reached, `5` system failure.
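The defaults and exit codes above could be represented in code roughly as follows. This is a sketch, not draxu's implementation: the type and member names (`ExitCode`, `Limits`, `LimitTracker`) are hypothetical, and treating each limit as inclusive (exactly 8388608 bytes is still allowed) is an assumption.

```cpp
#include <cstdint>

// Exit codes as listed in this README; the enumerator names are hypothetical.
enum class ExitCode : int {
    Success = 0,
    InputError = 2,
    LimitReached = 4,
    SystemFailure = 5,
};

// Default values mirror the documented flags.
struct Limits {
    std::uint64_t max_bytes = 8'388'608;  // --max-bytes (8 MiB)
    std::uint64_t max_items = 1'000'000;  // --max-items
    std::uint32_t max_depth = 64;         // --max-depth
};

// Running counters checked against the configured limits.
class LimitTracker {
public:
    explicit LimitTracker(Limits limits) : limits_(limits) {}

    // Record n consumed bytes; false once the byte budget is exceeded.
    bool add_bytes(std::uint64_t n) {
        bytes_ += n;
        return bytes_ <= limits_.max_bytes;
    }

    // Record one emitted item; false once the item budget is exceeded.
    bool add_item() { return ++items_ <= limits_.max_items; }

    // Check the current JSON nesting depth against the limit.
    bool depth_ok(std::uint32_t depth) const {
        return depth <= limits_.max_depth;
    }

private:
    Limits limits_;
    std::uint64_t bytes_ = 0;
    std::uint64_t items_ = 0;
};
```

A caller would map a `false` result to `ExitCode::LimitReached` (4), which lets automation distinguish a guardrail trip from malformed input (2).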
## Testing
```bash
cmake --build build --target draxu_tests --config Release
ctest --test-dir build --output-on-failure --build-config Release
```
GitHub Actions (`CI`) runs a three-platform matrix (Linux, macOS, Windows) and a dedicated sanitizers job for AddressSanitizer and UndefinedBehaviorSanitizer on Linux.
## Documentation
- [User guide](docs/USER_GUIDE.md)
- [Security policy](SECURITY.md)
- [Changelog](CHANGELOG.md)
## Security
The project builds with strict warnings (`-Wall -Wextra`) and hardening flags (`-fstack-protector-strong -D_FORTIFY_SOURCE=2`), and enforces sanitizer runs in CI. Please review [SECURITY.md](SECURITY.md) for reporting instructions and threat model details.
## Roadmap
Release 0.2.0 is planned to add sorting, unique filtering, streaming joins, YAML and XML readers, and optional JSON schema validation. Contributions and proposals are welcome via issues or pull requests.
## Contributing
- Follow Google C++ style with the bundled `.clang-format` and `.clang-tidy` settings.
- Place command and expression examples alongside new features in docs and tests.
- Run the full test matrix before submitting a pull request; include sanitizer results for complex changes.
## License
Distributed under the MIT License. See [LICENSE](LICENSE).