https://github.com/nicopon/dtpipe
A simple, self-contained CLI for performance-focused data streaming & anonymization.
cli csv data-masking database dotnet duckdb etl oracle parquet postgresql sql-server sqlite
- Host: GitHub
- URL: https://github.com/nicopon/dtpipe
- Owner: nicopon
- License: mit
- Created: 2026-01-16T09:18:57.000Z (20 days ago)
- Default Branch: main
- Last Pushed: 2026-01-30T19:04:44.000Z (6 days ago)
- Last Synced: 2026-01-31T11:32:43.659Z (5 days ago)
- Topics: cli, csv, data-masking, database, dotnet, duckdb, etl, oracle, parquet, postgresql, sql-server, sqlite
- Language: C#
- Homepage:
- Size: 569 KB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# DtPipe
[Open in GitHub Codespaces](https://github.com/codespaces/new?repo=nicopon/DtPipe)
**A simple, self-contained CLI for performance-focused data streaming & anonymization.**
DtPipe streams data between the supported sources and destinations (SQL databases, CSV, Parquet) and applies transformations on the fly. It is designed for CI/CD pipelines, test data generation, and large dataset migration.
---
### 🚀 [**See the COOKBOOK for Recipes & Examples**](./COOKBOOK.md) 🍳
*Go here for Anonymization guides, Pipeline examples, and detailed tutorials.*
---
## Capabilities
- **Streaming Architecture**: Handles millions of rows with constant, low memory usage.
- **Multi-Provider**: Native support for **Oracle**, **SQL Server**, **PostgreSQL**, **DuckDB**, **SQLite**, **Parquet**, and **CSV**.
- **Zero Dependencies**: Single static binary. No drivers to install.
- **Anonymization Engine**: Built-in **Bogus** integration to fake Names, Emails, IBANs, and more.
- **Pipeline Transformation**: Mask, Nullify, Format, or Script (JS) data during export.
- **Production Ready**: YAML job configuration, Environment variable support, and robust logging.
## Installation
### Build from Source
**Prerequisite:** [.NET 10 SDK](https://dotnet.microsoft.com/download/dotnet/10.0) is required to compile.
```bash
# Bash (Mac/Linux/Windows Git Bash)
./build.sh
# PowerShell (Windows/Cross-platform)
./build.ps1
```
The build writes the binary to `./dist/release/dtpipe`.
> **Note:** The pre-compiled binaries in [GitHub Releases](https://github.com/nicopon/DtPipe/releases) are **self-contained**. You do NOT need to install .NET to run them.
## Quick Reference
### CLI Usage
```bash
./dtpipe --input [SOURCE] --query [SQL] --output [DEST] [OPTIONS]
```
### 1. Connection Strings (Input & Output)
DtPipe auto-detects providers from file extensions (`.csv`, `.parquet`, `.db`, `.sqlite`) or explicit prefixes.
| Provider | Prefix / Format | Example |
|:---|:---|:---|
| **DuckDB** | `duck:` | `duck:my.db` |
| **SQLite** | `sqlite:` | `sqlite:data.sqlite` |
| **PostgreSQL**| `pg:` | `pg:Host=localhost;Database=mydb` |
| **Oracle** | `ora:` | `ora:Data Source=PROD;User Id=scott` |
| **SQL Server**| `mssql:` | `mssql:Server=.;Database=mydb` |
| **CSV** | `csv:` / `.csv` | `data.csv` |
| **Parquet** | `parquet:` / `.parquet`| `data.parquet` |
| **STDIN/OUT** | `csv` or `parquet` | `csv` (no file path) |
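
Connection strings often contain spaces and semicolons, which the shell treats as argument and command separators, so each one needs quoting. The sketch below assembles a command from flags listed in this README and prints every argument as the shell will pass it. The `customers` table, the output file name, and the `DTPIPE` variable are illustrative assumptions, not names taken from the project.

```bash
# Path produced by the "Build from Source" step; override with DTPIPE=/path/to/dtpipe.
DTPIPE="${DTPIPE:-./dist/release/dtpipe}"

# Quote every value: an unquoted ';' would end the shell command,
# and an unquoted space would split one value into two arguments.
# "customers" and "customers.parquet" are hypothetical names.
set -- \
  --input  "ora:Data Source=PROD;User Id=scott" \
  --query  "SELECT * FROM customers" \
  --output "customers.parquet" \
  --limit  1000

# Show exactly what dtpipe will receive, one argument per line.
echo "Arguments passed to dtpipe: $#"
printf '  [%s]\n' "$@"

# Run only when the binary exists, so the snippet is safe to paste anywhere.
if [ -x "$DTPIPE" ]; then "$DTPIPE" "$@"; fi
```

Each bracketed line is one argument. If the connection string appears split across two lines, a quote is missing.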
### 2. Anonymization & Fakers
Use `--fake "Col:Generator"` to replace sensitive data.
*See [COOKBOOK.md](./COOKBOOK.md#anonymization-the-fakers) for more examples.*
| Category | Key Generators |
|:---|:---|
| **Identity** | `name.fullName`, `name.firstName`, `internet.email` |
| **Address** | `address.fullAddress`, `address.city`, `address.zipCode` |
| **Finance** | `finance.iban`, `finance.creditCardNumber` |
| **Phone** | `phone.phoneNumber` |
| **Dates** | `date.past`, `date.future`, `date.recent` |
| **System** | `random.uuid`, `random.number`, `random.boolean` |
> Use `--fake-list` to print all available generators.
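
As a rough illustration, an anonymizing export might combine several `--fake` rules with `--fake-seed-column` (documented under Pipeline Modifiers) so that the same source row always gets the same fake values. This is a sketch under assumptions: the table and column names are hypothetical, and passing `--fake` once per column is assumed rather than confirmed by this README.

```bash
# Path produced by the "Build from Source" step; override with DTPIPE=/path/to/dtpipe.
DTPIPE="${DTPIPE:-./dist/release/dtpipe}"

# Hypothetical columns: id, full_name, email, iban.
# Assumption: --fake may be repeated, one "Col:Generator" pair per column.
set -- \
  --input  "pg:Host=localhost;Database=mydb" \
  --query  "SELECT id, full_name, email, iban FROM customers" \
  --output "customers_anon.csv" \
  --fake   "full_name:name.fullName" \
  --fake   "email:internet.email" \
  --fake   "iban:finance.iban" \
  --fake-seed-column "id"

# Print the argument list for review before anything runs.
echo "Arguments passed to dtpipe: $#"
printf '  [%s]\n' "$@"

# Run only when the binary exists.
if [ -x "$DTPIPE" ]; then "$DTPIPE" "$@"; fi
```

Seeding on a stable key such as `id` is a common choice for test data, because repeated exports then stay comparable between runs.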
### 3. CLI Options Reference
#### Core
| Flag | Description |
|:---|:---|
| `-i`, `--input` | **Required**. Source connection string or file path. |
| `-q`, `--query` | **Required** (for queryable sources). SQL statement. |
| `-o`, `--output`| **Required**. Target connection string or file path. |
| `--limit` | Stop after N rows. |
| `--batch-size` | Rows per buffer (default: 50,000). |
| `--dry-run` | Preview data, **validate constraints**, and check schema compatibility. |
| `--key` | Comma-separated primary key columns for the `Upsert` and `Ignore` strategies. Auto-detected from the target if omitted. |
#### Automation
| Flag | Description |
|:---|:---|
| `--job [FILE]` | Execute a YAML job file. |
| `--export-job` | Save current CLI args as a YAML job. |
| `--log [FILE]` | Write execution statistics to file (Optional). |
#### Transformation Pipeline
| Flag | Description |
|:---|:---|
| `--fake "[Col]:[Method]"` | Generate fake data. |
| `--mask "[Col]:[Pattern]"` | Mask characters by pattern: `#` keeps the original character, any other pattern character replaces it. |
| `--null "[Col]"` | Force column to NULL. |
| `--overwrite "[Col]:[Val]"`| Set column to fixed value. |
| `--format "[Col]:[Fmt]"` | Apply .NET format string. |
| `--script "[Col]:[JS]"` | Apply JavaScript logic. |
| `--project`, `--drop` | Keep only the listed columns (`--project`) or remove the listed columns (`--drop`). |
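
The `--mask` pattern rule can be easier to follow with a worked example. The function below is a small shell sketch of one reading of that rule, not DtPipe's implementation: position by position, `#` keeps the original character and any other pattern character takes its place. The name `mask_value` is hypothetical, and keeping characters that fall beyond the end of the pattern is an assumption this README does not confirm.

```bash
# mask_value VALUE PATTERN
# Sketch of the documented rule: '#' keeps the character, anything else replaces it.
# Works per byte, so it is only meant for ASCII input.
mask_value() {
  value=$1 pattern=$2 out=""
  while [ -n "$value" ]; do
    v=${value%"${value#?}"}          # first character of the value
    value=${value#?}                 # rest of the value
    if [ -n "$pattern" ]; then
      p=${pattern%"${pattern#?}"}    # first character of the pattern
      pattern=${pattern#?}           # rest of the pattern
    else
      p="#"                          # assumption: past the pattern's end, keep the character
    fi
    if [ "$p" = "#" ]; then out="$out$v"; else out="$out$p"; fi
  done
  printf '%s\n' "$out"
}

mask_value "0612345678" "##******##"   # prints 06******78
```

Under this reading, a phone column masked with `##******##` keeps its first two and last two digits and hides the rest.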
#### Pipeline Modifiers
| Flag | Description |
|:---|:---|
| `--fake-locale [LOC]` | Locale for fakers (e.g. `fr`, `en_US`). |
| `--fake-seed-column [COL]`| Make faking deterministic based on a column value. |
| `--[type]-skip-null` | Skip transformation if value is NULL. |
#### Database Writer Options
| Flag | Description |
|:---|:---|
| `--ora-strategy` | `Append`, `Truncate`, `DeleteThenInsert`, `Recreate`, `Upsert`, `Ignore`. |
| `--ora-insert-mode` | `Standard`, `Append` (Direct-Path), `Bulk`. |
| `--pg-strategy` | `Append`, `Truncate`, `DeleteThenInsert`, `Recreate`, `Upsert`, `Ignore`. |
| `--pg-insert-mode` | `Standard`, `Bulk` (Binary Copy). |
| `--mssql-strategy` | `Append`, `Truncate`, `DeleteThenInsert`, `Recreate`, `Upsert`, `Ignore`. |
| `--duck-strategy` | `Append`, `Truncate`, `DeleteThenInsert`, `Recreate`, `Upsert`, `Ignore`. |
| `--sqlite-strategy` | `Append`, `DeleteThenInsert`, `Recreate`, `Upsert`, `Ignore`. |
---
## Contributing
Want to add a new database adapter or a custom transformer? Check out the [Developer Guide](./EXTENDING.md).
## License
MIT