[PyPI](https://pypi.org/project/sqlxport/)

[Code style: black](https://github.com/psf/black)
# sqlxport
**A modular CLI + API tool that extracts data from PostgreSQL, Redshift, SQLite (and more), exports to formats like Parquet and CSV, and optionally uploads to S3 with Athena integration.**
---
## Features

* Run custom SQL queries against PostgreSQL, Redshift, SQLite
* Export to Parquet or CSV (`--format`)
* Upload results to S3 or MinIO
* Redshift `UNLOAD` support (`--export-mode redshift-unload`)
* Partition output by column
* Generate Athena `CREATE TABLE` DDL (see the sketch after this list)
* Preview local or remote Parquet/CSV files
* `.env` support for convenient config
* Reusable Python API
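
The Athena DDL bullet above is a CLI feature; as a rough illustration of the underlying idea (a hypothetical sketch, not sqlxport's implementation), the snippet below derives a `CREATE EXTERNAL TABLE` statement from a Parquet file's schema with pyarrow. The `TYPE_MAP` and `athena_ddl` helper are made up for illustration.

```python
# Hypothetical sketch (not sqlxport internals): derive Athena DDL
# from a Parquet file's schema using pyarrow.
import pyarrow.parquet as pq

# Minimal pyarrow -> Athena type mapping; extend as needed.
TYPE_MAP = {"int64": "bigint", "int32": "int", "double": "double",
            "string": "string", "bool": "boolean"}

def athena_ddl(parquet_path: str, table: str, s3_location: str) -> str:
    schema = pq.read_schema(parquet_path)
    cols = ",\n  ".join(
        f"`{field.name}` {TYPE_MAP.get(str(field.type), 'string')}"
        for field in schema
    )
    return (
        f"CREATE EXTERNAL TABLE {table} (\n  {cols}\n)\n"
        "STORED AS PARQUET\n"
        f"LOCATION '{s3_location}';"
    )

print(athena_ddl("users.parquet", "users", "s3://my-bucket/users/"))
```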
---
## Why SQLxport?

SQLxport simplifies data export workflows and is designed for automation:

* One command gives you SQL → Parquet/CSV → S3
* Works locally, in CI, or inside Docker
* Connects to Athena, MinIO, and Redshift easily
* Clean format and database plugin model
* Fully tested, scriptable, production-ready
---
## Installation
```bash
pip install .
# or for development
pip install -e .
```
---
## Usage
### Choose Export Mode
| `--export-mode`   | Compatible DB URLs             | Description                   |
|-------------------|--------------------------------|-------------------------------|
| `postgres-query`  | `postgresql://`, `postgres://` | SELECT + local export         |
| `redshift-unload` | `redshift://`                  | UNLOAD to S3                  |
| `sqlite-query`    | `sqlite:///path.db`            | For local/lightweight testing |
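
Each mode expects a matching URL scheme. If you script around the CLI, a quick sanity check might look like the following (the `MODE_SCHEMES` table and `check_mode` helper are hypothetical, not part of sqlxport):

```python
# Hypothetical helper: verify a DB URL's scheme matches the export mode.
from urllib.parse import urlparse

MODE_SCHEMES = {
    "postgres-query": {"postgresql", "postgres"},
    "redshift-unload": {"redshift"},
    "sqlite-query": {"sqlite"},
}

def check_mode(export_mode: str, db_url: str) -> None:
    scheme = urlparse(db_url).scheme
    if scheme not in MODE_SCHEMES.get(export_mode, set()):
        raise ValueError(f"{export_mode!r} does not accept {scheme}:// URLs")

check_mode("postgres-query", "postgresql://user:pass@localhost:5432/mydb")
```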
---
### CLI Examples
#### Basic Export
```bash
sqlxport run \
  --export-mode postgres-query \
  --db-url postgresql://user:pass@localhost:5432/mydb \
  --query "SELECT * FROM users" \
  --output-file users.parquet \
  --format parquet
```
#### S3 Upload
```bash
sqlxport run \
  --export-mode postgres-query \
  --db-url postgresql://... \
  --query "..." \
  --output-file users.parquet \
  --s3-bucket my-bucket \
  --s3-key users.parquet \
  --s3-access-key AKIA... \
  --s3-secret-key ... \
  --s3-endpoint https://s3.amazonaws.com
```
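
For reference, the upload step is equivalent to a plain boto3 call; a minimal sketch, assuming boto3 is installed (the credentials and names are placeholders, and this is not sqlxport's own code):

```python
# Minimal sketch of the equivalent upload with boto3.
# endpoint_url lets the same call target AWS S3 or MinIO.
import boto3

s3 = boto3.client(
    "s3",
    aws_access_key_id="AKIA...",              # --s3-access-key
    aws_secret_access_key="...",              # --s3-secret-key
    endpoint_url="https://s3.amazonaws.com",  # --s3-endpoint
)
s3.upload_file("users.parquet", "my-bucket", "users.parquet")
```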
#### Partitioned Export
```bash
sqlxport run \
  --export-mode postgres-query \
  --db-url postgresql://... \
  --query "..." \
  --output-dir output/ \
  --partition-by group_column \
  --format csv
```
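
A common layout for partitioned output is Hive-style, with one `group_column=<value>/` subdirectory per distinct value (sqlxport's exact layout may differ). A minimal sketch of that layout for Parquet, via pyarrow rather than sqlxport:

```python
# Sketch: Hive-style partitioned Parquet layout, as pyarrow writes it.
# Produces output/group_column=A/... and output/group_column=B/...
import pyarrow as pa
import pyarrow.parquet as pq

table = pa.table({"group_column": ["A", "A", "B"], "value": [1, 2, 3]})
pq.write_to_dataset(table, root_path="output", partition_cols=["group_column"])
```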
#### Redshift UNLOAD Mode
```bash
sqlxport run \
  --export-mode redshift-unload \
  --db-url redshift://... \
  --query "SELECT * FROM large_table" \
  --s3-output-prefix s3://bucket/unload/ \
  --iam-role arn:aws:iam::123456789012:role/MyUnloadRole
```
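
`redshift-unload` delegates the export to Redshift itself, which writes results straight to S3. The statement it issues has roughly this shape (approximate; the exact SQL sqlxport generates may differ):

```python
# Sketch: the approximate UNLOAD statement behind redshift-unload mode.
query = "SELECT * FROM large_table"
unload_sql = (
    f"UNLOAD ('{query}')\n"
    "TO 's3://bucket/unload/'\n"
    "IAM_ROLE 'arn:aws:iam::123456789012:role/MyUnloadRole'\n"
    "FORMAT AS PARQUET;"
)
print(unload_sql)
```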
---
## Python API
```python
from sqlxport.api.export import run_export, ExportJobConfig

config = ExportJobConfig(
    db_url="sqlite:///test.db",
    query="SELECT * FROM users",
    format="csv",
    output_file="out.csv",
    export_mode="sqlite-query",
)

run_export(config)
```
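
To make that runnable end to end, you can seed `test.db` first with the standard library (the table and rows below are illustrative):

```python
# Seed a throwaway SQLite database so the export above has data to read.
import sqlite3

conn = sqlite3.connect("test.db")
conn.execute("CREATE TABLE IF NOT EXISTS users (id INTEGER, name TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?)", [(1, "ada"), (2, "grace")])
conn.commit()
conn.close()
```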
---
## Running Tests
```bash
pytest tests/unit/
pytest tests/integration/
pytest tests/e2e/
```
---
## Environment Variables

Supports `.env` files or exported shell variables:
```env
DB_URL=postgresql://username:password@localhost:5432/mydb
S3_BUCKET=my-bucket
S3_KEY=data/users.parquet
S3_ACCESS_KEY=...
S3_SECRET_KEY=...
S3_ENDPOINT=https://s3.amazonaws.com
IAM_ROLE=arn:aws:iam::123456789012:role/MyUnloadRole
```
Generate a template with:
```bash
sqlxport run --generate-env-template
```
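
If you drive the Python API from these variables yourself, a minimal sketch using `python-dotenv` (an extra dependency, assumed here; sqlxport's built-in `.env` handling may differ) looks like:

```python
# Sketch: build an export config from .env values via python-dotenv.
import os

from dotenv import load_dotenv
from sqlxport.api.export import run_export, ExportJobConfig

load_dotenv()  # reads DB_URL etc. from a local .env file

config = ExportJobConfig(
    db_url=os.environ["DB_URL"],
    query="SELECT * FROM users",
    format="parquet",
    output_file="users.parquet",
    export_mode="postgres-query",
)
run_export(config)
```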
---
## Roadmap

* ✅ Modular export modes
* ✅ CSV and partitioned output
* ⏳ Add `jsonl`, `xlsx` formats
* ⏳ Plugin system for writers/loaders
* ⏳ SaaS mode / UI platform
* ⏳ Kafka/Kinesis streaming support
---
## Security

* Don't commit `.env` files
* Use credential vaults when possible
---
## Author

Vahid Saber

Built with ❤️ for data engineers and developers.
---
## License

MIT License