https://github.com/nvictus/bow
CLI for Parquet <-> CSV interconversion
- Host: GitHub
- URL: https://github.com/nvictus/bow
- Owner: nvictus
- License: mit
- Created: 2020-04-11T17:16:55.000Z (over 4 years ago)
- Default Branch: master
- Last Pushed: 2020-05-25T18:31:55.000Z (over 4 years ago)
- Last Synced: 2024-10-03T12:17:47.540Z (3 months ago)
- Language: Python
- Homepage:
- Size: 66.4 KB
- Stars: 2
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# bow
A simple CLI Parquet file reader (parquet -> csv) and writer (csv -> parquet) using `pyarrow`. (WIP!)
_Most importantly, it works in chunks (row groups) so you can actually stream and write big files._
I don't understand why such a thing doesn't exist yet, so here you go.
No, I'm not going to use the Java `parquet-mr` CLI. Go hadoop yourself.
Maybe I'll include some Arrow streaming functionality at some point.
```
Usage: bow [OPTIONS] COMMAND [ARGS]...

Options:
  -V, --version  Show the version and exit.
  -h, --help     Show this message and exit.

Commands:
  info     Print Parquet file metadata.
  par2txt  Convert Parquet to CSV text.
  txt2par  Convert CSV text to Parquet.
```