Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/domoritz/csv2arrow
Convert CSV files to Apache Arrow.
- Host: GitHub
- URL: https://github.com/domoritz/csv2arrow
- Owner: domoritz
- License: apache-2.0
- Archived: true
- Created: 2021-02-26T19:06:16.000Z (over 3 years ago)
- Default Branch: main
- Last Pushed: 2023-02-02T18:24:05.000Z (almost 2 years ago)
- Last Synced: 2024-06-11T12:47:11.615Z (5 months ago)
- Topics: arrow, rust
- Language: Rust
- Homepage:
- Size: 135 KB
- Stars: 16
- Watchers: 5
- Forks: 2
- Open Issues: 0
Metadata Files:
- Readme: Readme.md
- Contributing: CONTRIBUTING.md
- License: LICENSE_APACHE.txt
Awesome Lists containing this project
- jimsghstars - domoritz/csv2arrow - Convert CSV files to Apache Arrow. (Rust)
README
# CSV to Arrow
**This repo is archived and the code moved to [Arrow CLI Tools](https://github.com/domoritz/arrow-tools).**
[![Crates.io](https://img.shields.io/crates/v/csv2arrow.svg)](https://crates.io/crates/csv2arrow)
[![Rust](https://github.com/domoritz/csv2arrow/actions/workflows/rust.yml/badge.svg)](https://github.com/domoritz/csv2arrow/actions/workflows/rust.yml)

Convert CSV files to Apache Arrow. You may also be interested in [json2arrow](https://github.com/domoritz/json2arrow), [csv2parquet](https://github.com/domoritz/csv2parquet), or [json2parquet](https://github.com/domoritz/json2parquet).
## Installation
### Download prebuilt binaries
You can get the latest releases from https://github.com/domoritz/csv2arrow/releases/.
### With Cargo
```
cargo install csv2arrow
```

## Usage
```
Usage: csv2arrow [OPTIONS] <CSV> [ARROW]

Arguments:
  <CSV>    Input CSV file
  [ARROW]  Output file, stdout if not present

Options:
  -s, --schema-file
          File with Arrow schema in JSON format
  -m, --max-read-records
          The number of records to infer the schema from. All rows if not present. Setting max-read-records to zero will stop schema inference and all columns will be string typed
      --header
          Set whether the CSV file has headers [possible values: true, false]
  -d, --delimiter
          Set the CSV file's column delimiter as a byte character [default: ,]
  -p, --print-schema
          Print the schema to stderr
  -n, --dry
          Only print the schema
  -h, --help
          Print help information
  -V, --version
          Print version information
```

The `--schema-file` option uses the same file format as `--dry` and `--print-schema`.
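
To illustrate what the `--max-read-records` description above means in practice, here is a minimal Python sketch of sampling-based schema inference. It is not csv2arrow's implementation. The type names, the `infer_type`, `merge`, and `infer_schema` helpers, and the sample data are all assumptions made for illustration, and the real tool infers Arrow data types with its own rules.

```python
import csv
import io


def infer_type(value: str) -> str:
    """Guess a type name for one CSV cell (illustrative rules, not Arrow's)."""
    if value == "":
        return "null"
    if value.lower() in ("true", "false"):
        return "bool"
    try:
        int(value)
        return "int"
    except ValueError:
        pass
    try:
        float(value)
        return "float"
    except ValueError:
        return "string"


def merge(a: str, b: str) -> str:
    """Combine two observed types into one that can hold both."""
    if a == b:
        return a
    if a == "null":
        return b
    if b == "null":
        return a
    if {a, b} == {"int", "float"}:
        return "float"
    return "string"


def infer_schema(text: str, max_read_records=None) -> dict:
    """Infer column types from CSV text that starts with a header row.

    max_read_records=None samples every row, N > 0 samples the first N rows,
    and 0 skips inference so every column is typed as string.
    """
    rows = list(csv.reader(io.StringIO(text)))
    if not rows:
        return {}
    names, data = rows[0], rows[1:]
    if max_read_records == 0:
        return {name: "string" for name in names}
    sample = data if max_read_records is None else data[:max_read_records]
    types = ["null"] * len(names)
    for row in sample:
        for i, cell in enumerate(row[: len(names)]):
            types[i] = merge(types[i], infer_type(cell))
    # Columns that only ever held empty cells fall back to string here.
    return {n: ("string" if t == "null" else t) for n, t in zip(names, types)}


# Hypothetical sample: the third record has a non-numeric id.
CSV_TEXT = "id,price,label\n1,2.5,a\n2,3,b\nx,4.0,c\n"

if __name__ == "__main__":
    print(infer_schema(CSV_TEXT))
    print(infer_schema(CSV_TEXT, max_read_records=2))
    print(infer_schema(CSV_TEXT, max_read_records=0))
```

In this sketch, sampling only the first two records types `id` as an integer because the non-numeric value in the third record is never seen. That is the trade-off a small sample makes: inference is cheaper, but a later row may not fit the inferred type. Passing a hand-checked schema with `--schema-file` avoids relying on a sample.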