https://github.com/helinwang/parquet-rs
parquet-rs that builds
- Host: GitHub
- URL: https://github.com/helinwang/parquet-rs
- Owner: helinwang
- Created: 2020-06-07T15:42:42.000Z (about 5 years ago)
- Default Branch: master
- Last Pushed: 2020-06-07T15:50:45.000Z (about 5 years ago)
- Last Synced: 2025-02-17T06:28:51.401Z (4 months ago)
- Language: Rust
- Size: 205 KB
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# An Apache Parquet implementation in Rust
**All credit goes to https://crates.io/crates/parquet.** The crate
doesn't build; this repo only makes the crate build successfully.

## Usage
Add this to your Cargo.toml:
```toml
[dependencies]
parquet = "0.16.0"
```

and this to your crate root:
```rust
extern crate parquet;
```

Example usage of reading data:
```rust
use std::fs::File;
use std::path::Path;

use parquet::file::reader::{FileReader, SerializedFileReader};

fn main() {
    // Open the Parquet file and wrap it in a reader.
    let file = File::open(&Path::new("/path/to/file")).unwrap();
    let reader = SerializedFileReader::new(file).unwrap();
    // Passing `None` iterates over the full schema (no column projection).
    let iter = reader.get_row_iter(None).unwrap();
    for record in iter {
        println!("{}", record);
    }
}
```
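
Before handing a file to `SerializedFileReader`, it can be useful to check that it looks like a Parquet file at all. The sketch below uses only the standard library and relies on the Parquet file layout, in which a file starts and ends with the 4-byte magic `PAR1`. The helper names `has_parquet_magic` and `looks_like_parquet` are hypothetical and not part of the `parquet` crate; treat this as a cheap sanity check, not as validation of the file contents.

```rust
use std::fs::File;
use std::io::{self, Read, Seek, SeekFrom};
use std::path::Path;

/// Magic bytes that open and close a Parquet file.
const PARQUET_MAGIC: [u8; 4] = *b"PAR1";

/// Hypothetical helper: returns `true` when `source` both starts and ends
/// with the Parquet magic bytes.
fn has_parquet_magic<R: Read + Seek>(source: &mut R) -> io::Result<bool> {
    let len = source.seek(SeekFrom::End(0))?;
    // Smallest possible layout: leading magic (4 bytes), footer length
    // (4 bytes) and trailing magic (4 bytes).
    if len < 12 {
        return Ok(false);
    }

    let mut head = [0u8; 4];
    source.seek(SeekFrom::Start(0))?;
    source.read_exact(&mut head)?;

    let mut tail = [0u8; 4];
    source.seek(SeekFrom::End(-4))?;
    source.read_exact(&mut tail)?;

    Ok(head == PARQUET_MAGIC && tail == PARQUET_MAGIC)
}

/// Hypothetical convenience wrapper that runs the check on a file path.
fn looks_like_parquet(path: &Path) -> io::Result<bool> {
    let mut file = File::open(path)?;
    has_parquet_magic(&mut file)
}
```

Calling `looks_like_parquet(Path::new("/path/to/file"))` yields `Ok(true)` only when both markers are present. A file that passes may still have a corrupt footer, which the reader itself is expected to report.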
See the [crate documentation](https://docs.rs/crate/parquet/0.16.0) for the available API.

## Supported Parquet Version
- Parquet-format 2.4.0

To update Parquet format to a newer version, check whether a newer [parquet-format](https://github.com/sunchao/parquet-format-rs)
version is available, then update the version of the `parquet-format` crate in Cargo.toml.

## Features
- [X] All encodings supported
- [X] All compression codecs supported
- [X] Read support
  - [X] Primitive column value readers
  - [X] Row record reader
  - [X] Arrow record reader
- [X] Statistics support
- [X] Write support
  - [X] Primitive column value writers
  - [ ] Row record writer
  - [ ] Arrow record writer
- [ ] Predicate pushdown
- [ ] Parquet format 2.5 support

## Requirements
- Rust nightly

See [Working with nightly Rust](https://github.com/rust-lang-nursery/rustup.rs/blob/master/README.md#working-with-nightly-rust)
to install the nightly toolchain and set it as the default.

Parquet requires LLVM. Our Windows CI image includes LLVM, but to build the libraries locally, Windows
users will have to install LLVM. Follow [this](https://github.com/appveyor/ci/issues/2651) link for info.

## Build
Run `cargo build` or `cargo build --release` to build in release mode.
Some features take advantage of SSE4.2 instructions, which can be
enabled by adding `RUSTFLAGS="-C target-feature=+sse4.2"` before the
`cargo build` command.

## Test
Run `cargo test` for unit tests.

## Binaries
The following binaries are provided (use `cargo install` to install them):
- **parquet-schema** for printing Parquet file schema and metadata.
`Usage: parquet-schema <file-path> [verbose]`, where `file-path` is the path to a Parquet file,
and the optional `verbose` is a boolean flag that selects between printing the full metadata and printing the schema only
(when not specified, only the schema is printed).
- **parquet-read** for reading records from a Parquet file.
`Usage: parquet-read <file-path> [num-records]`, where `file-path` is the path to a Parquet file,
and `num-records` is the number of records to read from the file (when not specified, all records are
printed).

If you see a `Library not loaded` error, please make sure `LD_LIBRARY_PATH` is set properly:
```
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$(rustc --print sysroot)/lib
```

## Benchmarks
Run `cargo bench` for benchmarks.

## Docs
To build documentation, run `cargo doc --no-deps`.
To compile and view in the browser, run `cargo doc --no-deps --open`.

## License
Licensed under the Apache License, Version 2.0: http://www.apache.org/licenses/LICENSE-2.0.