https://github.com/sunchao/parquet-rs
Apache Parquet implementation in Rust
https://github.com/sunchao/parquet-rs
hadoop parquet rust
Last synced: 3 months ago
JSON representation
Apache Parquet implementation in Rust
- Host: GitHub
- URL: https://github.com/sunchao/parquet-rs
- Owner: sunchao
- License: apache-2.0
- Archived: true
- Created: 2016-09-26T02:03:36.000Z (about 9 years ago)
- Default Branch: master
- Last Pushed: 2018-12-21T19:28:23.000Z (almost 7 years ago)
- Last Synced: 2024-11-25T09:45:04.835Z (11 months ago)
- Topics: hadoop, parquet, rust
- Language: Rust
- Homepage:
- Size: 4.44 MB
- Stars: 149
- Watchers: 9
- Forks: 20
- Open Issues: 30
-
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
README
# parquet-rs
[](https://travis-ci.org/sunchao/parquet-rs)
[](https://coveralls.io/github/sunchao/parquet-rs?branch=master)
[](https://opensource.org/licenses/Apache-2.0)
[](https://crates.io/crates/parquet)
[](https://sunchao.github.io/parquet-rs/0.4.2/parquet/)
[](https://sunchao.github.io/parquet-rs/master/parquet/)An [Apache Parquet](https://parquet.apache.org/) implementation in Rust.
**NOTE: this project has merged into [Apache Arrow](https://arrow.apache.org/), and development will continue there.** **To file an issue or pull request, please file a [JIRA](https://issues.apache.org/jira/projects/ARROW/issues) in the Arrow project.**
## Usage
Add this to your Cargo.toml:
```toml
[dependencies]
parquet = "0.4"
```and this to your crate root:
```rust
extern crate parquet;
```Example usage of reading data:
```rust
use std::fs::File;
use std::path::Path;
use parquet::file::reader::{FileReader, SerializedFileReader};let file = File::open(&Path::new("/path/to/file")).unwrap();
let reader = SerializedFileReader::new(file).unwrap();
let mut iter = reader.get_row_iter(None).unwrap();
while let Some(record) = iter.next() {
println!("{}", record);
}
```
See [crate documentation](https://sunchao.github.io/parquet-rs/master) on available API.## Supported Parquet Version
- Parquet-format 2.4.0To update Parquet format to a newer version, check if [parquet-format](https://github.com/sunchao/parquet-format-rs)
version is available. Then simply update version of `parquet-format` crate in Cargo.toml.## Features
- [X] All encodings supported
- [X] All compression codecs supported
- [X] Read support
- [X] Primitive column value readers
- [X] Row record reader
- [ ] Arrow record reader
- [X] Statistics support
- [X] Write support
- [X] Primitive column value writers
- [ ] Row record writer
- [ ] Arrow record writer
- [ ] Predicate pushdown
- [ ] Parquet format 2.5 support
- [ ] HDFS support## Requirements
- Rust nightlySee [Working with nightly Rust](https://github.com/rust-lang-nursery/rustup.rs/blob/master/README.md#working-with-nightly-rust)
to install nightly toolchain and set it as default.## Build
Run `cargo build` or `cargo build --release` to build in release mode.
Some features take advantage of SSE4.2 instructions, which can be
enabled by adding `RUSTFLAGS="-C target-feature=+sse4.2"` before the
`cargo build` command.## Test
Run `cargo test` for unit tests.## Binaries
The following binaries are provided (use `cargo install` to install them):
- **parquet-schema** for printing Parquet file schema and metadata.
`Usage: parquet-schema [verbose]`, where `file-path` is the path to a Parquet file,
and optional `verbose` is the boolean flag that allows to print full metadata or schema only
(when not specified only schema will be printed).- **parquet-read** for reading records from a Parquet file.
`Usage: parquet-read [num-records]`, where `file-path` is the path to a Parquet file,
and `num-records` is the number of records to read from a file (when not specified all records will
be printed).If you see `Library not loaded` error, please make sure `LD_LIBRARY_PATH` is set properly:
```
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$(rustc --print sysroot)/lib
```## Benchmarks
Run `cargo bench` for benchmarks.## Docs
To build documentation, run `cargo doc --no-deps`.
To compile and view in the browser, run `cargo doc --no-deps --open`.## License
Licensed under the Apache License, Version 2.0: http://www.apache.org/licenses/LICENSE-2.0.