Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/orottier/rust-warc
A high performance and easy to use Web Archive (WARC) file reader
parser rust warc
Last synced: 2 days ago
- Host: GitHub
- URL: https://github.com/orottier/rust-warc
- Owner: orottier
- Created: 2019-05-13T12:48:29.000Z (over 5 years ago)
- Default Branch: master
- Last Pushed: 2019-05-13T13:07:11.000Z (over 5 years ago)
- Last Synced: 2024-10-11T13:37:55.075Z (about 1 month ago)
- Topics: parser, rust, warc
- Language: Rust
- Size: 10.7 KB
- Stars: 9
- Watchers: 4
- Forks: 3
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
Rust-Warc
=========

[![crates.io](https://img.shields.io/crates/v/rust_warc.svg)](https://crates.io/crates/rust_warc)

A high performance and easy to use Web Archive (WARC) file reader
```rust
use rust_warc::WarcReader;
use std::io;

fn main() {
    // we're taking input from stdin here, but any BufRead will do
    let stdin = io::stdin();
    let handle = stdin.lock();

    let warc = WarcReader::new(handle);
    let mut response_counter = 0;
    let mut response_size = 0;

    for item in warc {
        let record = item.unwrap(); // could be IO/malformed error

        // header names are case insensitive
        if record.header.get(&"WARC-Type".into()) == Some(&"response".into()) {
            response_counter += 1;
            response_size += record.content.len();
        }
    }

    println!("response records: {}", response_counter);
    println!("response size: {} MiB", response_size >> 20);
}
```
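
The example's comment notes that header names are case insensitive. As a rough illustration of what that can look like in practice, here is a std-only sketch. It is not rust_warc's actual implementation: the `HeaderName` wrapper and the `parse_headers` helper are hypothetical names introduced only for this example, and the parsing is deliberately simplified (no continuation lines, no validation).

```rust
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Hypothetical header-name wrapper (an assumption for this sketch, not a
/// rust_warc type): compares and hashes ASCII case-insensitively.
#[derive(Debug, Clone)]
struct HeaderName(String);

impl From<&str> for HeaderName {
    fn from(s: &str) -> Self {
        HeaderName(s.to_string())
    }
}

impl PartialEq for HeaderName {
    fn eq(&self, other: &Self) -> bool {
        self.0.eq_ignore_ascii_case(&other.0)
    }
}

impl Eq for HeaderName {}

impl Hash for HeaderName {
    fn hash<H: Hasher>(&self, state: &mut H) {
        // Hash the lowercased bytes so that names equal under `eq`
        // also produce the same hash.
        for b in self.0.bytes() {
            state.write_u8(b.to_ascii_lowercase());
        }
    }
}

/// Hypothetical helper: parse "Name: value" lines into a map keyed by
/// case-insensitive names. Lines without a colon are skipped.
fn parse_headers(block: &str) -> HashMap<HeaderName, String> {
    block
        .lines()
        .filter_map(|line| line.split_once(':'))
        .map(|(name, value)| (HeaderName::from(name.trim()), value.trim().to_string()))
        .collect()
}

fn main() {
    let headers = parse_headers("WARC-Type: response\r\nContent-Length: 42\r\n");

    // Any casing of the name finds the same entry.
    let lower = headers.get(&HeaderName::from("warc-type"));
    let upper = headers.get(&HeaderName::from("WARC-TYPE"));
    println!("lower: {:?}, upper: {:?}", lower, upper);
}
```

Keeping equality and hashing consistent (both ignore ASCII case) is what lets a plain `HashMap` serve lookups regardless of how the name was capitalized in the archive.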