https://github.com/jqnatividad/qsv-sniffer
CSV sniffer crate for Rust, optimized for qsv
- Host: GitHub
- URL: https://github.com/jqnatividad/qsv-sniffer
- Owner: jqnatividad
- License: mit
- Created: 2022-03-20T14:09:36.000Z (almost 3 years ago)
- Default Branch: master
- Last Pushed: 2024-04-05T13:51:27.000Z (9 months ago)
- Last Synced: 2024-10-13T13:42:16.862Z (3 months ago)
- Topics: ckan, csv, qsv
- Language: Rust
- Homepage:
- Size: 455 KB
- Stars: 7
- Watchers: 1
- Forks: 0
- Open Issues: 4
Metadata Files:
- Readme: README.md
- License: LICENSE
# qsv CSV sniffer
[![Documentation](https://docs.rs/qsv-sniffer/badge.svg)](https://docs.rs/qsv-sniffer)
`qsv-sniffer` provides methods to infer CSV file metadata (delimiter choice, quote character,
number of fields, field names, field data types, etc.). See the documentation for more details.

It is a detached fork of [csv-sniffer](https://github.com/jblondin/csv-sniffer) that additionally detects:
* utf-8 encoding
* field names
* number of rows
* average record length
* additional data types - Date/DateTime and NULL
* smarter Boolean type detection - in addition to "true" and "false", it detects 1/0, yes/no, y/n and t/f, all case-insensitive

> ℹ️ **NOTE:** This fork is optimized to support [qsv](https://github.com/jqnatividad/qsv), and its development
will be primarily dictated by qsv's requirements.

# Setup
## As a Command-line application
```
cargo install qsv-sniffer
```

This will install a binary named `sniff`.

## As a Library
Add this to your `Cargo.toml`:
```toml
[dependencies]
qsv-sniffer = "0.9"
```

and this to your crate root:

```rust
use qsv_sniffer;
```

## Feature flags

* `cli` - to build the `sniff` binary
* `runtime-dispatch-simd` - enables detection of SIMD capabilities at runtime, which allows using the
SSE2 and AVX2 code paths (only works on Intel and AMD architectures; ignored on other architectures).
* `generic-simd` - enables architecture-agnostic SIMD capabilities, but only works with Rust nightly.

The SIMD features are mutually exclusive and increase sampling performance.

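To give a feel for what runtime SIMD detection means in practice, here is a minimal sketch using only the standard library's `is_x86_feature_detected!` macro. The function name `simd_level` is hypothetical, and this is not how `qsv-sniffer` or its dependencies implement the `runtime-dispatch-simd` feature; it only illustrates the general idea of probing the CPU once and picking a code path.

```rust
/// Hypothetical helper (not part of qsv-sniffer): reports which of the
/// code paths named above the current CPU could use.
fn simd_level() -> &'static str {
    // The detection macro only exists on x86 / x86_64 targets, so the
    // check is compiled out everywhere else.
    #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
    {
        if is_x86_feature_detected!("avx2") {
            return "avx2";
        }
        if is_x86_feature_detected!("sse2") {
            return "sse2";
        }
    }
    // Fallback for other architectures or CPUs without these extensions.
    "scalar"
}

fn main() {
    println!("best available code path: {}", simd_level());
}
```

On non-x86 targets the sketch always falls through to `"scalar"`, which mirrors the "ignored on other architectures" behaviour described for the feature flag.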
# Example
This example shows how to write a simple command-line tool for discovering the metadata of a CSV
file:

```no_run
use qsv_sniffer;

use std::env;

fn main() {
    let args: Vec<String> = env::args().collect();
    if args.len() != 2 {
        eprintln!("Usage: {} <file>", args[0]);
        ::std::process::exit(1);
    }

    // sniff the path provided by the first argument
    match qsv_sniffer::Sniffer::new().sniff_path(&args[1]) {
        Ok(metadata) => {
            println!("{}", metadata);
        },
        Err(err) => {
            eprintln!("ERROR: {}", err);
        }
    }
}
```

This example is provided as the primary binary for this crate. In the source directory, this can be
run as:

```ignore
$ cargo run -- tests/data/library-visitors.csv
```
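
As a rough illustration of the case-insensitive Boolean detection listed among this fork's additions, the sketch below checks sampled values against the token pairs named earlier (1/0, yes/no, y/n, true/false, t/f). The helpers `looks_boolean` and `column_is_boolean` are hypothetical and are not part of the `qsv-sniffer` API; skipping empty values is also an assumption made here for simplicity, not a statement about how the crate treats NULLs.

```rust
/// Hypothetical helper: true if `value` is one of the Boolean tokens
/// named in the README, compared case-insensitively.
fn looks_boolean(value: &str) -> bool {
    matches!(
        value.trim().to_ascii_lowercase().as_str(),
        "1" | "0" | "yes" | "no" | "y" | "n" | "true" | "false" | "t" | "f"
    )
}

/// Hypothetical helper: a column counts as Boolean only if it has at least
/// one non-empty sample and every non-empty sample is a Boolean token.
/// Assumption: empty values are skipped rather than treated as a mismatch.
fn column_is_boolean(samples: &[&str]) -> bool {
    let mut seen_value = false;
    for sample in samples.iter().filter(|s| !s.trim().is_empty()) {
        if !looks_boolean(sample) {
            return false;
        }
        seen_value = true;
    }
    seen_value
}

fn main() {
    let flags = ["Yes", "no", "", "Y"];
    let counts = ["1", "0", "2"];
    println!("flags column is Boolean: {}", column_is_boolean(&flags));
    println!("counts column is Boolean: {}", column_is_boolean(&counts));
}
```

Requiring every sampled value to match keeps a numeric column that merely contains some 0s and 1s from being classified as Boolean.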