https://github.com/harryscholes/kmer-minhash

A library for parallel k-mer min-wise hashing

# kmer-minhash

Parallel _k_-mer min-wise hashing in Rust.

This package provides:
- `Kmers` struct for representing the _k_-mers of some sequence
- `KmersIntoIter` struct for iterating over the _k_-mers
- `MinHash` trait for min-wise hashing of _k_-mers

### Example

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

use kmer_minhash::{Kmers, MinHash};

const KMER_SIZE: usize = 3;
const N_HASHES: usize = 2;

fn main() {
    let seq = "abcde";

    let kmers = Kmers::from_str(seq, KMER_SIZE);

    let min_hashes = kmers.min_hash(N_HASHES).unwrap();

    // Check that the minhashes are correct:
    let mut manual_hashes = vec!["abc", "bcd", "cde"]
        .iter()
        .map(|kmer| {
            let mut hasher = DefaultHasher::new();
            kmer.hash(&mut hasher);
            hasher.finish()
        })
        .collect::<Vec<_>>();

    manual_hashes.sort();

    assert_eq!(min_hashes, manual_hashes[..N_HASHES]);
}
```
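
The check in the example implies that, for this input, the result equals the `N_HASHES` smallest `DefaultHasher` values over the _k_-mers. As a rough illustration of that idea, here is a standard-library-only sketch. The names `hash_kmer` and `bottom_n_hashes` are hypothetical and not part of this package, and the handling of short or non-ASCII input is an assumption rather than the crate's documented behaviour.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hash a single k-mer with the standard library's default hasher.
fn hash_kmer(kmer: &str) -> u64 {
    let mut hasher = DefaultHasher::new();
    kmer.hash(&mut hasher);
    hasher.finish()
}

/// Hypothetical helper (not the crate's API): return the `n` smallest
/// k-mer hashes of `seq` in ascending order.
///
/// Assumptions made for this sketch only:
/// - input must be ASCII so that every byte window is a valid `&str`;
/// - `None` is returned when `k` is zero, the input is not ASCII,
///   or the sequence yields fewer than `n` k-mers;
/// - duplicate k-mers are kept, since the README does not say whether
///   the crate deduplicates them.
fn bottom_n_hashes(seq: &str, k: usize, n: usize) -> Option<Vec<u64>> {
    if k == 0 || !seq.is_ascii() {
        return None;
    }

    // Slide a window of `k` bytes over the sequence and hash each k-mer.
    let mut hashes: Vec<u64> = seq
        .as_bytes()
        .windows(k)
        .map(|w| hash_kmer(std::str::from_utf8(w).expect("input checked to be ASCII")))
        .collect();

    if hashes.len() < n {
        return None;
    }

    // Keep only the `n` smallest hash values.
    hashes.sort_unstable();
    hashes.truncate(n);
    Some(hashes)
}
```

With these definitions, `bottom_n_hashes("abcde", 3, 2)` returns the two smallest of the hashes of `"abc"`, `"bcd"` and `"cde"`, which is the same comparison the example performs by hand.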

See the tests in `src/lib.rs` for more examples of how to use this package with Rayon and Tokio.
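
For a dependency-free picture of what hashing several sequences in parallel can look like, the sketch below uses scoped threads from the standard library in place of Rayon or Tokio. It does not use this package's API: `smallest_hashes` and `sketch_all` are hypothetical names, and spawning one thread per sequence is a simplification chosen for brevity.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};
use std::thread;

/// Hypothetical helper: up to `n` smallest k-mer hashes of one sequence.
/// Returns an empty vector for `k == 0` or non-ASCII input (an assumption
/// of this sketch, not the crate's behaviour).
fn smallest_hashes(seq: &str, k: usize, n: usize) -> Vec<u64> {
    if k == 0 || !seq.is_ascii() {
        return Vec::new();
    }
    let mut hashes: Vec<u64> = seq
        .as_bytes()
        .windows(k)
        .map(|w| {
            let mut hasher = DefaultHasher::new();
            std::str::from_utf8(w)
                .expect("input checked to be ASCII")
                .hash(&mut hasher);
            hasher.finish()
        })
        .collect();
    hashes.sort_unstable();
    hashes.truncate(n);
    hashes
}

/// Hash every sequence on its own scoped thread and return the
/// results in the same order as the input slice.
fn sketch_all(seqs: &[&str], k: usize, n: usize) -> Vec<Vec<u64>> {
    thread::scope(|scope| {
        // Spawn all workers first so that they run concurrently.
        let handles: Vec<_> = seqs
            .iter()
            .map(|&seq| scope.spawn(move || smallest_hashes(seq, k, n)))
            .collect();

        // Join in input order; a panic in a worker is propagated here.
        handles
            .into_iter()
            .map(|handle| handle.join().expect("worker thread panicked"))
            .collect()
    })
}
```

One thread per sequence is reasonable for a handful of inputs. For many sequences, a work-stealing pool such as the one Rayon provides would likely be a better fit, which is presumably why the package's tests exercise it.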