Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/huy-dna/ruzzy_rs
A versatile and flexible fuzzy matcher in rust based on Levenshtein Distance
- Host: GitHub
- URL: https://github.com/huy-dna/ruzzy_rs
- Owner: Huy-DNA
- License: apache-2.0
- Created: 2024-06-04T12:50:38.000Z (5 months ago)
- Default Branch: main
- Last Pushed: 2024-06-06T23:19:15.000Z (5 months ago)
- Last Synced: 2024-08-08T17:16:57.920Z (3 months ago)
- Language: Rust
- Size: 17.6 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# ruzzy_rs
[![Crates.io][crates-badge]][crates-url]

[crates-badge]: https://img.shields.io/crates/v/ruzzy.svg
[crates-url]: https://crates.io/crates/ruzzy

A versatile and flexible fuzzy matcher in rust based on Levenshtein Distance
## Installation
```bash
cargo add ruzzy
```
## Usage
This crate performs fuzzy matching based on the Levenshtein distance, a.k.a. the edit distance. The fewer edits it takes to transform string `A` into string `B`, the more similar `A` and `B` are.
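To make the idea of edit distance concrete, here is a minimal sketch of the classic Levenshtein distance with unit costs. It is an illustration only, not this crate's implementation: the function name `levenshtein` is hypothetical, and the crate itself applies configurable penalties rather than a fixed cost of `1` per edit.

```rust
/// Classic Levenshtein distance with unit costs, computed over Unicode
/// scalar values (`char`s). Hypothetical helper, not part of the crate.
fn levenshtein(a: &str, b: &str) -> usize {
    let a: Vec<char> = a.chars().collect();
    let b: Vec<char> = b.chars().collect();

    // `prev[j]` is the distance between the previous prefix of `a`
    // and the first `j` characters of `b`. Row 0 is all insertions.
    let mut prev: Vec<usize> = (0..=b.len()).collect();

    for i in 1..=a.len() {
        // `curr[0] = i`: turning `i` characters into "" takes `i` deletions.
        let mut curr = vec![i; b.len() + 1];
        for j in 1..=b.len() {
            // A substitution is free when the characters already match.
            let substitution = prev[j - 1] + usize::from(a[i - 1] != b[j - 1]);
            let deletion = prev[j] + 1;
            let insertion = curr[j - 1] + 1;
            curr[j] = substitution.min(deletion).min(insertion);
        }
        prev = curr;
    }
    prev[b.len()]
}
```

For example, `levenshtein("kitten", "sitting")` is `3`: two substitutions and one insertion.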
### `fuzzy_match`
The only function that this crate exposes is:
```rust
fn fuzzy_match<'a, Value: 'a>(needle: &'a String, haystack: &'a Vec<(String, Value)>, config: FuzzyConfig) -> Option<&'a Value>;
```
where:
* `needle` is the string to be matched.
* `haystack` is the set of key-value pairs; the key part is what is matched against `needle`.
* `config` allows you to tune the matching process.

This function returns an `Option` that may wrap the value corresponding to the most similar key.
### `FuzzyConfig`
`FuzzyConfig` allows you to tune the matching process. Currently, these configurations are supported:
* `threshold`: If the edit distance is higher than this `threshold`, the key in the `haystack` is unacceptable and is not considered a match.
* `insertion_penalty`: The cost of a character insertion in the `needle` (by default: `1`).
* `deletion_penalty`: The cost of a character deletion in the `needle` (by default: `1`).
* `substitution_penalty`: The cost of a character substitution (by default: `2`).
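
The sketch below shows one way these settings could interact. It is not the crate's code. `SketchConfig`, `weighted_distance`, and `best_match` are hypothetical names, the field types are assumed to be `usize`, and it assumes that insertions and deletions are counted while transforming `needle` into a key. It also assumes that ties go to the earliest key in the haystack.

```rust
/// Hypothetical stand-in for the crate's `FuzzyConfig`; field types are assumed.
struct SketchConfig {
    threshold: usize,
    insertion_penalty: usize,
    deletion_penalty: usize,
    substitution_penalty: usize,
}

/// Weighted edit distance for transforming `needle` into `key`.
fn weighted_distance(needle: &str, key: &str, cfg: &SketchConfig) -> usize {
    let n: Vec<char> = needle.chars().collect();
    let k: Vec<char> = key.chars().collect();

    // Row 0: reaching the first `j` key characters from an empty needle
    // takes `j` insertions.
    let mut prev: Vec<usize> = (0..=k.len()).map(|j| j * cfg.insertion_penalty).collect();

    for i in 1..=n.len() {
        // Column 0: reducing `i` needle characters to "" takes `i` deletions.
        let mut curr = vec![i * cfg.deletion_penalty; k.len() + 1];
        for j in 1..=k.len() {
            let sub = if n[i - 1] == k[j - 1] { 0 } else { cfg.substitution_penalty };
            curr[j] = (prev[j - 1] + sub)
                .min(prev[j] + cfg.deletion_penalty)
                .min(curr[j - 1] + cfg.insertion_penalty);
        }
        prev = curr;
    }
    prev[k.len()]
}

/// Returns the value of the closest key whose distance does not exceed
/// `cfg.threshold`, or `None` if every key is too far away.
fn best_match<'a, V>(
    needle: &str,
    haystack: &'a [(String, V)],
    cfg: &SketchConfig,
) -> Option<&'a V> {
    haystack
        .iter()
        .map(|(key, value)| (weighted_distance(needle, key, cfg), value))
        .filter(|(distance, _)| *distance <= cfg.threshold)
        .min_by_key(|(distance, _)| *distance)
        .map(|(_, value)| value)
}
```

With penalties of `1`, `1`, and `2`, a substitution costs the same as a deletion followed by an insertion, so `weighted_distance("kitten", "sitting", &cfg)` is `5` rather than the classic `3`. Lowering `substitution_penalty` to `1` restores the classic distance.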