https://github.com/loichyan/noodler
🍜 A port of python-ngram that provides fuzzy search using N-grams
ngrams rust search text-processing
- Host: GitHub
- URL: https://github.com/loichyan/noodler
- Owner: loichyan
- License: apache-2.0
- Created: 2023-04-04T08:24:39.000Z (over 2 years ago)
- Default Branch: main
- Last Pushed: 2023-04-11T11:31:23.000Z (over 2 years ago)
- Last Synced: 2025-02-28T06:54:50.564Z (8 months ago)
- Topics: ngrams, rust, search, text-processing
- Language: Rust
- Homepage: https://crates.io/crates/noodler
- Size: 23.4 KB
- Stars: 0
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- License: LICENSE-APACHE
# 🍜 Noodler
> In computer science, "noodler" is used to describe programs that handle text.
> Because algorithms like n-grams are typically used to extract information from
> text, similar to pulling strands of noodles out of a pile of dough, "noodler"
> can be associated with algorithms that extract information from text because
> they can be seen as "processing" programs for text, just as noodle makers
> "produce" noodles from dough.
>
> _ChatGPT_

A port of the [python-ngram](https://github.com/gpoulter/python-ngram) project
that provides fuzzy search using [N-gram](https://en.wikipedia.org/wiki/N-gram).

## ✍️ Example
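As background for the `arity`, `warp`, and `threshold` options used in the crate example, here is a rough, std-only sketch of how an N-gram similarity score can be computed. It follows the scoring idea used by python-ngram (shared versus total n-grams, with a `warp` exponent). Whether noodler computes its score in exactly this way is an assumption, and the helper names `ngrams` and `similarity` are hypothetical, not part of the noodler API.

```rust
use std::collections::HashMap;

/// Split `text` into overlapping character n-grams of length `arity`.
/// Both ends are padded with `arity - 1` copies of '$' (an assumed pad
/// character) so that the first and last characters appear in as many
/// n-grams as characters in the middle of the string.
fn ngrams(text: &str, arity: usize) -> Vec<String> {
    assert!(arity >= 1, "arity must be at least 1");
    let pad = std::iter::repeat('$').take(arity - 1);
    let chars: Vec<char> = pad.clone().chain(text.chars()).chain(pad).collect();
    chars.windows(arity).map(|w| w.iter().collect()).collect()
}

/// Illustrative similarity in the range 0.0..=1.0.
/// With `warp == 1.0` this is simply shared / total n-grams; a larger
/// `warp` raises the score of pairs that differ in only a few n-grams.
fn similarity(a: &str, b: &str, arity: usize, warp: f32) -> f32 {
    let grams_a = ngrams(a, arity);
    let grams_b = ngrams(b, arity);

    // Count the n-grams of `a`, then consume matches from `b`, so a repeated
    // n-gram is only counted as shared as often as it occurs in both strings.
    let mut counts: HashMap<&str, usize> = HashMap::new();
    for g in &grams_a {
        *counts.entry(g.as_str()).or_insert(0) += 1;
    }
    let mut shared = 0usize;
    for g in &grams_b {
        if let Some(n) = counts.get_mut(g.as_str()) {
            if *n > 0 {
                *n -= 1;
                shared += 1;
            }
        }
    }

    let all = (grams_a.len() + grams_b.len() - shared) as f32;
    if all == 0.0 {
        return 0.0;
    }
    let diff = all - shared as f32;
    (all.powf(warp) - diff.powf(warp)) / all.powf(warp)
}
```

Under this sketch, `"tomato"` and `"tomacco"` with `arity = 2` share 5 of 10 distinct bigram slots, so the score is 0.5 at `warp = 1.0` and 0.875 at `warp = 3.0`. A `threshold` of 0.75 would then keep the match only in the warped case.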
```rust
use noodler::NGram;

let ngram = NGram::<&str>::builder()
    .arity(2)
    .warp(3.0)
    .threshold(0.75)
    .build()
    // Feed with known words
    .fill(vec!["pie", "animal", "tomato", "seven", "carbon"]);

// Try an unknown/misspelled word, and find a similar match
let word = "tomacco";
let top = ngram.search_sorted(word).next();
if let Some((text, similarity)) = top {
    if similarity > 0.99 {
        println!("✔ {}", text);
    } else {
        println!(
            "❓{} (did you mean {}? [{:.0}% match])",
            word,
            text,
            similarity * 100.0
        );
    }
} else {
    println!("🗙 {}", word);
}
```

## 💭 Inspired by

Please check out these awesome works that helped a lot in the creation of
noodler:

- [python-ngram](https://github.com/gpoulter/python-ngram): Set that supports
  searching by ngram similarity.
- [ngrammatic](https://github.com/compenguy/ngrammatic): A rust crate providing
  fuzzy search/string matching using N-grams.

## 🚩 Minimal supported Rust version
All tests pass with `rustc v1.41`; earlier versions may not compile.
## ⚖️ License
Licensed under either of

- Apache License, Version 2.0 ([LICENSE-APACHE](LICENSE-APACHE))
- MIT license ([LICENSE-MIT](LICENSE-MIT))

at your option.