https://github.com/pwoolcoc/ngrams
(Read-only) Generate n-grams
Last synced: about 2 months ago
- Host: GitHub
- URL: https://github.com/pwoolcoc/ngrams
- Owner: pwoolcoc
- License: apache-2.0
- Created: 2015-11-16T02:28:12.000Z (over 10 years ago)
- Default Branch: master
- Last Pushed: 2016-08-30T14:41:33.000Z (over 9 years ago)
- Last Synced: 2024-04-22T13:31:49.819Z (almost 2 years ago)
- Language: Rust
- Homepage: https://pwoolcoc.gitlab.io/ngrams/ngrams
- Size: 57.1 MB
- Stars: 27
- Watchers: 3
- Forks: 4
- Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE-APACHE
Awesome Lists containing this project
- awesome-rust-cn - pwoolcoc/ngrams - Construct [n-grams](https://en.wikipedia.org/wiki/N-gram) from arbitrary iterators (Libraries / Text processing)
- awesome-rust - pwoolcoc/ngrams - Construct [n-grams](https://en.wikipedia.org/wiki/N-gram) from arbitrary iterators (Libraries / Text processing)
- awesome-rust - pwoolcoc/ngrams - Construct [n-grams](https://en.wikipedia.org/wiki/N-gram) from arbitrary iterators (Libraries / Text processing)
- awesome-rust-cn - pwoolcoc/ngrams
- awesome-rust-zh - pwoolcoc/ngrams - Construct [n-grams](https://en.wikipedia.org/wiki/N-gram) from arbitrary iterators (Libraries / Text processing)
- fucking-awesome-rust - pwoolcoc/ngrams - Construct 🌎 [n-grams](en.wikipedia.org/wiki/N-gram) from arbitrary iterators (Libraries / Text processing)
- awesome-rust - pwoolcoc/ngrams - Construct [n-grams](https://en.wikipedia.org/wiki/N-gram) from arbitrary iterators (Libraries / Text processing)
- awesome-rust - pwoolcoc/ngrams - Construct [n-grams](https://en.wikipedia.org/wiki/N-gram) from arbitrary iterators (Libraries / Text processing)
- awesome-rust-with-stars - pwoolcoc/ngrams - Construct n-grams from arbitrary iterators | 2016-08-30 | (Libraries / Text processing)
README
# N-grams
[GitLab repository](https://gitlab.com/pwoolcoc/ngrams)
[Coverage](https://coveralls.io/github/pwoolcoc/ngrams?branch=master)
[crates.io](https://crates.io/crates/ngrams)
[Documentation](https://pwoolcoc.gitlab.io/ngrams/ngrams)
This crate takes a sequence of tokens and generates n-grams from it.
For more information about n-grams, see Wikipedia: https://en.wikipedia.org/wiki/N-gram
*Note*: The canonical version of this crate is hosted on [GitLab](https://gitlab.com/pwoolcoc/ngrams)
## Usage
The easiest way to use the crate is probably through the iterator adaptor. If
your tokens are string-like (`&str`, `String`, `char`, or `Vec`), all you have
to do is produce the token stream:
```rust
use ngrams::Ngram;
let grams: Vec<_> = "one two three".split(' ').ngrams(2).collect();
// => vec![
// vec!["\u{2060}", "one"],
// vec!["one", "two"],
// vec!["two", "three"],
// vec!["three", "\u{2060}"],
// ]
```
(About the `"\u{2060}"`: the Unicode `WORD JOINER` character is used as padding at the beginning and
end of the token stream.)
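To make the padding behaviour concrete, here is a minimal standard-library sketch that reproduces the bigram output shown above. It is not the crate's implementation: the function name `padded_ngrams`, the choice of `n - 1` padding tokens on each side, and the handling of `n == 0` are assumptions made for illustration.

```rust
// Unicode WORD JOINER, the padding symbol described above.
const WORD_JOINER: &str = "\u{2060}";

// Hypothetical helper (not part of the ngrams crate): pads the token
// slice with `n - 1` word joiners on each side, then slides a window
// of size `n` across the padded sequence.
fn padded_ngrams<'a>(tokens: &[&'a str], n: usize) -> Vec<Vec<&'a str>> {
    if n == 0 {
        // Assumption: a zero-length n-gram yields nothing.
        return Vec::new();
    }
    let pad = n - 1;
    let mut padded: Vec<&'a str> = Vec::with_capacity(tokens.len() + 2 * pad);
    padded.extend(std::iter::repeat(WORD_JOINER).take(pad));
    padded.extend_from_slice(tokens);
    padded.extend(std::iter::repeat(WORD_JOINER).take(pad));
    padded.windows(n).map(|w| w.to_vec()).collect()
}

fn main() {
    let tokens: Vec<&str> = "one two three".split(' ').collect();
    for gram in padded_ngrams(&tokens, 2) {
        println!("{:?}", gram);
    }
}
```

For `n = 2` this yields four bigrams, with the word joiner appearing only in the first and last one.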
If your token type isn't one of the types listed above, you can still use the
iterator adaptor by implementing the `ngrams::Pad` trait for your type.
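The idea behind a padding trait can be sketched without the crate. In the example below, `PadSymbol`, `Token`, and `padded_ngrams_generic` are all hypothetical names, and `PadSymbol` is a local stand-in rather than the crate's actual `Pad` definition; check the crate documentation for the real trait and its required methods.

```rust
// Hypothetical stand-in for a padding trait. This is NOT the ngrams
// crate's `Pad` definition; it only illustrates the concept of a token
// type that knows its own padding symbol.
trait PadSymbol: Clone {
    fn pad_symbol() -> Self;
}

// A custom token type that is not a string.
#[derive(Clone, Debug, PartialEq)]
enum Token {
    Word(String),
    Boundary,
}

impl PadSymbol for Token {
    fn pad_symbol() -> Self {
        Token::Boundary
    }
}

// Generic padded n-gram generation for any type with a padding symbol.
// Assumption: `n - 1` padding tokens are added on each side.
fn padded_ngrams_generic<T: PadSymbol>(tokens: &[T], n: usize) -> Vec<Vec<T>> {
    if n == 0 {
        return Vec::new();
    }
    let pad = n - 1;
    let mut padded: Vec<T> = Vec::with_capacity(tokens.len() + 2 * pad);
    padded.extend(std::iter::repeat(T::pad_symbol()).take(pad));
    padded.extend_from_slice(tokens);
    padded.extend(std::iter::repeat(T::pad_symbol()).take(pad));
    padded.windows(n).map(|w| w.to_vec()).collect()
}

fn main() {
    let tokens = vec![Token::Word("a".into()), Token::Word("b".into())];
    for gram in padded_ngrams_generic(&tokens, 3) {
        println!("{:?}", gram);
    }
}
```

Keeping the padding symbol on the token type itself means the n-gram logic never needs to know what a "boundary" looks like for a given type.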