Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/fredwu/stemmer
An English (Porter2) stemming implementation in Elixir.
- Host: GitHub
- URL: https://github.com/fredwu/stemmer
- Owner: fredwu
- Created: 2016-07-18T13:42:33.000Z (over 8 years ago)
- Default Branch: master
- Last Pushed: 2024-01-14T01:49:21.000Z (11 months ago)
- Last Synced: 2024-08-09T05:47:14.446Z (4 months ago)
- Topics: bayes, porter, stemmer
- Language: Elixir
- Size: 199 KB
- Stars: 150
- Watchers: 4
- Forks: 9
- Open Issues: 0
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
- freaking_awesome_elixir - Elixir - An English (Porter2) stemming implementation in Elixir. (Text and Numbers)
- fucking-awesome-elixir - stemmer - An English (Porter2) stemming implementation in Elixir. (Text and Numbers)
- awesome-elixir - stemmer - An English (Porter2) stemming implementation in Elixir. (Text and Numbers)
README
# Stemmer [![Travis](https://img.shields.io/travis/fredwu/stemmer.svg)](https://travis-ci.org/fredwu/stemmer) [![Coverage](https://img.shields.io/coveralls/fredwu/stemmer.svg)](https://coveralls.io/github/fredwu/stemmer?branch=master) [![Hex.pm](https://img.shields.io/hexpm/v/stemmer.svg)](https://hex.pm/packages/stemmer)
An English ([Porter2](http://snowballstem.org/algorithms/english/stemmer.html)) stemming implementation in Elixir.
> In linguistic morphology and information retrieval, __stemming__ is the process of reducing inflected (or sometimes derived) words to their word stem, base or root form—generally a written word form. The stem need not be identical to the morphological root of the word; it is usually sufficient that related words map to the same stem, even if this stem is not in itself a valid root. - [Wikipedia](https://en.wikipedia.org/wiki/Stemming)
## Usage
The `Stemmer.stem/1` function supports stemming a single word (`String`), a sentence (`String`) or a list of single words (`List` of `String`s).
```elixir
Stemmer.stem("capabilities") # => "capabl"
Stemmer.stem("extraordinary capabilities") # => "extraordinari capabl"
Stemmer.stem(["extraordinary", "capabilities"]) # => ["extraordinari", "capabl"]
```
## Compatibility
Stemmer is 100% compatible with the official Porter2 implementation. It is tested against the official [`diffs.txt`](http://snowball.tartarus.org/algorithms/english/diffs.txt), which contains more than 29,000 words.
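A check of this kind could be sketched as follows. This is not the project's actual test suite: the `CompatCheck` module and its `mismatches/2` function are hypothetical names, and the sketch assumes each line of `diffs.txt` holds a word and its expected stem separated by whitespace.

```elixir
defmodule CompatCheck do
  # Hypothetical helper, not part of Stemmer.
  # Takes an enumerable of lines and a one-argument stemming function, and
  # returns the [word, expected_stem] pairs where the function disagrees.
  # Assumes each line looks like "word stem"; other lines are skipped.
  def mismatches(lines, stem_fun) do
    lines
    |> Stream.map(&String.split/1)
    |> Stream.filter(&match?([_word, _stem], &1))
    |> Enum.reject(fn [word, expected] -> stem_fun.(word) == expected end)
  end
end
```

With a local copy of the file, `CompatCheck.mismatches(File.stream!("diffs.txt"), &Stemmer.stem/1)` would be expected to return an empty list if every word stems as the reference output says. Passing the stemming function as an argument keeps the helper independent of the library, so it can also be exercised with a stub.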
## Naive Bayes
Stemmer was built to support the [Simple Bayes](https://github.com/fredwu/simple_bayes) library. :heart:
## License
Licensed under [MIT](http://fredwu.mit-license.org/).