https://github.com/steveklabnik/rust-stem-porter2
Porter 2 English Stemmer implemented in Rust
- Host: GitHub
- URL: https://github.com/steveklabnik/rust-stem-porter2
- Owner: steveklabnik
- License: mit
- Created: 2014-10-30T04:31:11.000Z (about 10 years ago)
- Default Branch: master
- Last Pushed: 2014-09-24T01:49:59.000Z (about 10 years ago)
- Last Synced: 2024-11-01T01:24:42.219Z (about 2 months ago)
- Size: 301 KB
- Stars: 1
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
Porter2 English Stemmer
=======================

[![Build Status](https://travis-ci.org/carols10cents/rust-stem-porter2.svg)](https://travis-ci.org/carols10cents/rust-stem-porter2)

This is an INCOMPLETE implementation of the [Porter2 English stemmer](http://snowball.tartarus.org/algorithms/english/stemmer.html) written in Rust. It's a little toy project for me to learn Rust with while doing something somewhat useful.

I'm currently using rustc 0.12.0-nightly (>72841b128 2014-09-21 20:00:29 +0000) in order to get Cargo.

Many thanks to [mrordinaire's porter stemmer in rust](https://github.com/mrordinaire/rust-stem) for the start it gave me!!

Compiling
=========

I'm using [Cargo](http://crates.io/)!!! Just run `cargo build`!!!!

Running the tests
=================

I'm using [Cargo](http://crates.io/)!!! Just run `cargo test`!!!!

The tests are really just one test with a lot of cases: it runs through the words in `test-data/voc.txt` and asserts that the stem of each word matches the corresponding line in `test-data/porter2-output.txt`.
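
As a rough illustration of that line-by-line comparison, here is a minimal sketch in current Rust (not the 0.12-era syntax this project was written in). The `mismatches` helper and the `stem` parameter are hypothetical names made up for this example; the stemming function this crate actually exposes may look different.

```rust
/// Pair each word with the expected stem on the same line and collect
/// every case where the stemmer disagrees.
/// Returns (word, expected stem, actual stem) for each mismatch.
/// `stem` is a stand-in for whatever stemming function is under test.
fn mismatches<F>(words: &str, expected: &str, stem: F) -> Vec<(String, String, String)>
where
    F: Fn(&str) -> String,
{
    words
        .lines()
        // `zip` stops at the shorter input, so a truncated file is not detected here.
        .zip(expected.lines())
        .filter_map(|(word, want)| {
            let got = stem(word);
            if got == want {
                None
            } else {
                Some((word.to_string(), want.to_string(), got))
            }
        })
        .collect()
}
```

A test built on this could read both files with `std::fs::read_to_string` and assert that the returned vector is empty, which reports all failing words at once instead of stopping at the first one.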

Stemming
========

After compiling, you should have a binary in `target/stem` that will read a list of words, one per line, from stdin and print their stems to stdout.

Example:

    ./target/stem < test-data/voc.txt > output.txt
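
For readers curious what such a stdin-to-stdout loop might look like, the following is a hedged sketch in current Rust, not the code in this repository. `stem_lines` and its `stem` argument are hypothetical names chosen for the example, and trimming whitespace around each word is an assumption of the sketch.

```rust
use std::io::{self, BufRead, Write};

/// Read one word per line from `input` and write its stem, one per line,
/// to `output`. `stem` is a placeholder for the real stemming function.
fn stem_lines<R, W, F>(input: R, mut output: W, stem: F) -> io::Result<()>
where
    R: BufRead,
    W: Write,
    F: Fn(&str) -> String,
{
    for line in input.lines() {
        let word = line?;
        // Trimming surrounding whitespace is an assumption of this sketch.
        writeln!(output, "{}", stem(word.trim()))?;
    }
    Ok(())
}
```

Under these assumptions, a binary's `main` could call `stem_lines(io::stdin().lock(), io::stdout().lock(), my_stem)`, where `my_stem` is whatever stemming function is available. Taking generic readers and writers keeps the loop testable with in-memory buffers.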

License
=======

MIT. See LICENSE.