https://github.com/torfsen/purestemmer
Pure-Python implementations of the Snowball stemming algorithms.
- Host: GitHub
- URL: https://github.com/torfsen/purestemmer
- Owner: torfsen
- License: other
- Created: 2014-08-05T17:50:31.000Z (almost 12 years ago)
- Default Branch: master
- Last Pushed: 2014-08-08T11:47:41.000Z (almost 12 years ago)
- Last Synced: 2025-01-02T18:25:07.512Z (over 1 year ago)
- Language: Python
- Size: 1.83 MB
- Stars: 1
- Watchers: 3
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.rst
- Changelog: CHANGES
- License: LICENSE
*purestemmer* - A pure-Python implementation of the Snowball stemmers
#####################################################################
The traditional way of using the `Snowball stemmers`_ in Python is via
the pystemmer_ package, which provides a Python wrapper around the
Snowball C library. However, Python C extensions are problematic in
some environments. Therefore, this package provides pure-Python
implementations of the Snowball stemming algorithms.
The implementations of the stemming algorithms are translated from the
Snowball language to Python via sbl2py_.
.. _`Snowball stemmers`: http://snowball.tartarus.org/
.. _pystemmer: https://pypi.python.org/pypi/PyStemmer
.. _sbl2py: https://pypi.python.org/pypi/sbl2py
Installation
============
Installing *purestemmer* is easy using pip_::

    pip install purestemmer

.. _pip: http://pip.readthedocs.org/en/latest/index.html
Usage
=====
Usually, you'll prefer to use the *pystemmer* module whenever that is
possible, because it's much faster than *purestemmer*::

    try:
        import Stemmer
    except ImportError:
        # pystemmer is not available, use purestemmer instead
        import purestemmer as Stemmer

Since *purestemmer* has the same public API and provides the same
algorithms as *pystemmer*, there should be no need to change any code
when switching between *pystemmer* and *purestemmer* like this.
Please see the *pystemmer* documentation for details on how to use the
stemming algorithms.
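
As a rough illustration of the drop-in pattern above, the sketch below
picks whichever module is importable and stems a list of words. It
assumes the *pystemmer* calls ``Stemmer(language)`` and
``stemWords(words)``, which *purestemmer* is stated to mirror. The helper
names ``load_stemmer_module`` and ``stem_all`` are hypothetical and not
part of either package.

```python
import importlib


def load_stemmer_module(candidates=("Stemmer", "purestemmer")):
    """Return the first importable module among ``candidates``, or None.

    "Stemmer" is the module name installed by pystemmer, so the default
    order prefers the faster C-backed implementation.
    """
    for name in candidates:
        try:
            return importlib.import_module(name)
        except ImportError:
            continue
    return None


def stem_all(words, language="english", module=None):
    """Stem ``words`` with whichever backend is available.

    ``module`` may be passed explicitly (useful for testing); otherwise
    the backend is looked up with ``load_stemmer_module``.
    """
    if module is None:
        module = load_stemmer_module()
    if module is None:
        raise RuntimeError("neither pystemmer nor purestemmer is installed")
    # Assumed API: both packages expose Stemmer(language).stemWords(list).
    stemmer = module.Stemmer(language)
    return stemmer.stemWords(words)
```

Calling ``stem_all(["cycling", "cycles"])`` would then use *pystemmer*
when it is installed and fall back to *purestemmer* otherwise.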
Differences between *purestemmer* and *pystemmer*
=================================================
* *purestemmer* has only been tested on Python 2.7
* ``purestemmer.Stemmer`` instances are thread-safe
* *purestemmer* is on average about 100x slower than *pystemmer*
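
Given the speed gap noted above, one possible mitigation for texts that
repeat many words is to memoize results in front of the stemmer. The
sketch below is an illustration under stated assumptions, not part of
*purestemmer*: ``CachingStemmer`` is a hypothetical wrapper that relies
only on a ``stemWord(word)`` method as found in *pystemmer*. Its plain
dictionary cache is unbounded and is not itself synchronized across
threads.

```python
class CachingStemmer(object):
    """Hypothetical wrapper that memoizes ``stemWord`` results."""

    def __init__(self, stemmer):
        # Any object with a stemWord(word) method will do.
        self._stemmer = stemmer
        self._cache = {}

    def stemWord(self, word):
        """Return the cached stem, computing it on the first request."""
        try:
            return self._cache[word]
        except KeyError:
            stem = self._stemmer.stemWord(word)
            self._cache[word] = stem
            return stem

    def stemWords(self, words):
        """Stem each word in order, reusing cached results."""
        return [self.stemWord(word) for word in words]
```

Whether this pays off depends on how often words repeat in the input;
for mostly unique tokens the cache only adds memory overhead.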
License
=======
*purestemmer* itself is covered by the `MIT License`_. The underlying
Snowball algorithms are covered by the `BSD-3 License`_. Please see the
``LICENSE`` file for details.
.. _`MIT License`: http://opensource.org/licenses/MIT
.. _`BSD-3 License`: http://opensource.org/licenses/BSD-3-Clause