https://github.com/sazid1462/py-bangla-stemmer
Rule based Bengali Stemmer written in python
bangla bengali rule-based-stemmer stemmer
- Host: GitHub
- URL: https://github.com/sazid1462/py-bangla-stemmer
- Owner: sazid1462
- License: gpl-3.0
- Created: 2019-04-11T09:04:56.000Z (about 7 years ago)
- Default Branch: master
- Last Pushed: 2019-07-03T09:07:03.000Z (almost 7 years ago)
- Last Synced: 2025-12-17T02:47:06.432Z (6 months ago)
- Topics: bangla, bengali, rule-based-stemmer, stemmer
- Language: Python
- Size: 37.1 KB
- Stars: 0
- Watchers: 1
- Forks: 2
- Open Issues: 1
Metadata Files:
- Readme: README.rst
- License: LICENSE
===========================
Rule Based Bangla Stemmer
===========================
**contents**

- `Installation`_
- `Usage`_
- `Rules Documentation (Only for Development)`_

Installation
-------------
.. code-block:: bash

    $ pip install py_bangla_stemmer

Usage
----------
.. code-block:: python

    >>> from py_bangla_stemmer import BanglaStemmer
    >>>
    >>> stemmer = BanglaStemmer()
    >>> stemmer.stem('জনপ্রিয়তা')  # 'জনপ্রি'
    >>> stemmer.stem(' সেটাই')  # 'সে'

Rules Documentation (Only for Development)
----------------------------------------------
The following documentation is intended for further development of the stemmer. The ``py_bangla_stemmer/resources`` folder contains a file named ``common.rules``. Below is the information you need in order to change the rules.

.. code-block:: math

    X + n :
    When X appears at the end of a word and word length is at least n, remove it

.. code-block:: math

    Y -> Z + n :
    When Y appears at the end of a word and word length is at least n, replace it with Z

.. code-block:: math

    Y.Z -> A.B + n :
    When Y, followed by some character a, followed by Z, appears at the end of a word
    and word length is at least n, replace it with AaB.
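
The three rule forms above can be sketched in Python as follows. This is only an illustration of the documented semantics, not the stemmer's actual implementation: ``apply_rule`` and its parameters are hypothetical names, rules are passed as plain strings rather than parsed from ``common.rules``, and word length is assumed to be counted in Unicode code points.

.. code-block:: python

    import re


    def apply_rule(word, pattern, replacement="", min_length=0):
        """Return the rewritten word, or None when the rule does not apply.

        Hypothetical helper: ``pattern`` and ``replacement`` may each
        contain one "." standing for a single arbitrary character.
        """
        # Assumption: "word length" means the number of code points.
        if len(word) < min_length:
            return None
        # "Y.Z" becomes the regex "Y(.)Z" anchored at the end of the word;
        # a plain "X" simply becomes "X" anchored at the end.
        parts = [re.escape(part) for part in pattern.split(".")]
        regex = "(.)".join(parts) + r"\Z"
        match = re.search(regex, word)
        if match is None:
            return None
        # Put the captured character back where the replacement has ".".
        middle = match.group(1) if match.groups() else ""
        return word[:match.start()] + replacement.replace(".", middle)


    # X + n: remove the suffix.
    print(apply_rule("wordX", "X", min_length=3))               # word
    # Y -> Z + n: replace the suffix.
    print(apply_rule("wordY", "Y", "Z", min_length=3))          # wordZ
    # Y.Z -> A.B + n: keep the character between Y and Z.
    print(apply_rule("wordYaZ", "Y.Z", "A.B", min_length=3))    # wordAaB
    # Word shorter than n: the rule is skipped.
    print(apply_rule("aX", "X", min_length=3))                  # None
    # A made-up Bangla rule, not necessarily present in common.rules.
    print(apply_rule("সেটাই", "ই", min_length=3))                # সেটা

Returning ``None`` for a rule that does not apply lets a caller try the next rule in turn. How the real stemmer orders its rules and chooses between them is not covered here.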