https://github.com/rajspeaks/machine-learning-approach-to-bengali-pos-tagging-using-nltk
Bengali POS tagging with NLTK's Indian corpus. A sample exercise in applying POS tagging, carried out under the supervision of Prof. Sandipan Ganguly, HIT-K.
- Host: GitHub
- URL: https://github.com/rajspeaks/machine-learning-approach-to-bengali-pos-tagging-using-nltk
- Owner: Rajspeaks
- License: gpl-3.0
- Created: 2022-01-26T16:32:21.000Z (almost 3 years ago)
- Default Branch: main
- Last Pushed: 2022-05-02T06:30:28.000Z (over 2 years ago)
- Last Synced: 2023-03-09T04:51:59.942Z (almost 2 years ago)
- Topics: bangla-pos-tagging, bangla-postag, bengali, bengali-natural-language-processing, bengali-nlp, bengali-pos-tagging, indian-language, machine-learning, natural-language-processing, nlp, nltk, pos-tagger, pos-tagging, rajdeep-das, rajspeaks
- Language: Jupyter Notebook
- Homepage:
- Size: 76.2 KB
- Stars: 3
- Watchers: 1
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# Machine Learning approach to Bengali POS Tagging using NLTK on Indian-Corpus
The Indian corpus is a collection of part-of-speech-tagged text in four Indian languages: Bengali, Hindi, Marathi, and Telugu.
NLTK (Natural Language Toolkit) is a Python library for natural language processing.
## Methodology
- Imported NLTK (Natural Language Toolkit).
- Imported the Indian corpus from NLTK.
- Selected 'bangla.pos', the Bengali file of the Indian corpus.
- Stored 'bangla.pos' in the variable 'tagged_set'.
- Stored the Bengali sentences from that file in the variable 'word_set'.
- Used a for loop to count the number of sentences in the corpus.
## Tools & Library requirements:
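The methodology steps above can be sketched with the tools listed in this section. This is a minimal sketch, not the notebook's actual code: the function names `load_bengali_sentences` and `count_sentences` are hypothetical, and it assumes NLTK is installed and the `indian` corpus can be downloaded. The variable names `tagged_set` and `word_set` follow the steps described above.

```python
def load_bengali_sentences(file_id="bangla.pos"):
    """Return the sentences of the Bengali file in NLTK's Indian corpus.

    Hypothetical helper: the name and signature are illustrative only.
    """
    # Imported here so the counting helper below also works without NLTK.
    import nltk
    from nltk.corpus import indian

    # Fetch the Indian corpus if it is not already available locally.
    nltk.download("indian", quiet=True)

    tagged_set = file_id                 # corpus file identifier
    word_set = indian.sents(tagged_set)  # each sentence is a list of word tokens
    return word_set


def count_sentences(sentences):
    """Count sentences with an explicit for loop (hypothetical helper)."""
    count = 0
    for _ in sentences:
        count += 1
    return count
```

In a Colab or Jupyter cell, `word_set = load_bengali_sentences()` followed by `print(count_sentences(word_set))` would print the number of Bengali sentences in the corpus. The explicit loop mirrors the steps above; `len(word_set)` would be the shorter equivalent.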
- Google Colab/Jupyter
- Language: Python
- NLTK Library
## Mentor:
Prof. Sandipan Ganguly
## Developer:
Rajdeep Das
### Reference:
[Click here](https://medium.com/analytics-vidhya/bengali-pos-part-of-speech-tagging-using-indian-corpus-e85f47d3ad65) to read the source article.