https://github.com/rajspeaks/machine-learning-approach-to-bengali-pos-tagging-using-nltk

Bengali POS tagging using the Indian corpus through NLTK. A sample test applying POS tagging, carried out under the supervision of Prof. Sandipan Ganguly, HIT-K.

Host: GitHub
URL: https://github.com/rajspeaks/machine-learning-approach-to-bengali-pos-tagging-using-nltk
Owner: Rajspeaks
License: gpl-3.0
Created: 2022-01-26T16:32:21.000Z (almost 3 years ago)
Default Branch: main
Last Pushed: 2022-05-02T06:30:28.000Z (over 2 years ago)
Last Synced: 2023-03-09T04:51:59.942Z (almost 2 years ago)
Topics: bangla-pos-tagging, bangla-postag, bengali, bengali-natural-language-processing, bengali-nlp, bengali-pos-tagging, indian-language, machine-learning, natural-language-processing, nlp, nltk, pos-tagger, pos-tagging, rajdeep-das, rajspeaks
Language: Jupyter Notebook
Size: 76.2 KB
Stars: 3
Watchers: 1
Forks: 1
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE

# Machine Learning approach to Bengali POS Tagging using NLTK on Indian-Corpus

The NLTK Indian corpus is a collection of part-of-speech-tagged text in four Indian languages: Bengali, Hindi, Marathi, and Telugu.
NLTK (Natural Language Toolkit) is a Python library for natural language processing.

## Methodology

- Imported NLTK (Natural Language Toolkit).
- Imported the Indian corpus from NLTK.
- Selected the Bengali part of the Indian corpus, the file 'bangla.pos'.
- Stored the file name 'bangla.pos' in the variable 'tagged_set'.
- Loaded the Bengali sentences from the corpus into the variable 'word_set'.
- Used a for loop to count the number of sentences in the corpus.
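
The steps above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the variable names `tagged_set` and `word_set` come from the list above, while the helper names `load_bengali_sentences` and `count_sentences` are hypothetical, and the `nltk.download("indian")` call is an assumption about how the corpus data gets onto the machine.

```python
def load_bengali_sentences(fileid="bangla.pos"):
    """Return the Bengali sentences of NLTK's Indian corpus (hypothetical helper)."""
    # Imported lazily so that count_sentences below works without NLTK installed.
    import nltk
    from nltk.corpus import indian

    # Assumption: the corpus is not yet on disk, so fetch it on first use.
    nltk.download("indian", quiet=True)

    tagged_set = fileid  # file name of the Bengali part of the corpus
    return indian.sents(tagged_set)  # each sentence is a list of word tokens


def count_sentences(word_set):
    """Count sentences with an explicit for loop, as described in the methodology."""
    count = 0
    for _sentence in word_set:
        count += 1
    return count
```

With NLTK installed and network access for the one-time download, `count_sentences(load_bengali_sentences())` should give the number of Bengali sentences in the corpus.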

## Tools & Library requirements:

- Google Colab/Jupyter
- Language: Python
- NLTK Library

## Mentor:

Prof. Sandipan Ganguly

## Developer:

Rajdeep Das

## Reference:

[Click here](https://medium.com/analytics-vidhya/bengali-pos-part-of-speech-tagging-using-indian-corpus-e85f47d3ad65) to read the source article.