https://github.com/jessicasaikia/rule-based

# **Rule-Based Model**
This repository contains a simple Rule-Based Model for Parts-of-Speech tagging in Assamese-English code-mixed texts.

## Introduction to Parts-of-Speech Tagging (PoS Tagging)
PoS tagging is the process of identifying and labelling the grammatical role of each word in a text. It supports applications such as machine translation and sentiment analysis. While different languages may have their own PoS tag sets, I have used my own custom PoS tags for this model. The table below defines them:

![Table](https://github.com/jessicasaikia/rule-based/blob/main/Custom%20PoS%20tags%20Table.png)

## How does this work?
1. The code starts by importing all the necessary libraries.
2. After that, I added the "dictionaries", which are CSV files containing words and their respective PoS tags. I made two dictionaries: one for English (English words and their PoS tags) and one for Assamese.
3. When I run the code, it asks for my input, either a "CSV file" or a "sentence", and then tags it.
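
The steps above can be sketched roughly as follows. This is a minimal illustration of dictionary-lookup tagging, not the repository's actual code: the column names `word` and `tag`, the sample entries, the tag labels, the `UNK` fallback, and the whitespace tokenization are all assumptions made for the example.

```python
import csv
import io

# Hypothetical dictionary contents; the real CSV files, columns, and tags may differ.
ENGLISH_CSV = "word,tag\nbook,NOUN\nread,VERB\n"
ASSAMESE_CSV = "word,tag\nমই,PRON\nভাল,ADJ\n"


def load_dictionary(csv_file):
    """Read word/tag rows from an open CSV file into a dict with lower-cased keys."""
    reader = csv.DictReader(csv_file)
    return {row["word"].strip().lower(): row["tag"].strip() for row in reader}


def tag_sentence(sentence, dictionaries, unknown_tag="UNK"):
    """Tag each whitespace-separated token using the first dictionary that contains it."""
    tagged = []
    for token in sentence.split():
        key = token.lower()
        tag = unknown_tag  # fallback when no dictionary knows the word
        for dictionary in dictionaries:
            if key in dictionary:
                tag = dictionary[key]
                break
        tagged.append((token, tag))
    return tagged


# io.StringIO stands in for files opened with open("your_file.csv", encoding="utf-8").
english = load_dictionary(io.StringIO(ENGLISH_CSV))
assamese = load_dictionary(io.StringIO(ASSAMESE_CSV))

result = tag_sentence("মই book read kitap", [assamese, english])
print(result)
```

Here the dictionaries are checked in the order they are passed, so a word present in both would receive the tag from the first one. Whether the original model resolves such overlaps this way is not stated in this README.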

## Where should you run this code?
I used **Google Colab** for this model.
1. Simply create a new notebook (or file) on Google Colab.
2. Paste the code.
3. Upload your dictionaries to Google Colab.
4. Make sure you update the "dictionaries" part of the code to match your CSV file names and file paths.
5. Run the code.
6. Enter your preferred input type (CSV or Sentence).
7. The output will be displayed and saved to a separate CSV file.
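
Steps 6 and 7 could look something like the sketch below. It is only an assumption about how the input choice and the output file might be handled: the function names `read_sentences` and `save_tagged`, the input CSV layout (one sentence per row, first column, no header), the output columns, and the placeholder tagger are all hypothetical.

```python
import csv
import os
import tempfile


def read_sentences(input_type, value):
    """Return a list of sentences from a raw sentence or from a CSV file path."""
    choice = input_type.strip().lower()
    if choice == "sentence":
        return [value]
    if choice == "csv":
        with open(value, newline="", encoding="utf-8") as handle:
            # Assumption: one sentence per row, in the first column, no header row.
            return [row[0] for row in csv.reader(handle) if row]
    raise ValueError(f"Unknown input type: {input_type!r}")


def save_tagged(tagged_sentences, output_path):
    """Write one CSV row per token: sentence number, token, tag."""
    with open(output_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["sentence_id", "token", "tag"])
        for sentence_id, tagged in enumerate(tagged_sentences, start=1):
            for token, tag in tagged:
                writer.writerow([sentence_id, token, tag])


# Demo with a made-up sentence and a placeholder tagger that marks every token "UNK".
sentences = read_sentences("Sentence", "moi book porhi asu")
tagged_sentences = [[(token, "UNK") for token in s.split()] for s in sentences]

# A temporary directory keeps the demo from overwriting any real file.
output_path = os.path.join(tempfile.mkdtemp(), "tagged_output.csv")
save_tagged(tagged_sentences, output_path)

with open(output_path, newline="", encoding="utf-8") as handle:
    saved_rows = list(csv.reader(handle))
print(saved_rows)
```

Opening the files with `encoding="utf-8"` matters here because Assamese text is written in a non-ASCII script.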

You can also use VS Code or any other platform, since this is just Python code.
1. In this case, make sure you have the necessary libraries installed and the dictionaries loaded correctly.
2. Run the program to get the output.

## Additional Notes from me
If you need help or have any questions, you can reach out to me in the comments or via my socials.
My socials are:
- Discord: jessicasaikia
- Instagram: jessicasaikiaa
- LinkedIn: jessicasaikia (www.linkedin.com/in/jessicasaikia-787a771b2)

Additionally, you can find the custom dictionaries that I have used in this project and the dataset in their respective repositories on my profile. Have fun coding and good luck! :D