https://github.com/snigdho8869/multiclass-text-classification

Natural Language Processing for Multiclass Classification: A repository containing NLP techniques for multiclass classification of text data.



Host: GitHub
URL: https://github.com/snigdho8869/multiclass-text-classification
Owner: Snigdho8869
Created: 2023-04-03T19:14:54.000Z (about 2 years ago)
Default Branch: main
Last Pushed: 2025-03-16T21:07:19.000Z (3 months ago)
Last Synced: 2025-04-15T06:49:16.993Z (2 months ago)
Topics: adaboost-classifier, bert-fine-tuning, cnn-model, deep-learning, ensemble-learning, flask, flask-application, gradient-boosting-classifier, gru, keras-tensorflow, logistic-regression, machine-learning, natural-language-processing, neural-network, nlp, random-forest-classifier, rnn-lstm, support-vector-machines, text-classification, xlnet-fine-tuning
Language: Jupyter Notebook
Homepage:
Size: 11.3 MB
Stars: 24
Watchers: 2
Forks: 5
Open Issues: 0
Metadata Files:
- Readme: README.md

README

# Multiclass Text Classification Project

## Project Overview

The goal of this project is to classify text data into predefined categories using a combination of traditional machine learning models and deep learning architectures. The project includes:

- A **Flask-based web application** for interactive text classification.

- **Preprocessing** of text data, including cleaning, tokenization, and lemmatization.

- Training and evaluation of multiple models, including:

  - Traditional ML models: Logistic Regression, SVM, Naive Bayes, Random Forest, Gradient Boosting, AdaBoost, and an Ensemble model.

  - Deep learning models: LSTM, GRU, CNN, and a hybrid LSTM+CNN model.

  - Fine-tuning of transformer-based models: BERT and XLNet using **ktrain**.

- Visualization of results, including confusion matrices, accuracy plots, and word clouds.
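The preprocessing step listed above (cleaning, tokenization, lemmatization) can be sketched with the standard library alone. This is a minimal illustration, not the repository's code: the function names and the tiny `LEMMA_TABLE` lookup are hypothetical stand-ins, and a real pipeline would typically use a proper lemmatizer from an NLP library instead of a hand-written table.

```python
import re

# Hypothetical, deliberately tiny lookup used in place of a real lemmatizer.
LEMMA_TABLE = {"markets": "market", "shares": "share", "rose": "rise", "games": "game"}


def clean_text(text: str) -> str:
    """Lowercase, strip HTML-like tags, and keep only letters and spaces."""
    text = text.lower()
    text = re.sub(r"<[^>]+>", " ", text)   # remove tags such as <br>
    text = re.sub(r"[^a-z\s]", " ", text)  # remove digits and punctuation
    return re.sub(r"\s+", " ", text).strip()


def tokenize(text: str) -> list[str]:
    """Split cleaned text on whitespace."""
    return text.split()


def lemmatize(tokens: list[str]) -> list[str]:
    """Map each token to its base form; unknown tokens pass through unchanged."""
    return [LEMMA_TABLE.get(tok, tok) for tok in tokens]


def preprocess(text: str) -> list[str]:
    """Run cleaning, tokenization, and lemmatization in order."""
    return lemmatize(tokenize(clean_text(text)))


print(preprocess("Shares in the markets rose 3% on Monday!<br>"))
# ['share', 'in', 'the', 'market', 'rise', 'on', 'monday']
```

Keeping each stage as its own function makes it easy to swap the lookup table for a library lemmatizer without touching the cleaning or tokenization code.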

---

# Requirements:

* Python 

* Scikit-learn

* TensorFlow 

* Keras

# Dataset:

The dataset used in this project is the bbc-text dataset, which consists of approximately 2,225 text documents.

# Results:

The results of each model on the bbc-text dataset are as follows:

| Model | Accuracy |
|----------|----------|
| Logistic Regression | 96.58% |
| Support Vector Machine | 96.94% |
| Multinomial Naive Bayes | 94.97% |
| Random Forest | 95.15% |
| Gradient Boosting Classifier | 94.25% |
| Ensemble Classifier | 97.12% |
| AdaBoost | 94.43% |
| LSTM 1-Layer | 99.22% |
| LSTM 2-Layers | 97.78% |
| GRU | 91.74% |
| CNN+LSTM | 98.73% |
| BERT | 99.60% |
| XLNet | 99.46% |
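
Accuracy here means the share of test documents whose predicted category matches the true one. The sketch below shows one way such a figure, and a confusion matrix like those mentioned in the overview, can be computed. The labels are made up for illustration and are not the project's data, and the function names are hypothetical rather than taken from the repository.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def confusion_matrix(y_true, y_pred, labels):
    """Rows are true classes, columns are predicted classes, ordered as in `labels`."""
    counts = Counter(zip(y_true, y_pred))
    return [[counts[(t, p)] for p in labels] for t in labels]


# Made-up labels for illustration only; not results from the project.
labels = ["business", "sport", "tech"]
y_true = ["business", "sport", "tech", "sport", "tech", "business"]
y_pred = ["business", "sport", "tech", "tech", "tech", "business"]

print(f"accuracy = {accuracy(y_true, y_pred):.2%}")  # accuracy = 83.33%
for label, row in zip(labels, confusion_matrix(y_true, y_pred, labels)):
    print(f"{label:>8}: {row}")
```

The off-diagonal entry in the `sport` row shows the single misclassification (a sport document predicted as tech), which is the kind of per-class detail a single accuracy number hides.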

# Application Interface
