https://github.com/rohithgowdam/cyberbullying-classification

The project identifies the most accurate of several candidate models for detecting cyberbullying in text. Each model is trained on the same dataset, which is preprocessed and vectorized with TF-IDF.

classification cyberbullying-detection decision-trees logistic-regression machine-learning mlproject naive-bayes-classifier preprocessing random-forest tf-idf tf-idf-vectorizer tweets vectorization

Last synced: 3 months ago

Host: GitHub
URL: https://github.com/rohithgowdam/cyberbullying-classification
Owner: RohithgowdaM
License: apache-2.0
Created: 2024-11-27T18:50:27.000Z (6 months ago)
Default Branch: main
Last Pushed: 2025-02-16T11:00:14.000Z (3 months ago)
Last Synced: 2025-02-16T12:17:23.344Z (3 months ago)
Topics: classification, cyberbullying-detection, decision-trees, logistic-regression, machine-learning, mlproject, naive-bayes-classifier, preprocessing, random-forest, tf-idf, tf-idf-vectorizer, tweets, vectorization
Language: Jupyter Notebook
Homepage:
Size: 2.91 MB
Stars: 1
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.MD
- License: LICENSE

README

# Cyberbullying Classification Model

This project aims to build a machine learning model that classifies cyberbullying content in tweets. Each tweet is assigned to one of six classes: age, ethnicity, gender, religion, other cyberbullying, or not cyberbullying.

## Project Overview

- **Objective**: Create a classification model to predict cyberbullying types in tweets.
- **Dataset**: The dataset consists of tweets labeled with the type of cyberbullying they contain. The types include:
  - Age
  - Ethnicity
  - Gender
  - Religion
  - Not Cyberbullying
  - Other Cyberbullying

- **Algorithms Used**:
  - Naïve Bayes
  - Logistic Regression
  - Decision Tree
  - Random Forest

- **Key Features**:
  - Text preprocessing (including tokenization, stemming, and stopword removal)
  - TF-IDF vectorization for feature extraction
  - Evaluation using confusion matrix, classification report, and accuracy score
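
The preprocessing step listed above can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the project's actual code: the function names `clean_tweet` and `naive_stem`, the tiny stopword set, and the suffix-stripping rule are all assumptions made for this example. A real run would more likely rely on a fuller stopword list and an established stemmer.

```python
import re
from typing import List

# Tiny illustrative stopword set (an assumption for this sketch);
# a real pipeline would likely use a much fuller list.
STOPWORDS = {"a", "an", "and", "are", "is", "the", "to", "you", "so", "of", "in"}


def naive_stem(token: str) -> str:
    """Strip a few common suffixes. A crude stand-in for a real stemmer."""
    for suffix in ("ing", "ed", "es", "s"):
        # Only strip when at least three characters of the word remain.
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def clean_tweet(text: str) -> List[str]:
    """Lowercase, remove URLs and mentions, tokenize, drop stopwords, stem."""
    text = text.lower()
    text = re.sub(r"https?://\S+", " ", text)  # drop URLs
    text = re.sub(r"@\w+", " ", text)          # drop @mentions
    text = text.replace("#", " ")              # keep the hashtag word itself
    tokens = re.findall(r"[a-z]+", text)       # alphabetic tokens only
    return [naive_stem(t) for t in tokens if t not in STOPWORDS]


print(clean_tweet("@user You are SO annoying!! Stop posting #lies http://t.co/x"))
```

The returned token list can be joined back into a string before it is passed to a TF-IDF vectorizer.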

## Installation

To run this project, you need to have Python installed along with the required libraries. You can install the dependencies using the `requirements.txt` file.

### Steps to Install:

1. Clone the repository to your local machine:
```bash
git clone https://github.com/RohithgowdaM/cyberbullying-classification.git
```

2. Navigate to the project directory:
```bash
cd cyberbullying-classification
```

3. Install the required Python packages:
```bash
pip install -r requirements.txt
```

## Usage

1. **Data Preprocessing**: The script processes the tweet data by performing tasks like tokenization, lemmatization, and stopword removal.
2. **Model Training and Evaluation**: Different machine learning models (Naïve Bayes, Logistic Regression, Decision Tree, Random Forest) are trained, and their performance is evaluated using accuracy, confusion matrix, and classification report.
3. **Visualization**: The confusion matrices for each model are displayed using heatmaps for better understanding of model performance.

To run the project, you can execute the main script (e.g., `program.py` or `main.py`).

```bash
python main.py
```
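
The training and evaluation steps described in this section might look something like the sketch below, assuming scikit-learn is installed. The toy tweets and labels are made-up placeholders (three of the six classes, for brevity), `MultinomialNB` is an assumed choice of Naïve Bayes variant, and the variable names and hyperparameters are illustrative rather than taken from the repository.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.tree import DecisionTreeClassifier

# Made-up placeholder data; the real project uses a labeled tweet dataset.
texts = [
    "you are too old to be online grandpa",
    "old people should not use phones",
    "kids your age know nothing at all",
    "go back to school little child",
    "your religion is a joke",
    "people of that faith are all the same",
    "nobody respects your church or beliefs",
    "mocking believers and their prayers again",
    "lovely weather for a walk today",
    "just finished a great book",
    "coffee with friends this morning",
    "watching the match tonight with family",
]
labels = ["age"] * 4 + ["religion"] * 4 + ["not_cyberbullying"] * 4
class_names = sorted(set(labels))

# Stratified split so every class appears in both train and test sets.
X_train, X_test, y_train, y_test = train_test_split(
    texts, labels, test_size=0.25, stratify=labels, random_state=0
)

# Fit TF-IDF on the training text only, then reuse it for the test text.
vectorizer = TfidfVectorizer()
X_train_vec = vectorizer.fit_transform(X_train)
X_test_vec = vectorizer.transform(X_test)

models = {
    "Naive Bayes": MultinomialNB(),
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Random Forest": RandomForestClassifier(n_estimators=50, random_state=0),
}

results = {}
matrices = {}
for name, model in models.items():
    model.fit(X_train_vec, y_train)
    predictions = model.predict(X_test_vec)
    results[name] = accuracy_score(y_test, predictions)
    matrices[name] = confusion_matrix(y_test, predictions, labels=class_names)
    print(f"== {name}: accuracy {results[name]:.2f}")
    print(classification_report(y_test, predictions, zero_division=0))

# Pick the model with the highest test accuracy.
best_model = max(results, key=results.get)
print("Highest accuracy:", best_model)
```

Each array in `matrices` could then be passed to a plotting call such as `seaborn.heatmap` to reproduce the heatmap visualization mentioned above. On a dataset this small the scores are not meaningful; the sketch only shows the shape of the workflow.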
