https://github.com/akashkg03/spam-email-classification

This notebook builds a spam email classifier using Naive Bayes, with feature extraction performed by CountVectorizer.

classification countvectorizer jupiter-notebook naive-bayes-classifier pandas python

## Spam-Email-Classification

### Project Overview
The objective of this project was to develop a machine learning model that classifies emails as either spam or non-spam (ham). Email spam classification is a common problem in natural language processing (NLP) and has significant applications in email filtering systems.

### Problem Statement
The goal of the project was to build a classifier that accurately differentiates between spam and non-spam emails. This involved preprocessing the email text data, extracting relevant features, training a classification model, and evaluating its performance.

### Dataset
The project used a publicly available dataset of labeled emails, where each email is marked as spam or ham. The dataset contains both the email text and the corresponding labels.
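
A dataset of this shape might be handled in pandas roughly as follows. This is only a sketch: the column names `text` and `label`, the derived `is_spam` column, and the four inline rows are illustrative assumptions, not the schema or contents of the actual dataset.

```python
import pandas as pd

# Hypothetical stand-in for the real dataset: the column names and rows
# below are illustrative assumptions, not taken from the project's data.
emails = pd.DataFrame(
    {
        "text": [
            "Win a free prize now, click here",
            "Meeting moved to 3pm tomorrow",
            "Claim your free cash reward today",
            "Can you review my pull request?",
        ],
        "label": ["spam", "ham", "spam", "ham"],
    }
)

# Drop rows with missing text and exact duplicate rows before modeling.
emails = emails.dropna(subset=["text"]).drop_duplicates().reset_index(drop=True)

# Encode the string labels as integers: ham -> 0, spam -> 1.
emails["is_spam"] = emails["label"].map({"ham": 0, "spam": 1})

# Count emails per class; spam datasets are often imbalanced.
class_counts = emails["label"].value_counts()
print(class_counts)
```

Checking the class counts early is useful because a heavily skewed dataset changes how an accuracy figure should be read.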

### Approach
The approach involved the following steps:
1. Imported the necessary libraries for data processing and model building.
2. Prepared the data: loaded the dataset, then cleaned and preprocessed it.
3. Extracted features with CountVectorizer to convert the text data into numerical features.
4. Trained the model using a Naive Bayes classifier.
5. Evaluated the model's performance using appropriate metrics.
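
The steps above can be sketched with scikit-learn as shown here. This is a minimal illustration under assumptions, not the notebook's actual code: the toy corpus, the 75/25 split, the `random_state`, and the choice of `MultinomialNB` as the Naive Bayes variant are all assumed for the example.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Toy corpus standing in for the real emails (illustrative assumption).
texts = [
    "Win a free prize now",
    "Claim your free prize today",
    "Free prize waiting, click here",
    "Urgent: free prize offer inside",
    "Meeting moved to 3pm tomorrow",
    "Can you review my report?",
    "Lunch at noon on Friday?",
    "Here are the notes from the call",
]
labels = ["spam", "spam", "spam", "spam", "ham", "ham", "ham", "ham"]

# Hold out a test set; stratify keeps the spam/ham ratio in both splits.
X_train, X_test, y_train, y_test = train_test_split(
    texts, labels, test_size=0.25, random_state=42, stratify=labels
)

# CountVectorizer turns each email into a vector of word counts, and
# MultinomialNB models those counts per class.
model = make_pipeline(CountVectorizer(), MultinomialNB())
model.fit(X_train, y_train)

# Accuracy on the held-out emails, plus a prediction for a new message.
test_accuracy = model.score(X_test, y_test)
prediction = model.predict(["free prize, claim now"])[0]
print(test_accuracy, prediction)
```

Wrapping the vectorizer and classifier in one pipeline means the vocabulary is learned from the training split only, so no information from the test emails leaks into feature extraction.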

### Results
The model achieved an accuracy of 99.19% on the test dataset, indicating its ability to classify emails accurately.
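
Accuracy is the share of test emails labeled correctly. When spam is the minority class, precision and recall for the spam class are worth reporting alongside it. The sketch below computes all three by hand so the definitions are visible; the `y_true` and `y_pred` lists are made-up values for illustration, not the project's predictions.

```python
# Hypothetical labels for ten test emails (1 = spam, 0 = ham).
# These values are invented for illustration only.
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 0, 0, 1]

pairs = list(zip(y_true, y_pred))

# Confusion-matrix counts, treating spam as the positive class.
tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # spam caught
fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # spam missed
fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # ham flagged as spam
tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # ham passed through

accuracy = (tp + tn) / len(pairs)   # fraction of all emails labeled correctly
precision = tp / (tp + fp)          # of emails flagged as spam, how many were spam
recall = tp / (tp + fn)             # of actual spam, how much was caught

print(f"accuracy={accuracy:.2f} precision={precision:.2f} recall={recall:.2f}")
```

In practice, scikit-learn's `sklearn.metrics` module provides `accuracy_score`, `precision_score`, and `recall_score` for the same calculations.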

### Technologies Used
Python, pandas, scikit-learn, Jupyter Notebook.

### Skills Demonstrated
Data preprocessing, feature extraction, classification modeling, model evaluation.