
https://github.com/hirudikaanupama/email-spam-detection-logistic-regression

This model can predict whether an email is spam or not. The logistic regression machine learning algorithm is used to train this model.

accuracy-score classification classification-report confusionmatrix data-visualization logistic-regression machine-learning roc-curve


# Email Spam Detection Using Logistic Regression


This project covers most of the theory behind logistic regression.

Using this theory, we can identify whether an email is spam or not.



## Introduction
- Training a machine learning model involves several steps.
- To make this logistic regression model accurate and fast, we follow steps such as:

##### _Data Collecting_
##### _Data Preprocessing_
##### _Data Analysis_
##### _Split the Data into training and testing_
##### _Evaluate the Model_
##### _Check model performance_
##### _Fine-tune the Model_

- The sections below introduce each of these steps; the code that corresponds to each step can be found in the project's code.
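As a rough illustration of how these steps fit together, here is a minimal sketch using scikit-learn. The feature values and labels are made up for this example and are not taken from the project's dataset, and the two features (link count and exclamation-mark count) are assumptions chosen only to keep the example small.

```python
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Hypothetical, hand-made data (illustration only).
# Each row is [number_of_links, number_of_exclamation_marks].
X = [[0, 0], [1, 0], [0, 1], [1, 1], [5, 4], [6, 3], [4, 6], [7, 5]]
y = [0, 0, 0, 0, 1, 1, 1, 1]  # 1 = spam, 0 = not spam

# Split the data into training and testing sets, keeping both classes in each.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42, stratify=y
)

# Train the logistic regression model on the training set only.
model = LogisticRegression()
model.fit(X_train, y_train)

# Evaluate the model on rows it has not seen during training.
predictions = model.predict(X_test)
accuracy = accuracy_score(y_test, predictions)
print(f"Accuracy on the held-out rows: {accuracy:.2f}")
```

With a real dataset, the same outline applies, but the preprocessing and analysis steps described below come before the split.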


### Data Collecting

- We must collect data that fits the problem we want to solve.
- Depending on the target variable (dependent variable / the variable we want to predict / y), we need to collect the other data (characteristics / independent variables / features).
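To make the distinction between target and features concrete, the sketch below loads a small table with pandas and names the two groups of columns. The CSV text and the column names `Category` and `Message` are hypothetical stand-ins for a collected dataset, not the project's actual file.

```python
import io

import pandas as pd

# Hypothetical CSV contents standing in for a collected dataset.
csv_text = """Category,Message
ham,Are we still meeting today?
spam,Congratulations! You won a free ticket
ham,Please review the attached report
spam,Claim your prize now
"""

data = pd.read_csv(io.StringIO(csv_text))

# The target is the column we want to predict (y).
target_column = "Category"

# Every other column is a feature (X).
feature_columns = [c for c in data.columns if c != target_column]

print(data.shape)
print(feature_columns)
```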

### Data Preprocessing

- After collecting the data, we need to clean it:

##### _Find missing data and fill it in_
##### _Drop duplicate data_
##### _Turn categorical data into numerical or Boolean values_
##### _Rename columns so they are easy to understand_
##### _Separate the target value and the features_

- We can use an encoding method or the dummy-variable method to convert categorical data into numerical or Boolean values.
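The cleaning steps above could look roughly like this in pandas. The column names (`v1`, `v2`, `category`, `message`, `is_spam`) and the rows are invented for illustration. Mapping labels to 0/1 and calling `pd.get_dummies` are just two possible ways to do the encoding, not necessarily the ones used in this project.

```python
import pandas as pd

# Hypothetical raw data with one duplicate row and one missing message.
raw = pd.DataFrame({
    "v1": ["ham", "spam", "ham", "ham", "spam"],
    "v2": [
        "See you at noon",
        "WIN a free prize now",
        "See you at noon",
        None,
        "Cheap loans, click here",
    ],
})

# Rename columns so they are easy to understand.
df = raw.rename(columns={"v1": "category", "v2": "message"})

# Find missing data and fill it in (here: an empty string for a missing message).
df["message"] = df["message"].fillna("")

# Drop duplicate rows.
df = df.drop_duplicates().reset_index(drop=True)

# Encoding method: map each category to a number.
df["is_spam"] = df["category"].map({"ham": 0, "spam": 1})

# Dummy method: one indicator column per category.
dummies = pd.get_dummies(df["category"])

# Separate the target value and the features.
X = df["message"]
y = df["is_spam"]
```

For a two-class label such as spam / not spam, a single 0/1 column is usually enough; dummy columns become more useful when a categorical feature has more than two values.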

### Data Analysis

- We can analyze the relationships between the target and the features using plots, graphs, etc.
- The following examples illustrate these relationships.