https://github.com/hirudikaanupama/email-spam-detection-logistic-regression
This model can predict whether an email is spam or not. The logistic regression machine learning algorithm is used to train this model.
accuracy-score classification classification-report confusionmatrix data-visualization logistic-regression machine-learning roc-curve
- Host: GitHub
- URL: https://github.com/hirudikaanupama/email-spam-detection-logistic-regression
- Owner: HirudikaAnupama
- Created: 2024-07-26T21:08:52.000Z (about 1 year ago)
- Default Branch: master
- Last Pushed: 2024-08-11T18:21:18.000Z (about 1 year ago)
- Last Synced: 2025-06-22T07:44:01.852Z (4 months ago)
- Topics: accuracy-score, classification, classification-report, confusionmatrix, data-visualization, logistic-regression, machine-learning, roc-curve
- Language: Jupyter Notebook
- Homepage:
- Size: 2.8 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Email Spam Detection Using Logistic Regression
This project covers most of the theory behind logistic regression.
Using that theory, we can identify whether an email is spam or not.
## Introduction
- Training a machine learning model involves several steps.
- To make this logistic regression model accurate and fast, we follow steps such as:
##### _Data Collecting_
##### _Data Preprocessing_
##### _Data Analysis_
##### _Split the Data into training and testing_
##### _Evaluate the Model_
##### _Check model performance_
##### _Fine-tune the Model_
- The sections below introduce each of these steps; the code related to each section can be found in the project's code.
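
As a rough illustration of how the steps above fit together, here is a minimal sketch using scikit-learn. It is not the project's actual code: the data is synthetic (generated with `make_classification` as a stand-in for real email features), and the variable names and the grid of `C` values are assumptions chosen for the example.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (
    accuracy_score,
    classification_report,
    confusion_matrix,
    roc_auc_score,
)
from sklearn.model_selection import GridSearchCV, train_test_split

# Assumption: synthetic numeric features stand in for real email data.
# y holds the labels (1 = spam, 0 = not spam).
X, y = make_classification(
    n_samples=500, n_features=10, n_informative=5, random_state=0
)

# Split the data into training and testing sets (80% / 20%).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Train the logistic regression model.
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)

# Evaluate the model on the held-out test set.
y_pred = model.predict(X_test)
y_prob = model.predict_proba(X_test)[:, 1]  # probability of the spam class

accuracy = accuracy_score(y_test, y_pred)
cm = confusion_matrix(y_test, y_pred)
auc = roc_auc_score(y_test, y_prob)

print("Accuracy:", accuracy)
print("Confusion matrix:\n", cm)
print("ROC AUC:", auc)
print(classification_report(y_test, y_pred))

# Fine-tune the model: search over the regularization strength C
# (the candidate values here are illustrative, not tuned for any dataset).
search = GridSearchCV(
    LogisticRegression(max_iter=1000),
    param_grid={"C": [0.01, 0.1, 1, 10]},
    cv=5,
    scoring="accuracy",
)
search.fit(X_train, y_train)
best_C = search.best_params_["C"]
print("Best C:", best_C)
```

With real emails, the synthetic `X` and `y` would be replaced by the features and target produced by the collecting and preprocessing steps.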
### Data Collecting
- We must collect the data that the problem requires.
- The target variable (the dependent variable, i.e. the variable we want to predict, y) determines which other data (characteristics / independent variables / features) we need to collect.
### Data Preprocessing
- After collecting the data, we need to clean it:
##### _Find missing data and fill it in_
##### _Drop duplicate data_
##### _Turn categorical data into numerical or Boolean values_
##### _Rename columns so they are easy to understand_
##### _Separate the target value from the features_
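
The cleaning steps listed above could look roughly like this in pandas. This is only a sketch: the toy table, its column names (`Category`, `Msg`, `sender_type`), and the `ham`/`spam` labels are hypothetical and are not taken from the project's dataset.

```python
import pandas as pd

# Hypothetical raw data: column names and values are made up for illustration.
raw = pd.DataFrame(
    {
        "Category": ["ham", "spam", "ham", "ham", "spam", "spam"],
        "Msg": [
            "See you at noon",
            "WIN a free prize now",
            "See you at noon",  # duplicate of the first row
            None,  # missing message text
            "Cheap loans, apply today",
            "WIN a free prize now",  # duplicate of the second row
        ],
        "sender_type": ["known", "unknown", "known", "known", "unknown", "unknown"],
    }
)

df = raw.copy()

# Find missing data and fill it in (here: an empty string for missing text).
df["Msg"] = df["Msg"].fillna("")

# Drop duplicate rows.
df = df.drop_duplicates().reset_index(drop=True)

# Rename columns so they are easy to understand.
df = df.rename(columns={"Category": "label", "Msg": "text"})

# Turn the categorical target into numbers: 1 = spam, 0 = not spam.
df["label"] = df["label"].map({"ham": 0, "spam": 1})

# Turn a categorical feature into dummy (indicator) columns.
df = pd.get_dummies(df, columns=["sender_type"], drop_first=True)

# Separate the target value from the features.
y = df["label"]
X = df.drop(columns=["label"])

print(X)
print(y)
```

The `text` column is still raw text at this point; it would need its own conversion to numeric features before a logistic regression model could be trained on it.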
- We can use an encoding method or the dummy-variable method to convert categorical data into numerical or Boolean values.
### Data Analysis
- We can analyze the relationships between the target and the features using plots, graphs, etc.
- We can identify these relationships through the following examples.
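
One simple way to look at the relationship between the target and the features, before drawing any plots, is to compare per-class averages and correlations. The sketch below uses a small made-up table; the feature names `num_links` and `num_exclamations` and all values are assumptions for illustration, not data from the project.

```python
import pandas as pd

# Hypothetical numeric features per email, plus the target (1 = spam).
emails = pd.DataFrame(
    {
        "num_links": [0, 5, 1, 7, 0, 6, 2, 8],
        "num_exclamations": [0, 4, 1, 6, 1, 5, 0, 7],
        "is_spam": [0, 1, 0, 1, 0, 1, 0, 1],
    }
)

# Average value of each feature for non-spam (0) and spam (1) emails.
class_means = emails.groupby("is_spam").mean()
print(class_means)

# Correlation of each feature with the target.
correlations = emails.corr()["is_spam"].drop("is_spam")
print(correlations)
```

The same tables can then be passed to a plotting library, for example as a bar chart of `class_means` or a heatmap of the correlation matrix.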