
https://github.com/udacity-machinelearning-internship/more-spam-classifying

Implementing more spam classifying using Ensemble Methods in python

classification jupyter jupyter-notebook machine-learning pandas python scikit-learn sklearn


![More_Spam_Classifying](https://github.com/BaraSedih11/More-Spam-Classifying/assets/98843912/53bfa311-f9dc-42aa-94d4-d1e6fa971b07)

![GitHub repo size](https://img.shields.io/github/repo-size/BaraSedih11/More-Spam-Classifying) ![GitHub repo file count (file type)](https://img.shields.io/github/directory-file-count/BaraSedih11/More-Spam-Classifying) [![Python Version](https://img.shields.io/badge/python-3.8-blue)](https://www.python.org/downloads/release/python-380/)
[![Pip Version](https://img.shields.io/badge/pip-21.0-orange)](https://pypi.org/project/pip/21.0/)
![GitHub last commit (branch)](https://img.shields.io/github/last-commit/BaraSedih11/More-Spam-Classifying/main)
[![Version](https://img.shields.io/badge/version-v1.0.0-blue)](https://github.com/BaraSedih11/Support-Vector-Machine/releases/tag/v1.0.0)
[![Contributors](https://img.shields.io/github/contributors/BaraSedih11/More-Spam-Classifying)](https://github.com/BaraSedih11/More-Spam-Classifying/graphs/contributors)
![GitHub pull requests](https://img.shields.io/github/issues-pr-raw/BaraSedih11/More-Spam-Classifying)


This repository contains an implementation of spam classification using ensemble methods in Python.

## Overview

### Ensemble Methods

Ensemble methods combine several models in order to optimize for both bias and variance. They have become some of the most popular methods in Kaggle competitions and are used in industry across many applications.

Two randomization techniques are used to combat overfitting:

* Bootstrap the data - that is, sample the data with replacement and fit your algorithm to the sampled data.
* Subset the features - in each split of a decision tree, or with each algorithm used in an ensemble, only a subset of the total possible features is used.
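
The two techniques can be sketched with NumPy alone. This is a minimal illustration, not the notebook's code: the toy array and the helper names `bootstrap_sample` and `feature_subset` are hypothetical, and the feature subset here is drawn once per model, whereas a random forest draws a fresh subset at every split.

```python
import numpy as np

rng = np.random.default_rng(seed=0)  # fixed seed so the sketch is repeatable

# Hypothetical toy data: 6 samples, 4 features (not the notebook's dataset).
X = np.arange(24).reshape(6, 4)
y = np.array([0, 1, 0, 1, 1, 0])


def bootstrap_sample(X, y, rng):
    """Draw len(X) rows with replacement: some rows repeat, others are left out."""
    idx = rng.integers(0, len(X), size=len(X))
    return X[idx], y[idx]


def feature_subset(X, n_features, rng):
    """Keep n_features distinct columns chosen at random (without replacement)."""
    cols = rng.choice(X.shape[1], size=n_features, replace=False)
    return X[:, cols], cols


X_boot, y_boot = bootstrap_sample(X, y, rng)
X_sub, cols = feature_subset(X_boot, n_features=2, rng=rng)

print(X_boot.shape)         # same number of rows and columns as X
print(X_sub.shape)          # same rows, but only 2 of the 4 columns
print(sorted(cols.tolist()))  # which columns were kept
```

Each model in an ensemble would be fit on its own `X_sub`, so the models see different rows and different features and their errors are less likely to coincide.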

## Contents

- `Spam_&_Ensembles.ipynb`: Jupyter Notebook containing the implementation of ensemble methods for spam classification using Python.
- `README.md`: This file, providing an overview of the repository.

## Requirements

To run the code in the Jupyter Notebook, you need to have Python installed on your system along with the following libraries:

- NumPy
- pandas
- scikit-learn

You can install these libraries using pip:

```bash
pip install numpy pandas scikit-learn
```

## Usage

1. Clone this repository to your local machine:

```bash
git clone https://github.com/BaraSedih11/More-Spam-Classifying.git
```

2. Navigate to the repository directory:

```bash
cd More-Spam-Classifying
```

3. Open and run the Jupyter Notebook `Spam_&_Ensembles.ipynb` using Jupyter Notebook or JupyterLab.

4. Follow along with the code and comments in the notebook to understand how ensemble methods are implemented using Python.
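
For orientation before opening the notebook, here is a rough sketch of how ensemble classifiers from scikit-learn might be applied to spam detection. It rests on assumptions: the messages and labels are made up for illustration, and the choice of `BaggingClassifier`, `RandomForestClassifier`, and `AdaBoostClassifier` with these settings is not taken from the notebook.

```python
from sklearn.ensemble import (
    AdaBoostClassifier,
    BaggingClassifier,
    RandomForestClassifier,
)
from sklearn.feature_extraction.text import CountVectorizer

# Hypothetical toy messages, made up for illustration (not the notebook's data).
messages = [
    "win a free prize now",
    "free cash click now",
    "claim your free prize",
    "urgent winner claim cash",
    "are we still meeting for lunch",
    "see you at the office tomorrow",
    "can you send me the report",
    "lunch tomorrow sounds good",
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]  # 1 = spam, 0 = not spam

# Turn each message into a vector of word counts.
vectorizer = CountVectorizer()
X = vectorizer.fit_transform(messages)

# Three ensemble classifiers; the settings are illustrative assumptions.
models = {
    "bagging": BaggingClassifier(n_estimators=50, random_state=0),
    "random_forest": RandomForestClassifier(n_estimators=50, random_state=0),
    "adaboost": AdaBoostClassifier(n_estimators=50, random_state=0),
}

# Unseen messages must go through the same fitted vectorizer.
new_messages = ["free prize waiting", "report for the meeting"]
X_new = vectorizer.transform(new_messages)

predictions = {}
for name, model in models.items():
    model.fit(X, labels)
    predictions[name] = model.predict(X_new)
    print(name, predictions[name])
```

With only eight training messages the predictions carry no weight; the point is the workflow of vectorizing text, fitting several ensembles on the same features, and comparing their outputs.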

## Acknowledgements

- [scikit-learn](https://scikit-learn.org/): The scikit-learn library for machine learning in Python.
- [NumPy](https://numpy.org/): The NumPy library for numerical computing in Python.
- [pandas](https://pandas.pydata.org/): The pandas library for data manipulation and analysis in Python.