https://github.com/kiooku/anti-phishing

Phishing website detection using random forest

# Anti-Phishing

This is *coursework* for a *big data analytics* class. It **identifies whether a website is legitimate or a phishing site** using a **random forest**.

The aim was to learn as much as possible about supervised machine learning and, in the end, to produce a Jupyter notebook on a topic of our choice *(phishing detection in my case)*.

## Overview

The coursework lives in a Jupyter notebook *(`coursework_phishing_website_detection.ipynb`)*, which is fully reproducible and explains my reasoning step by step.

There are several stages in this coursework:
1. Research & Data Exploration
    1. Dataset presentation
    2. Related Work & Data Exploration
    3. Data Pre-processing
2. Modelling / Classification
3. Solution Improvement

Keywords:
- Random Forest Classification
- Gradient Boosted Trees
- Cross-validation
- Randomized Search
- Grid Search
- Fully Homomorphic Encryption Machine Learning
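
To make a few of these keywords concrete, here is a minimal sketch of a random forest evaluated with cross-validation and tuned with a randomized search. It is not the notebook's code: it assumes scikit-learn is installed, uses a synthetic dataset from `make_classification` as a stand-in for the real phishing data, and the hyperparameter ranges are purely illustrative.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import (
    RandomizedSearchCV,
    cross_val_score,
    train_test_split,
)

# Synthetic stand-in for a phishing dataset (assumption): 1 = phishing, 0 = legitimate.
X, y = make_classification(
    n_samples=400, n_features=12, n_informative=6, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Baseline random forest scored with 5-fold cross-validation.
baseline = RandomForestClassifier(n_estimators=50, random_state=0)
cv_scores = cross_val_score(baseline, X_train, y_train, cv=5, scoring="accuracy")
print(f"Baseline CV accuracy: {cv_scores.mean():.3f}")

# Randomized search over a small, illustrative hyperparameter space.
param_distributions = {
    "n_estimators": [25, 50, 100],
    "max_depth": [None, 5, 10],
    "min_samples_leaf": [1, 2, 4],
}
search = RandomizedSearchCV(
    RandomForestClassifier(random_state=0),
    param_distributions=param_distributions,
    n_iter=5,
    cv=3,
    scoring="accuracy",
    random_state=0,
)
search.fit(X_train, y_train)

# The best estimator is refit on the full training split, then scored on held-out data.
test_accuracy = search.score(X_test, y_test)
print("Best parameters:", search.best_params_)
print(f"Held-out accuracy: {test_accuracy:.3f}")
```

Swapping `RandomizedSearchCV` for `GridSearchCV` (with `param_grid` instead of `param_distributions` and no `n_iter`) would try every combination instead of a random sample, which is slower but exhaustive.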

As a bonus, I decided to create a [Streamlit](https://streamlit.io/) application to simulate a real-world implementation of an anti-phishing solution based on machine learning.

> **Note** To run the Streamlit app, which lets you check whether a URL belongs to a phishing or a legitimate website, use the following command: `streamlit run phishing_website_detection_app.py`

https://github.com/Kiooku/Anti-Phishing/assets/33032066/48abf84a-85a0-4dd1-91cd-63c71e5f55b7
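
Because the app classifies a website from its URL, a model of this kind needs numeric or boolean features derived from that URL. The sketch below shows one possible way to compute a few simple lexical features with the Python standard library. The function name `extract_url_features` and the feature set are hypothetical and are not taken from the notebook or the app.

```python
import re
from urllib.parse import urlparse


def extract_url_features(url: str) -> dict:
    """Return a few simple lexical features of a URL (hypothetical feature set)."""
    parsed = urlparse(url)
    host = parsed.hostname or ""
    return {
        "url_length": len(url),
        "num_dots_in_host": host.count("."),
        "num_hyphens_in_host": host.count("-"),
        "has_at_symbol": "@" in url,
        "uses_https": parsed.scheme == "https",
        # Rough check for a dotted IPv4 literal used in place of a domain name.
        "host_is_ip": bool(re.fullmatch(r"\d{1,3}(\.\d{1,3}){3}", host)),
    }


suspicious = extract_url_features("http://192.168.0.1/login@secure-update")
ordinary = extract_url_features("https://github.com/kiooku/anti-phishing")
print(suspicious)
print(ordinary)
```

A dictionary like this could be turned into a feature row and passed to a trained classifier. Which features actually help is an empirical question that the notebook's data exploration stage is meant to answer.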