https://github.com/arnavk-09/phishing-detection
🎣 Detect Phishing URLs with Data Pre-fitted... API & Web UI
- Host: GitHub
- URL: https://github.com/arnavk-09/phishing-detection
- Owner: ArnavK-09
- Created: 2023-07-24T09:25:23.000Z (over 1 year ago)
- Default Branch: with/flask
- Last Pushed: 2023-08-12T16:45:01.000Z (over 1 year ago)
- Last Synced: 2023-09-04T09:47:45.280Z (over 1 year ago)
- Topics: csv, data, fastapi, flask, python, scikit-learn
- Language: HTML
- Homepage: https://phishing-site-1q5k.onrender.com
- Size: 19.3 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# **Phishing URL Detection Website**
![Python](https://img.shields.io/badge/Python-3.8%2B-blue.svg)
![Flask](https://img.shields.io/badge/Flask-2.1%2B-red.svg)
![License](https://img.shields.io/github/license/ArnavK-09/phishing-detection)
## 📚 Overview
This repository contains a Flask-based web API and website that help determine whether a URL is bad (potentially phishing) or good (not malicious). The API uses a Machine Learning model trained on a dataset of over 4,000 URLs, each labeled "bad" or "good".
The model is based on a Logistic Regression classifier using the Term Frequency-Inverse Document Frequency (TF-IDF) vectorizer for text representation. It has been pre-trained on the provided dataset of URLs and can quickly classify new URLs.
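As a rough illustration of the approach described above, the sketch below chains scikit-learn's `TfidfVectorizer` and `LogisticRegression` in a pipeline. It is not the repository's `model.py`: the four toy URLs, the default word-level tokenizer, and the `max_iter` value are all assumptions made for the example.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy rows invented for illustration only; the real project trains on
# its own dataset of labeled URLs.
urls = [
    "https://example.com/docs/getting-started",
    "https://example.org/blog/release-notes",
    "http://login-verify.example.net/account/update/password",
    "http://secure-update.example.net/signin/confirm/billing",
]
labels = ["good", "good", "bad", "bad"]

# TF-IDF turns each URL into a weighted bag of tokens; logistic
# regression then learns a linear boundary between the two labels.
# The default tokenizer is an assumption; the project may tokenize differently.
model = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
model.fit(urls, labels)

# Classify an unseen URL; the result is one of the training labels.
prediction = str(model.predict(["http://example.net/account/verify"])[0])
print(prediction)
```

With only four training rows the prediction itself means little; the point is the shape of the pipeline, which stays the same when it is fitted on thousands of rows.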
## 🚚 Installation
1. Clone this repository to your local machine:
```bash
git clone https://github.com/ArnavK-09/phishing-detection.git
cd phishing-detection
```
2. Install the required dependencies using `pip`:
```bash
pip install flask pandas scikit-learn gunicorn
```
## 🖥️ Using Website or API
1. Prepare your dataset:
- Ensure you have a CSV file at `data/main.csv` containing the labeled URLs used to prepare the model.
- The CSV file should have two columns: `URL` (the URL itself) and `Label` (either "bad" or "good", indicating the classification).
2. Prepare the model with the dataset:
```bash
python3 model.py
```
3. Start the Flask server:
```bash
gunicorn app:app
```
4. Access the Web Interface:
- Open your web browser or a tool like Postman.
- Go to `http://localhost:8000` to view the API introduction and server information.
5. Check URLs for phishing via the API:
- To check if a specific URL is bad or good, use the `/checkurl` endpoint with the `url` parameter:
- **Request:**
```
GET http://localhost:8000/checkurl?url=https://example.com
```
- **Response:**
```
{
"url": "https://example.com",
"result": "safe"
}
```
The `result` field holds the classification and is either "safe" or "harmful", corresponding to the "good" and "bad" labels in the dataset.
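The request in step 5 can also be made from a script. The following is a minimal client sketch using only the Python standard library; it assumes the server is running at `http://localhost:8000` as in the steps above, and the helper names `build_check_url` and `check_url` are hypothetical, not part of the project.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: the server from step 3 is listening on this address.
BASE_URL = "http://localhost:8000"


def build_check_url(target, base_url=BASE_URL):
    """Build the /checkurl request URL, percent-encoding the target URL."""
    return f"{base_url}/checkurl?{urlencode({'url': target})}"


def check_url(target, base_url=BASE_URL, timeout=5.0):
    """Send the GET request and return the decoded JSON response as a dict."""
    with urlopen(build_check_url(target, base_url), timeout=timeout) as resp:
        return json.load(resp)
```

With the server running, `check_url("https://example.com")` should return a dictionary shaped like the response shown above. Percent-encoding the `url` parameter matters when the target URL carries its own query string, since an unencoded `&` would otherwise be read as a separate parameter.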
## ⚡ Key Points
- 🚀 - Exciting features and blazing-fast performance.
- 💡 - Insightful explanations and helpful tips.
- 📝 - Clear and concise code blocks.
- ⚔️ - API plus Website UI
- ✨ - Beautiful Materialize UI for Website
- 📦 - Simple installation and setup instructions.
- 🤖 - Smart Machine Learning model behind the scenes.
- 🔒 - Improved security with URL classification.
## 📃 License
This project is released under the Unlicense; see the [LICENSE](LICENSE) file for details.
---
> **For the FastAPI version, [click here](https://github.com/ArnavK-09/phishing-detection/tree/with/fastapi).**