https://github.com/vimal007vimal/malicious-url-detection

Our project employs machine learning to pinpoint phishing URLs with 97.4% accuracy, leveraging HTTPS and website traffic as critical indicators. Insights into features like AnchorURL enhance cybersecurity strategies, showcasing the power of AI in combating online threats.
cybersecurity https malicious-url-detection phishing python python3 xgboost-algorithm xgboost-classifier

# Malicious URL Detection
![image](https://github.com/Vimal007Vimal/Malicious-URL-Detection/assets/144089192/8f4cbfc2-e19a-4a17-a9e6-a1f38f320164)
![image](https://github.com/Vimal007Vimal/Malicious-URL-Detection/assets/144089192/c1c36981-bd17-46c8-85e4-0b478761b28c)

## Installation
The code is written in Python 3.9. If you don't have Python installed, you can find it [here](https://www.python.org/downloads/). If you are using an older version of Python, upgrade it and make sure you have the latest version of pip. To install the required packages and libraries, run this command in the project directory after [cloning](https://www.howtogeek.com/451360/how-to-clone-a-github-repository/) the repository:
```bash
pip install -r requirements.txt
```

## Directory Tree
```
├── static
│   └── styles.css
├── templates
│   └── index.html
├── README.md
├── app.py
├── feature.py
├── phishing.csv
└── requirements.txt
```

## Technologies Used

[NumPy](https://numpy.org/doc/) [pandas](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.html)
[Matplotlib](https://matplotlib.org/)
[scikit-learn](https://scikit-learn.org/stable/)
[Flask](https://flask.palletsprojects.com/en/2.0.x/)

## Result

Accuracy of the various models used for URL detection


| | ML Model | Accuracy | f1_score | Recall | Precision |
|---|---|---|---|---|---|
| 0 | Gradient Boosting Classifier | 0.974 | 0.977 | 0.994 | 0.986 |
| 1 | CatBoost Classifier | 0.972 | 0.975 | 0.994 | 0.989 |
| 2 | XGBoost Classifier | 0.969 | 0.973 | 0.993 | 0.984 |
| 3 | Multi-layer Perceptron | 0.969 | 0.973 | 0.995 | 0.981 |
| 4 | Random Forest | 0.967 | 0.971 | 0.993 | 0.990 |
| 5 | Support Vector Machine | 0.964 | 0.968 | 0.980 | 0.965 |
| 6 | Decision Tree | 0.960 | 0.964 | 0.991 | 0.993 |
| 7 | K-Nearest Neighbors | 0.956 | 0.961 | 0.991 | 0.989 |
| 8 | Logistic Regression | 0.934 | 0.941 | 0.943 | 0.927 |
| 9 | Naive Bayes Classifier | 0.605 | 0.454 | 0.292 | 0.997 |
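
For readers unfamiliar with the column names in the table, the following is a minimal sketch of how accuracy, precision, recall, and F1 are usually defined for binary labels. The function name `classification_metrics`, the toy labels, and the 1 / -1 label encoding are assumptions made for illustration; this README does not show how the project computes its scores.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall and F1 for binary labels."""
    pairs = list(zip(y_true, y_pred))
    # Count the confusion-matrix cells relative to the chosen positive label.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1_score": f1,
    }


# Toy labels invented for illustration (assumed encoding: 1 and -1).
y_true = [1, 1, 1, 1, -1, -1, -1, -1]
y_pred = [1, 1, 1, -1, -1, -1, -1, 1]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

Precision and recall can diverge sharply, as the Naive Bayes row shows: a model that rarely predicts the positive class can be almost always right when it does (high precision) while missing most positives (low recall).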

Feature importance for Malicious URL Detection



![image](https://user-images.githubusercontent.com/79131292/144603941-19044aae-7d7b-4e9a-88a8-6adfd8626f77.png)
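
Importance scores of the kind plotted above can be read from a fitted tree ensemble in scikit-learn. The sketch below is only an illustration under stated assumptions: it trains on synthetic data from `make_classification` rather than on `phishing.csv`, the feature names are placeholders, and the hyperparameters are library defaults, not the project's settings.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (assumption); the project itself uses phishing.csv.
feature_names = [f"feature_{i}" for i in range(8)]
X, y = make_classification(
    n_samples=500, n_features=8, n_informative=3, n_redundant=0, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Fit a gradient boosting model with default hyperparameters.
model = GradientBoostingClassifier(random_state=0)
model.fit(X_train, y_train)
test_accuracy = accuracy_score(y_test, model.predict(X_test))

# Rank features by the impurity-based importance stored on the fitted model.
ranked = sorted(
    zip(feature_names, model.feature_importances_),
    key=lambda pair: pair[1],
    reverse=True,
)
print(f"test accuracy: {test_accuracy:.3f}")
for name, score in ranked:
    print(f"{name}: {score:.3f}")
```

Impurity-based importances are normalized to sum to 1, so each score reads as a share of the total. They describe how the fitted model uses each feature, not a causal effect.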

The Gradient Boosting Classifier correctly classifies URLs into their respective classes with up to 97.4% accuracy, and hence reduces the chance of malicious attachments.
The final conclusion on the malicious URL dataset is that some features, such as "HTTPS", "AnchorURL", and "WebsiteTraffic", carry more importance in classifying whether a URL is malicious or not.
The final takeaway from this project is exploring various machine learning models, performing Exploratory Data Analysis on the malicious URL dataset, and understanding its features.
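
To make the idea of a URL feature concrete, here is a minimal sketch of how an HTTPS indicator could be derived from a raw URL with the standard library. The function name `https_feature` and the 1 / -1 encoding are hypothetical; the actual logic in `feature.py` is not shown in this README and may check more than the scheme (for example, certificate details).

```python
from urllib.parse import urlparse


def https_feature(url: str) -> int:
    """Hypothetical encoding: 1 if the URL uses HTTPS, -1 otherwise."""
    # urlparse returns the scheme in lowercase; lower() is kept as a safeguard.
    scheme = urlparse(url).scheme.lower()
    return 1 if scheme == "https" else -1


print(https_feature("https://example.com/login"))
print(https_feature("http://example.com/login"))
```

A scheme check alone is a weak signal, since phishing sites can also serve HTTPS, which is one reason a classifier combines it with other features.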