Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

https://github.com/kmohamedalie/phishing-websites

Detecting suspicious websites using machine learning with an accuracy of 97.01%

Host: GitHub
URL: https://github.com/kmohamedalie/phishing-websites
Owner: Kmohamedalie
License: mit
Created: 2023-08-15T18:38:57.000Z (over 1 year ago)
Default Branch: master
Last Pushed: 2023-08-20T10:59:48.000Z (about 1 year ago)
Last Synced: 2024-01-27T02:10:24.241Z (10 months ago)
Topics: classification, computer-science, cybersecurity, hacking, machine-learning, phising, random-forest, support-vector-machines
Language: Jupyter Notebook
Homepage: https://github.com/Kmohamedalie/Phishing-Websites
Size: 1010 KB
Stars: 0
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE

README

![image](https://github.com/Kmohamedalie/Phishing-Websites/assets/63104472/505ed2f7-09d6-45c2-aca8-582564bd2c15)

### **Complete Jupyter Notebook:** [Link](https://github.com/Kmohamedalie/Phishing-Websites/tree/master/Notebook)

### **Metrics:**

| Algorithm | Precision | Recall | F1-score | Accuracy |
|-----------|-----------|--------|----------|----------|
| XGBoost | 97.01% | 97.01% | 97.01% | 97.01% |
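
The four metrics in the table can be computed from a list of true labels and a list of predicted labels. The sketch below is a minimal, dependency-free illustration and not the notebook's actual evaluation code: the function name `micro_metrics` and the toy labels are hypothetical. It uses micro-averaging, under which precision, recall, F1-score, and accuracy coincide for single-label classification. That is one possible reason the four columns show the same value, but whether the notebook averaged this way is an assumption.

```python
def micro_metrics(y_true, y_pred):
    """Return micro-averaged (precision, recall, f1, accuracy).

    Hypothetical helper for illustration; not taken from the repository.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one sample is required")

    labels = set(y_true) | set(y_pred)
    tp = fp = fn = 0
    # Pool true positives, false positives and false negatives over all labels.
    for label in labels:
        for t, p in zip(y_true, y_pred):
            if p == label and t == label:
                tp += 1
            elif p == label and t != label:
                fp += 1
            elif p != label and t == label:
                fn += 1

    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    if tp == 0:
        # No correct predictions: every metric is zero.
        return 0.0, 0.0, 0.0, accuracy

    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1, accuracy


# Toy labels (made up): 1 = legitimate, -1 = phishing.
y_true = [1, 1, 1, -1, -1, -1, 1, -1]
y_pred = [1, 1, -1, -1, -1, 1, 1, -1]
precision, recall, f1, accuracy = micro_metrics(y_true, y_pred)
print(f"precision={precision:.2%} recall={recall:.2%} "
      f"f1={f1:.2%} accuracy={accuracy:.2%}")
```

With six of the eight toy predictions correct, all four values come out at 75%. Per-class or macro-averaged metrics would generally differ from one another, so reporting the averaging mode alongside the table makes the numbers easier to interpret.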

### **Additional Information about the dataset**
Creators: Rami Mohammad, Lee McCluskey

This dataset was collected mainly from the PhishTank archive, the MillerSmiles archive, and Google's search operators.

One of the challenges our research faced was the unavailability of reliable training datasets; in fact, this challenge faces any researcher in the field. Although plenty of articles about predicting phishing websites have been published, no reliable training dataset has been made publicly available, perhaps because there is no agreement in the literature on the definitive features that characterize phishing webpages. It is therefore difficult to shape a dataset that covers all possible features.
In this dataset, we shed light on the important features that have proved to be sound and effective in predicting phishing websites. In addition, we propose some new features.