https://github.com/sadiyabhokare/malicious_input_classifier
Malicious Input Classifier for Web Forms
joblib pandas python sklearn streamlit
- Host: GitHub
- URL: https://github.com/sadiyabhokare/malicious_input_classifier
- Owner: sadiyabhokare
- Created: 2025-07-01T23:46:12.000Z (12 months ago)
- Default Branch: main
- Last Pushed: 2025-07-04T05:36:16.000Z (12 months ago)
- Last Synced: 2025-07-14T10:38:16.315Z (11 months ago)
- Topics: joblib, pandas, python, sklearn, streamlit
- Language: Python
- Homepage:
- Size: 207 KB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
# Malicious Input Classifier for Web Forms
A machine learning-powered application that classifies user-submitted form inputs as:
- Benign
- SQL Injection (SQLi)
- Cross-Site Scripting (XSS)
Built with **scikit-learn** and deployed via **Streamlit**, this project demonstrates a lightweight, real-time defense layer against common web input attacks.
---
## Problem Statement
Web applications often receive malicious inputs through form fields such as:
- Login forms
- Comment boxes
- Search fields
These inputs can result in:
- Data breaches
- Account takeovers
- Website defacement
This project aims to mitigate such threats with a machine learning classifier that labels each input as **SQLi**, **XSS**, or **Benign** in real time.
---
## Objectives
- Analyze and extract key features from form inputs.
- Train a machine learning model to detect attack types.
- Deploy a web-based dashboard for real-time and bulk prediction.
- Block potentially malicious inputs before they reach backend processing.
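
As a rough illustration of the first objective, the sketch below turns a raw form input into a few numeric features. The feature names, keyword list, and regular expression are assumptions made for this example; the project's actual `utils/feature_extraction.py` is not shown in this README and may differ.

```python
import re

# Hypothetical keyword list and XSS pattern, chosen only for illustration.
SQL_KEYWORDS = ("select", "union", "insert", "drop", "or", "and")
XSS_PATTERN = re.compile(r"<\s*script|javascript:|on\w+\s*=|alert\s*\(", re.IGNORECASE)


def extract_features(text: str) -> dict[str, int]:
    """Map one form input to simple integer features (assumed feature set)."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return {
        "length": len(text),
        "quote_count": text.count("'") + text.count('"'),
        "special_count": sum(text.count(ch) for ch in "<>=;()-"),
        "sql_keyword_count": sum(tokens.count(kw) for kw in SQL_KEYWORDS),
        "has_xss_pattern": int(bool(XSS_PATTERN.search(text))),
    }


for sample in ("' OR '1'='1", "alert('xss')", "Hello! Great work!"):
    print(sample, extract_features(sample))
```

Counts like these are cheap to compute per request, but words such as "or" also appear in ordinary text, which is one reason to leave the final decision to a trained model rather than to a single feature.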
---
## Folder Structure
```
MaliciousInputClassifier/
├── app.py
├── train.py
├── sample_inputs.csv
├── requirements.txt
├── architecture.png
├── model/
│   ├── rf_model.pkl
│   └── label_encoder.pkl
└── utils/
    └── feature_extraction.py
```
---
## Sample Predictions
| Input                | Prediction |
| -------------------- | ---------- |
| `' OR '1'='1`        | SQLI       |
| `alert('xss')`       | XSS        |
| `Hello! Great work!` | Benign     |
---
## Tech Stack
| Layer    | Technology                    |
| -------- | ----------------------------- |
| ML Model | `scikit-learn` (RandomForest) |
| Data     | `pandas`, `.csv` files        |
| Storage  | `joblib` model dumping        |
| UI       | `Streamlit`                   |
---
## How to Run Locally
[Download the latest release here](https://github.com/sadiyabhokare/Malicious_Input_Classifier/releases)
### 1. Clone the repository
```bash
git clone https://github.com/sadiyabhokare/malicious_input_classifier.git
cd malicious_input_classifier
```
### 2. Install dependencies
```bash
pip install -r requirements.txt
```
### 3. Train the model
```bash
python train.py
```
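
For orientation, here is a minimal sketch of what a training step of this kind might look like: fit a `RandomForestClassifier`, encode the class labels with `LabelEncoder`, and persist both with `joblib`. The toy feature function, the six in-line samples, and the temporary output directory are assumptions for illustration only; the actual `train.py` and its dataset are not shown in this README.

```python
import os
import tempfile

import joblib
from sklearn.ensemble import RandomForestClassifier
from sklearn.preprocessing import LabelEncoder


def to_vector(text: str) -> list[int]:
    """Toy numeric features (hypothetical, not the project's feature set)."""
    lowered = text.lower()
    return [
        len(text),
        text.count("'"),
        int("<" in text or "alert(" in lowered),
        int(" or " in lowered or "union" in lowered),
    ]


# Tiny illustrative dataset; a real run would read a labelled CSV instead.
samples = [
    ("' OR '1'='1", "sqli"),
    ("1 UNION SELECT password FROM users", "sqli"),
    ("<script>alert('xss')</script>", "xss"),
    ("<img src=x onerror=alert(1)>", "xss"),
    ("Hello! Great work!", "benign"),
    ("Please reset my password", "benign"),
]
X = [to_vector(text) for text, _ in samples]
encoder = LabelEncoder()
y = encoder.fit_transform([label for _, label in samples])

model = RandomForestClassifier(n_estimators=50, random_state=0)
model.fit(X, y)

# Persist and reload both artifacts, mirroring the two files under model/.
with tempfile.TemporaryDirectory() as tmp:
    model_path = os.path.join(tmp, "rf_model.pkl")
    encoder_path = os.path.join(tmp, "label_encoder.pkl")
    joblib.dump(model, model_path)
    joblib.dump(encoder, encoder_path)
    loaded_model = joblib.load(model_path)
    loaded_encoder = joblib.load(encoder_path)

predicted = loaded_encoder.inverse_transform(
    loaded_model.predict([to_vector("Nice article")])
)
print(predicted[0])
```

Saving the encoder alongside the model keeps the mapping from integer classes back to readable labels consistent between training and the app.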
### 4. Run the app
```bash
streamlit run app.py
```
---
## System Architecture
This project follows a clean and modular architecture that separates UI, feature extraction, model inference, and output presentation.
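
The separation described above can be sketched as four small stages wired together. Everything in this example is hypothetical: the function names are invented, and a rule-based stand-in replaces the trained RandomForest so the snippet runs without the model files.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Prediction:
    text: str
    label: str


def extract_features(text: str) -> dict[str, bool]:
    """Feature extraction stage (toy features, assumed for illustration)."""
    lowered = text.lower()
    return {
        "sql_like": "'" in text and ("=" in text or " or " in lowered),
        "script_like": "<script" in lowered or "alert(" in lowered,
    }


def rule_based_model(features: dict[str, bool]) -> str:
    """Inference stage: a stand-in for the trained model, not the RandomForest."""
    if features["script_like"]:
        return "XSS"
    if features["sql_like"]:
        return "SQLi"
    return "Benign"


def classify(
    texts: list[str], model: Callable[[dict[str, bool]], str]
) -> list[Prediction]:
    """Glue between the input layer and the model; handles single or bulk input."""
    return [Prediction(text, model(extract_features(text))) for text in texts]


def render(predictions: list[Prediction]) -> str:
    """Output presentation stage: one line per input."""
    return "\n".join(f"{p.label:<7} {p.text}" for p in predictions)


results = classify(["' OR '1'='1", "alert('xss')", "Hello! Great work!"], rule_based_model)
print(render(results))
```

Because `classify` only depends on a callable, the stand-in could be swapped for a wrapper around a loaded model without touching the input or presentation code.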

---
## Demo Video
[Click here to watch the demo video](https://drive.google.com/file/d/1tz-SLwx7bo42ai8T1YXPviKZwky7RfuX/view?usp=sharing)
---
## Team Members & Contributions
| Name                                                  | Role and Contributions                                       |
|-------------------------------------------------------|--------------------------------------------------------------|
| [Rabiya Gavandi](https://github.com/Rabiya786-hash)   | ML Model Design, Feature Engineering, Model Training         |
| [Saniya Kalawant](https://github.com/SaniyaKalawant)  | Frontend Development using Streamlit, UI Design, Input Modes |
| [Sadiya Bhokare](https://github.com/sadiyabhokare)    | Integration, Testing, Deployment Setup, Documentation, Report |