# 🛡️ Malicious Input Classifier for Web Forms

A machine learning-powered application that classifies user-submitted form inputs as:
- 🟢 Benign
- 🔴 SQL Injection (SQLi)
- 🟠 Cross-Site Scripting (XSS)

Built with **scikit-learn** and deployed via **Streamlit**, this project demonstrates a lightweight, real-time defense layer against common web input attacks.

---

## 📌 Problem Statement

Web applications often receive malicious inputs through form fields such as:
- Login forms
- Comment boxes
- Search fields

These inputs can result in:
- 🛑 Data breaches
- 🔐 Account takeovers
- 🖼️ Website defacement

This project aims to mitigate such threats with a machine learning classifier that labels each input as **SQLi**, **XSS**, or **Benign** in real time.

---

## 🎯 Objectives

- 🔎 Analyze form inputs and extract key features from them.
- 🧠 Train a machine learning model to detect attack types.
- 🖥️ Deploy a web-based dashboard for real-time and bulk prediction.
- 🚫 Block potentially malicious inputs before they reach backend processing.
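
As a minimal sketch of the first objective, the snippet below derives a few numeric features from a raw form input. The function name `extract_features`, the token lists, and the choice of features are assumptions made for illustration; the project's `utils/feature_extraction.py` may compute different ones.

```python
import re

# Hypothetical token lists; the real project may use different ones.
SQL_KEYWORDS = ("select", "union", "insert", "drop", "or", "and")
XSS_TOKENS = ("<script", "onerror", "onload", "javascript:", "alert(")


def extract_features(text: str) -> dict:
    """Turn one form input into a small dictionary of numeric features."""
    lowered = text.lower()
    # Collect alphabetic words so SQL keywords only match as whole words
    # (the "or" inside "work" is not counted).
    words = re.findall(r"[a-z_]+", lowered)
    return {
        "length": len(text),
        "special_chars": sum(
            1 for ch in text if not ch.isalnum() and not ch.isspace()
        ),
        "quotes": text.count("'") + text.count('"'),
        "sql_keywords": sum(words.count(kw) for kw in SQL_KEYWORDS),
        "xss_tokens": sum(lowered.count(tok) for tok in XSS_TOKENS),
    }


if __name__ == "__main__":
    for sample in ["' OR '1'='1", "alert('xss')", "Hello! Great work!"]:
        print(sample, "->", extract_features(sample))
```

Feature dictionaries like these can be turned into rows of a `pandas` DataFrame and passed to a classifier.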

---

## 📂 Folder Structure

```
MaliciousInputClassifier/
├── app.py
├── train.py
├── sample_inputs.csv
├── requirements.txt
├── architecture.png
├── model/
│   ├── rf_model.pkl
│   └── label_encoder.pkl
└── utils/
    └── feature_extraction.py
```

---

## 🧪 Sample Predictions

| Input                | Prediction |
| -------------------- | ---------- |
| `' OR '1'='1`        | 🔴 SQLI    |
| `alert('xss')`       | 🟠 XSS     |
| `Hello! Great work!` | 🟢 Benign  |

---

## 🛠️ Tech Stack

| Layer       | Technology                    |
| ----------- | ----------------------------- |
| 🧠 ML Model | `scikit-learn` (RandomForest) |
| 📊 Data     | `pandas`, `.csv` files        |
| 💾 Storage  | `joblib` model dumping        |
| 🎯 UI       | `Streamlit`                   |

---

## 🚀 How to Run Locally

📦 [Download the latest release here](https://github.com/sadiyabhokare/Malicious_Input_Classifier/releases)

### 1. Clone the repository

```bash
git clone https://github.com/sadiyabhokare/Malicious_Input_Classifier.git
cd Malicious_Input_Classifier
```

### 2. Install dependencies

```bash
pip install -r requirements.txt
```

### 3. Train the model

```bash
python train.py
```
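
For orientation, here is a rough sketch of what a training script along these lines could look like. It is not the project's actual `train.py`: the toy dataset, the `simple_features` helper, and the hyperparameters are assumptions, and only the artifact names `rf_model.pkl` and `label_encoder.pkl` are taken from the folder structure.

```python
import tempfile
from pathlib import Path

import joblib
from sklearn.ensemble import RandomForestClassifier
from sklearn.preprocessing import LabelEncoder

# Toy training data, invented for illustration only.
SAMPLES = [
    ("' OR '1'='1", "SQLI"),
    ("admin' --", "SQLI"),
    ("1; DROP TABLE users", "SQLI"),
    ("<script>alert('xss')</script>", "XSS"),
    ("<img src=x onerror=alert(1)>", "XSS"),
    ("Hello! Great work!", "Benign"),
    ("Please reset my password", "Benign"),
    ("Looking for running shoes", "Benign"),
]


def simple_features(text: str) -> list:
    """Hypothetical hand-crafted features: length plus a few character counts."""
    return [
        len(text),
        text.count("'"),
        text.count("<") + text.count(">"),
        text.count("=") + text.count(";") + text.count("-"),
    ]


def train_and_save(model_dir: Path):
    """Fit a RandomForest on the toy data and dump both artifacts with joblib."""
    texts, labels = zip(*SAMPLES)
    encoder = LabelEncoder()
    y = encoder.fit_transform(labels)  # string labels -> integer classes
    X = [simple_features(t) for t in texts]

    model = RandomForestClassifier(n_estimators=50, random_state=0)
    model.fit(X, y)

    model_dir.mkdir(parents=True, exist_ok=True)
    joblib.dump(model, model_dir / "rf_model.pkl")
    joblib.dump(encoder, model_dir / "label_encoder.pkl")
    return model, encoder


def classify(text: str, model_dir: Path) -> str:
    """Reload the saved artifacts and return the predicted label as a string."""
    model = joblib.load(model_dir / "rf_model.pkl")
    encoder = joblib.load(model_dir / "label_encoder.pkl")
    predicted = model.predict([simple_features(text)])
    return str(encoder.inverse_transform(predicted)[0])


if __name__ == "__main__":
    # Write to a temporary directory so the sketch leaves no files behind.
    with tempfile.TemporaryDirectory() as tmp:
        out_dir = Path(tmp) / "model"
        train_and_save(out_dir)
        print(classify("' OR '1'='1", out_dir))
```

Saving the label encoder next to the model lets the app map integer predictions back to readable class names at inference time.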

### 4. Run the app

```bash
streamlit run app.py
```

---

## 🧱 System Architecture

This project follows a clean and modular architecture that separates UI, feature extraction, model inference, and output presentation.
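
One way to picture that separation is as four small functions chained together. This is a sketch only: every function name here is hypothetical, and `predict_label` is a hand-written rule standing in for the trained RandomForest so that the example has no dependencies.

```python
from typing import Dict, List


def read_inputs(raw: str) -> List[str]:
    """UI layer stand-in: split pasted text into one input per non-empty line."""
    return [line.strip() for line in raw.splitlines() if line.strip()]


def featurize(text: str) -> Dict[str, int]:
    """Feature extraction layer: a couple of illustrative counts."""
    lowered = text.lower()
    return {
        "quotes": text.count("'"),
        "script_hits": lowered.count("alert(") + lowered.count("<script"),
    }


def predict_label(features: Dict[str, int]) -> str:
    """Inference layer placeholder: a fixed rule, NOT the trained model."""
    if features["script_hits"] > 0:
        return "XSS"
    if features["quotes"] >= 2:
        return "SQLI"
    return "Benign"


def present(text: str, label: str) -> str:
    """Output layer: format one result row for display."""
    return f"{label:<7}| {text}"


def run_pipeline(raw: str) -> List[str]:
    """Wire the four layers together in order."""
    return [present(t, predict_label(featurize(t))) for t in read_inputs(raw)]


if __name__ == "__main__":
    demo = "' OR '1'='1\nalert('xss')\nHello! Great work!"
    print("\n".join(run_pipeline(demo)))
```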

![System Architecture](architecture.png)

---

## 📽️ Demo Video

👉 [Click here to watch the demo video](https://drive.google.com/file/d/1tz-SLwx7bo42ai8T1YXPviKZwky7RfuX/view?usp=sharing)

---

## 👥 Team Members & Contributions

| Name                                                 | Role and Contributions                                           |
|------------------------------------------------------|------------------------------------------------------------------|
| [Rabiya Gavandi](https://github.com/Rabiya786-hash)  | 🧠 ML Model Design, Feature Engineering, Model Training          |
| [Saniya Kalawant](https://github.com/SaniyaKalawant) | 💻 Frontend Development using Streamlit, UI Design, Input Modes  |
| [Sadiya Bhokare](https://github.com/sadiyabhokare)   | 📦 Integration, Testing, Deployment Setup, Documentation, Report |