https://github.com/yuu18id/vit-ela-ai-image-classifier
AI-powered image classifier using Vision Transformer and Error Level Analysis to detect AI-Generated image.
- Host: GitHub
- URL: https://github.com/yuu18id/vit-ela-ai-image-classifier
- Owner: Yuu18id
- License: mit
- Created: 2025-10-01T07:29:36.000Z (8 days ago)
- Default Branch: main
- Last Pushed: 2025-10-01T09:21:27.000Z (8 days ago)
- Last Synced: 2025-10-01T11:23:18.202Z (8 days ago)
- Topics: ai-generated-images, error-level-analysis, flask, image-classification, vision-transformer, website
- Language: HTML
- Homepage:
- Size: 363 KB
- Stars: 1
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# AI Image Classifier with ViT and ELA
A web application for image classification using a Vision Transformer (ViT) with Error Level Analysis (ELA) preprocessing. The system is designed to detect AI-generated images or to classify images based on your own custom dataset.
## 🌟 Key Features
- **Custom Model Training**: Train a ViT model on your own dataset
- **ELA Preprocessing**: Detect image manipulation using Error Level Analysis
- **Real-time Progress**: Monitor training and evaluation progress in real-time
- **Model Management**: Load and unload models easily
- **Comprehensive Evaluation**: Complete confusion matrix and classification report
- **Fast Inference**: Classify images with confidence scores

## 🏗️ Architecture
- **Model**: Vision Transformer (ViT) with a custom configuration (see the sketch after this list)
  - Image size: 224x224
  - Patch size: 16x16
  - Hidden size: 384
  - 6 attention heads
  - 6 transformer layers
- **Preprocessing**: Error Level Analysis (ELA) for manipulation detection
- **Framework**: PyTorch + Transformers (Hugging Face)
- **Backend**: Flask
- **Frontend**: HTML + JavaScript (AJAX)
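
The README fixes the ViT hyperparameters but not the exact code, so the following is only a minimal sketch of how those pieces could be wired together with the stated stack (PIL + Hugging Face `transformers`): an ELA step that re-saves the image as JPEG and amplifies the pixel-wise difference, and a ViT built from the configuration listed above. The JPEG quality, brightness scale, and two-class label count are illustrative assumptions, not values taken from the repository.

```python
from io import BytesIO

from PIL import Image, ImageChops, ImageEnhance
from transformers import ViTConfig, ViTForImageClassification


def ela_transform(image: Image.Image, quality: int = 90, scale: float = 15.0) -> Image.Image:
    """Error Level Analysis: re-save as JPEG and amplify the pixel-wise difference.

    The quality and scale values here are illustrative assumptions.
    """
    buffer = BytesIO()
    image.convert("RGB").save(buffer, format="JPEG", quality=quality)
    buffer.seek(0)
    resaved = Image.open(buffer)
    diff = ImageChops.difference(image.convert("RGB"), resaved)
    return ImageEnhance.Brightness(diff).enhance(scale)


# ViT built from the configuration listed above; the label count is assumed,
# and intermediate_size is left at the library default (the repo's value is unknown).
config = ViTConfig(
    image_size=224,
    patch_size=16,
    hidden_size=384,
    num_attention_heads=6,
    num_hidden_layers=6,
    num_labels=2,  # e.g. "real" vs. "ai-generated"
)
model = ViTForImageClassification(config)
```

In this pipeline the ELA output, resized to 224x224, is what gets fed to the model rather than the raw image.
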
## 🚀 Installation

1. **Clone the repository**
   ```bash
   git clone https://github.com/Yuu18id/vit-ela-ai-image-classifier
   cd vit-ela-ai-image-classifier
   ```
2. **Install Python dependencies**
   For CUDA-enabled systems (GPU):
   ```bash
   pip install -r requirements-gpu.txt
   ```
   For CPU-only systems:
   ```bash
   pip install -r requirements-cpu.txt
   ```
3. **Run the application**
   ```bash
   python app.py
   ```

The application will run at `http://localhost:5000`.
## 📁 Dataset Structure
The dataset must be organized in the following folder structure:
```
dataset/
├── class1/
│   ├── image1.jpg
│   ├── image2.jpg
│   └── ...
├── class2/
│   ├── image1.jpg
│   ├── image2.jpg
│   └── ...
└── class3/
    ├── image1.jpg
    └── ...
```
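
This folder-per-class layout is the same convention that `torchvision.datasets.ImageFolder` expects, so a loader along the following lines works for it; the project itself may load the data differently, and the transform shown is only a placeholder.

```python
from torchvision import datasets, transforms

# Each subfolder name becomes a class label; the files inside are its samples.
transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
])
dataset = datasets.ImageFolder("dataset", transform=transform)

print(dataset.classes)        # e.g. ['class1', 'class2', 'class3']
print(dataset.class_to_idx)   # folder-name -> label-index mapping
```
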
## 💻 Usage Guide

### 1. Training Model
1. Open the **Train** page from the navigation menu
2. Select dataset folder using the "Choose Folder" button
3. Configure hyperparameters:
   - **Learning Rate**: Default 1e-6 (recommended range: 1e-6 to 1e-4)
   - **Epochs**: Number of training epochs (default: 10)
   - **Batch Size**:
     - GPU: 32 (default); can go higher with more VRAM
     - CPU: 8-16 recommended (lower it to avoid memory issues)
   - **Train Split**: Proportion of the data held out for validation (default: 0.2 = 20%)
4. Click **Start Training**
5. Monitor real-time progress in the log panel

**Training Tips**:
- **GPU users**:
  - Use a batch size of 32-64 for optimal speed
  - Mixed precision (FP16) is already enabled in the code
- **CPU users**:
  - Use a smaller batch size (8-16) to avoid out-of-memory errors
  - Disable FP16 by editing `app.py`: set `fp16=False` in `TrainingArguments` (see the sketch below)
  - Consider using smaller datasets or pre-trained models
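
For orientation, the hyperparameters above map onto a Hugging Face `TrainingArguments` object roughly as follows. This is a hedged sketch, not the code in `app.py`: the output directory is an assumed placeholder, and the repository may set additional arguments.

```python
import torch
from transformers import TrainingArguments

# Hypothetical values mirroring the defaults described above.
training_args = TrainingArguments(
    output_dir="models/checkpoints",     # assumed path, not confirmed by the repo
    learning_rate=1e-6,                  # Train page default
    num_train_epochs=10,
    per_device_train_batch_size=32,      # drop to 8-16 on CPU
    fp16=torch.cuda.is_available(),      # set fp16=False on CPU-only systems
)
```

These arguments are then passed to a `transformers.Trainer` together with the model and the train/validation splits derived from the selected dataset.
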
### 2. Load Model

After training completes or to use a previously trained model:
1. Click the **Load Model** button in the navbar
2. Model will be loaded from `models/final_model/` folder
3. Class labels will be automatically loaded
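
For reference, loading a model saved with the Hugging Face `transformers` API from a folder such as `models/final_model/` generally looks like the sketch below; whether the app stores anything beyond the standard saved config is an assumption.

```python
from transformers import ViTForImageClassification

# Load the fine-tuned weights and configuration saved after training.
model = ViTForImageClassification.from_pretrained("models/final_model")
model.eval()

# Class labels travel with the saved config as the id2label mapping.
print(model.config.id2label)
```
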
### 3. Evaluate Model

1. Open the **Evaluate** page
2. Select evaluation dataset folder (same structure as training dataset)
3. Click **Start Evaluation**
4. View evaluation results:
   - Confusion Matrix
   - Precision, Recall, F1-Score per class
   - Overall Accuracy
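
The metrics listed above are the standard outputs of `scikit-learn`'s reporting utilities; the following self-contained sketch shows how such numbers are produced from predicted and true label lists (the labels and class names are placeholders):

```python
from sklearn.metrics import classification_report, confusion_matrix

# Placeholder ground-truth and predicted labels for a two-class problem.
y_true = [0, 0, 1, 1, 1, 0]
y_pred = [0, 1, 1, 1, 0, 0]

print(confusion_matrix(y_true, y_pred))
print(classification_report(y_true, y_pred, target_names=["real", "ai-generated"]))
```
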
### 4. Classify Image

1. Ensure a model is loaded
2. Open the **Home** page
3. Upload the image you want to classify
4. Click **Classify**
5. View results:
   - Predicted class
   - Confidence score
   - ELA visualization
   - Probabilities for all classes
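
Internally, producing a predicted class with a confidence score and per-class probabilities amounts to a forward pass followed by a softmax. The sketch below is illustrative only: the model path comes from the Load Model section, while skipping the ELA step and using the default `ViTImageProcessor` are simplifying assumptions.

```python
import torch
from PIL import Image
from transformers import ViTForImageClassification, ViTImageProcessor

model = ViTForImageClassification.from_pretrained("models/final_model")
model.eval()
processor = ViTImageProcessor()   # resizes and normalises to 224x224 by default

image = Image.open("example.jpg").convert("RGB")   # in the app, ELA runs first
inputs = processor(images=image, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits
probs = torch.softmax(logits, dim=-1)[0]

pred_idx = int(probs.argmax())
print(model.config.id2label[pred_idx], float(probs[pred_idx]))   # class + confidence
```
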
## 🔧 API Endpoints

### Training
- `POST /train` - Start training model
- `GET /api/training_progress` - Get training progress

### Evaluation
- `POST /evaluate` - Start evaluation
- `GET /api/evaluation_progress` - Get evaluation progress

### Inference
- `POST /classify` - Classify single image
- `GET /load_model` - Load trained model
- `GET /unload_model` - Unload model from memory
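
For scripting against these endpoints, a hypothetical client session could look like the following; the multipart field name (`file`) and the shape of the JSON response are assumptions, so check `app.py` for the exact contract.

```python
import requests

BASE_URL = "http://localhost:5000"

# Make sure a model is loaded before classifying.
requests.get(f"{BASE_URL}/load_model", timeout=60)

# Upload an image for classification (field name 'file' is an assumption).
with open("example.jpg", "rb") as f:
    response = requests.post(f"{BASE_URL}/classify", files={"file": f}, timeout=60)

print(response.status_code)
print(response.json())   # expected to contain the predicted class and confidence
```
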
## 🔒 Security

- Max upload size: 2 GB
- File validation for image formats
- Automatic temporary file cleanup
- Session-based processing
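
In Flask, limits like these are commonly enforced through the app configuration plus a filename check; the sketch below is a generic pattern, not necessarily how `app.py` implements it.

```python
from flask import Flask

app = Flask(__name__)

# Reject request bodies larger than 2 GB (Flask responds with HTTP 413 beyond this).
app.config["MAX_CONTENT_LENGTH"] = 2 * 1024 * 1024 * 1024

ALLOWED_EXTENSIONS = {"jpg", "jpeg", "png"}   # assumed list of accepted formats

def is_allowed(filename: str) -> bool:
    """Simple extension-based validation for uploaded image files."""
    return "." in filename and filename.rsplit(".", 1)[1].lower() in ALLOWED_EXTENSIONS
```
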
## 📝 License

This project is licensed under the **MIT License** - see the [LICENSE](LICENSE) file for details.
## 📞 Support
For questions or issues, please open an issue in this repository.
## 🙏 Acknowledgments
- [Hugging Face Transformers](https://huggingface.co/transformers/)
- [PyTorch](https://pytorch.org/)
- [Vision Transformer Paper](https://arxiv.org/abs/2010.11929)
- Error Level Analysis for digital forensics

---
## 👥 Authors & Contributors
This project was developed as an undergraduate thesis by:
| Name | GitHub |
|------|--------|
| Muhammad Reza Mahendra Laiya | [@Kyovens](https://github.com/Kyovens) |
| Bayu Arma Praja | [@Yuu18id](https://github.com/Yuu18id) |
| Yusra Budiman Hasibuan | [@yusrabudiman](https://github.com/yusrabudiman) |

**Disclaimer:** This is an academic project developed as part of an undergraduate thesis requirement. The software is provided "as-is" without warranty of any kind. The authors and Universitas Mikroskil are not liable for any damages or issues arising from the use of this software.