Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/reyhaneh-saffar/vision-transformer-for-cifar-10
Evaluating the performance of Vision Transformers (ViT) and pre-trained Convolutional Neural Networks (CNNs) on the CIFAR-10 dataset
- Host: GitHub
- URL: https://github.com/reyhaneh-saffar/vision-transformer-for-cifar-10
- Owner: reyhaneh-saffar
- Created: 2025-01-11T20:13:31.000Z (19 days ago)
- Default Branch: main
- Last Pushed: 2025-01-11T20:19:53.000Z (19 days ago)
- Last Synced: 2025-01-11T21:26:12.987Z (19 days ago)
- Language: Jupyter Notebook
- Size: 350 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Comparison of Vision Transformer (ViT) and Pre-Trained Models for Image Classification
## Introduction
This project evaluates the performance of Vision Transformers (ViT) and pre-trained Convolutional Neural Networks (CNNs) on the CIFAR-10 dataset. The objective is to compare their generalization capabilities and effectiveness in image classification.

---
### **Generalization Capability**
- **Pre-trained ViT** achieved the highest test accuracy of **96.10%**, significantly outperforming the custom ViT (**62.17%**) and ResNet18 (**89.34%**).
- This highlights the effectiveness of pre-trained transformers, especially for shorter training durations.

### **Performance Comparison**
- **Pre-trained ViT** showed superior results across all metrics.
- **ResNet18** delivered strong performance (**89.34% accuracy**) but was outpaced by the pre-trained ViT.
- The **custom ViT's** lower accuracy underscores the critical role of pre-training in transformers.

### **Training Efficiency**
- Pre-trained models converged in just **3 epochs**, compared to the custom ViT's **20 epochs**.
- This demonstrates the efficiency and practicality of transfer learning.

### **Generalization vs Overfitting**
- **ResNet18** exhibited signs of overfitting with strong training performance but relatively lower test accuracy.
- The **pre-trained ViT** maintained high test accuracy, showcasing robustness on unseen data.

---
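The fast convergence reported above can be reproduced with a standard fine-tuning loop; this is a minimal sketch, assuming a typical PyTorch setup. The optimizer (AdamW) and learning rate are illustrative choices, as the README only reports the epoch counts.

```python
# Minimal fine-tuning loop sketch. Optimizer and learning rate are
# assumptions; the README reports only that pre-trained models
# converged within ~3 epochs versus 20 for the custom ViT.
import torch
import torch.nn as nn


def finetune(model, loader, epochs=3, lr=1e-4, device="cpu"):
    model.to(device).train()
    opt = torch.optim.AdamW(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        for images, labels in loader:
            images, labels = images.to(device), labels.to(device)
            opt.zero_grad()
            loss = loss_fn(model(images), labels)
            loss.backward()
            opt.step()
    return model
```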
### Model Test Accuracy
- **Custom ViT:** 62.17%
- **Pre-trained ViT:** 96.10%
- **ResNet18:** 89.34%