Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/headless-start/data-augmentation-impact
This repository examines the effect of data augmentation of the training set during model training.
augmented-images cuda data gpu keras matplotlib mnist opencv-python python3 tensorflow training-data
Last synced: 3 days ago
- Host: GitHub
- URL: https://github.com/headless-start/data-augmentation-impact
- Owner: headless-start
- License: mit
- Created: 2025-01-10T14:28:23.000Z (about 1 month ago)
- Default Branch: main
- Last Pushed: 2025-02-01T07:50:30.000Z (11 days ago)
- Last Synced: 2025-02-08T16:17:41.822Z (3 days ago)
- Topics: augmented-images, cuda, data, gpu, keras, matplotlib, mnist, opencv-python, python3, tensorflow, training-data
- Language: Jupyter Notebook
- Homepage:
- Size: 2.37 MB
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
README
# Image Augmentation with TensorFlow
## Project Overview
This project demonstrates the impact of **image augmentation techniques** on model performance by training a neural network on the MNIST dataset. Key comparisons include model accuracy and generalization with and without augmentation.

**Dataset**: MNIST.
**Goal**: Evaluate how augmentation improves robustness and reduces overfitting in general image classification tasks.

---
## Key Features
1. **Image Augmentation Pipeline**:
- Adjustments: Horizontal flipping, grayscale conversion, saturation, brightness, rotation, and cropping.
- Real-time augmentation using TensorFlow's `tf.image` module.
2. **Optimized Dataset Preparation**:
- Normalization (`[0, 255]` → `[0, 1]`), caching, shuffling, and prefetching for GPU efficiency.
3. **Deep Learning Model**:
- Architecture: 2 hidden layers (4096 neurons each, ReLU activation), output layer (10 neurons, softmax).
- Trained separately on augmented vs. raw data for performance comparison.

---
## Findings
- **Augmented Model**:
- **Accuracy**: 94.2% (train) vs. 95.8% (test)
- **Runtime**: 3s/epoch | **Memory**: 4GB (NVIDIA GPU).
- **Baseline (No Augmentation)**:
- **Accuracy**: 99.1% (train) vs. 94.4% (test)
- **Runtime**: 3s/epoch | **Memory**: 3.8GB (NVIDIA GPU).
- **Conclusion**:
- Augmentation improved test accuracy by 1.4 percentage points (94.4% → 95.8%) while adding minimal computational overhead.

---
## System Requirements
### Dependencies
- Python 3.8+
- Libraries: `tensorflow`, `tensorflow-datasets`, `matplotlib`, `Pillow`
- Hardware: GPU with cuDNN support (recommended)

---
## License
This project is licensed under the MIT License. See the [LICENSE](LICENSE) file for details.