{"id":25135029,"url":"https://github.com/headless-start/data-augmentation-impact","last_synced_at":"2026-04-05T23:34:13.786Z","repository":{"id":271890708,"uuid":"914894967","full_name":"headless-start/data-augmentation-impact","owner":"headless-start","description":"This repository contains effect of Data Augmentation of Training Set during Model Training.","archived":false,"fork":false,"pushed_at":"2025-02-01T07:50:30.000Z","size":2482,"stargazers_count":2,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-04-03T02:14:50.841Z","etag":null,"topics":["augmented-images","cuda","data","gpu","keras","matplotlib","mnist","opencv-python","python3","tensorflow","training-data"],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/headless-start.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2025-01-10T14:28:23.000Z","updated_at":"2025-03-16T23:20:59.000Z","dependencies_parsed_at":"2025-01-10T15:36:33.746Z","dependency_job_id":"9d83ae3b-7cbc-4ca0-9d52-31a0d41bd021","html_url":"https://github.com/headless-start/data-augmentation-impact","commit_stats":null,"previous_names":["headless-start/data_augmentation_image","headless-start/data-augmentation-impact"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/headless-start/data-augmentation-impact","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/headless-start%2Fdata-augmentation-impact","tags_url":"https://repos.ecosy
ste.ms/api/v1/hosts/GitHub/repositories/headless-start%2Fdata-augmentation-impact/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/headless-start%2Fdata-augmentation-impact/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/headless-start%2Fdata-augmentation-impact/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/headless-start","download_url":"https://codeload.github.com/headless-start/data-augmentation-impact/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/headless-start%2Fdata-augmentation-impact/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":31454199,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-05T21:22:52.476Z","status":"ssl_error","status_checked_at":"2026-04-05T21:22:51.943Z","response_time":75,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.6:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["augmented-images","cuda","data","gpu","keras","matplotlib","mnist","opencv-python","python3","tensorflow","training-data"],"created_at":"2025-02-08T16:17:43.218Z","updated_at":"2026-04-05T23:34:13.769Z","avatar_url":"https://github.com/headless-start.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Image Augmentation with TensorFlow  
\n\n## 📌 Project Overview  \nThis project demonstrates the impact of **image augmentation techniques** on model performance by training a neural network on the MNIST dataset. Key comparisons include model accuracy and generalization with/without augmentation.  \n\n**Dataset**: MNIST.  \n**Goal**: Evaluate how augmentation improves robustness and reduces overfitting in general image classification tasks.\n\n---\n\n## 🚀 Key Features  \n1. **Image Augmentation Pipeline**:  \n   - Adjustments: Horizontal flipping, grayscale conversion, saturation, brightness, rotation, and cropping.  \n   - Real-time augmentation using TensorFlow’s `tf.image` module.  \n2. **Optimized Dataset Preparation**:  \n   - Normalization (`[0, 255]` → `[0, 1]`), caching, shuffling, and prefetching for GPU efficiency.  \n3. **Deep Learning Model**:  \n   - Architecture: 2 hidden layers (4096 neurons each, ReLU activation), output layer (10 neurons, softmax).  \n   - Trained separately on augmented vs. raw data for performance comparison.  \n\n---\n\n## 🔍 Findings  \n- **Augmented Model**:  \n  - **Accuracy**: 94.2% (train) vs. 95.8% (test)  \n  - **Runtime**: 3s/epoch | **Memory**: 4GB (NVIDIA GPU).  \n- **Baseline (No Augmentation)**:  \n  - **Accuracy**: 99.1% (train) vs. 94.4% (test) \n  - **Runtime**: 3s/epoch | **Memory**: 3.8GB (NVIDIA GPU). \n- **Conclusion**:  \n  - Augmentation raised test accuracy by 1.4 percentage points (95.8% vs. 94.4%) at the same 3s/epoch runtime and about 0.2GB more GPU memory.  \n\n---\n\n## 🛠 System Requirements  \n### Dependencies  \n- Python 3.8+  \n- Libraries: `tensorflow`, `tensorflow-datasets`, `matplotlib`, `Pillow`  \n- Hardware: GPU with cuDNN support (recommended)\n\n---\n\n## 📄 License  \nThis project is licensed under the MIT License. 
See the [LICENSE](LICENSE) file for details.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fheadless-start%2Fdata-augmentation-impact","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fheadless-start%2Fdata-augmentation-impact","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fheadless-start%2Fdata-augmentation-impact/lists"}