Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Awesome Lists | Featured Topics | Projects

https://github.com/bjornmelin/deep-learning-evolution

🧠 Deep-Learning Evolution: Unified collection of TensorFlow & PyTorch projects, featuring custom CUDA kernels, distributed training, memory‑efficient methods, and production‑ready pipelines. Showcases advanced GPU optimizations, from foundational models to cutting‑edge architectures. 🚀
https://github.com/bjornmelin/deep-learning-evolution

ai-research cuda data-science deep-learning distributed-training gan gpu-acceleration machine-learning model-optimization neural-networks python pytorch tensorflow training-pipeline transformers

Last synced: 8 days ago
JSON representation

🧠 Deep-Learning Evolution: Unified collection of TensorFlow & PyTorch projects, featuring custom CUDA kernels, distributed training, memory‑efficient methods, and production‑ready pipelines. Showcases advanced GPU optimizations, from foundational models to cutting‑edge architectures. 🚀

Awesome Lists containing this project

README

        

# Deep Learning Evolution 🚀

[![Python](https://img.shields.io/badge/python-3.8%2B-blue.svg)](https://www.python.org/downloads/)
[![TensorFlow](https://img.shields.io/badge/tensorflow-2.14%2B-orange.svg)](https://tensorflow.org/)
[![PyTorch](https://img.shields.io/badge/pytorch-2.2%2B-red.svg)](https://pytorch.org/)
[![CUDA](https://img.shields.io/badge/cuda-11.8%2B-green.svg)](https://developer.nvidia.com/cuda-toolkit)
[![License](https://img.shields.io/badge/license-MIT-blue.svg)](LICENSE)
[![Contributions Welcome](https://img.shields.io/badge/contributions-welcome-brightgreen.svg?style=flat)](CONTRIBUTING.md)

> **Deep-Learning Evolution** combines both **TensorFlow** and **PyTorch** projects into a single, cohesive repository. From foundational neural networks to custom CUDA kernels and distributed training systems, you’ll find memory‑efficient training techniques, advanced GPU optimizations, and production‑ready deployments for real‑world AI applications.

---

## 📑 Table of Contents
- [✨ Features](#-features)
- [📁 Project Structure](#-project-structure)
- [🔧 Prerequisites](#-prerequisites)
- [📦 Installation](#-installation)
- [🚀 Quick Start](#-quick-start)
- [TensorFlow Examples](#tensorflow-examples)
- [PyTorch Examples](#pytorch-examples)
- [📚 Documentation](#-documentation)
- [Models](#models)
- [GPU Optimization](#gpu-optimization)
- [Benchmarks](#benchmarks)
- [🤝 Contributing](#-contributing)
- [📌 Versioning](#-versioning)
- [✍️ Author](#-author)
- [📝 Citation](#-citation)
- [📄 License](#-license)
- [🙏 Acknowledgments](#-acknowledgments)

---

## ✨ Features

1. **Comprehensive Framework Coverage**
- TensorFlow 2.14+ and PyTorch 2.2+ examples under one roof.

2. **Custom Training Pipelines**
- Efficient data loading, augmentation, and custom training loops.

3. **Advanced GPU & CUDA Optimizations**
- Memory‑efficient backprop, kernel fusion, and multi‑GPU data parallelism.

4. **Distributed Training**
- PyTorch’s `DistributedTrainer` and TensorFlow’s MirroredStrategy for large‑scale HPC setups.

5. **Production‑Ready Models**
- Demonstrations of how to package and deploy models with Docker, Streamlit, or other tools.

6. **Comprehensive Documentation & Tests**
- Detailed notebooks, examples, and unit tests in both frameworks.

---

## 📁 Project Structure

Depending on your final layout, you might organize your repo as follows:

```mermaid
graph TD
A[deep-learning-evolution] --> B[tensorflow]
A --> C[pytorch]
A --> D[common]
A --> E[benchmarks]
A --> F[tests]
B --> G[notebooks]
B --> H[src]
B --> I[apps]
C --> J[examples]
C --> K[src]
C --> L[extensions]
D --> M[utils]
D --> N[datasets]
D --> O[docs]
```

Click to expand possible directory structure

```plaintext
deep-learning-evolution/
├── tensorflow/
│ ├── notebooks/ # Jupyter notebooks for TF experiments
│ ├── src/ # TF source code (models, data, utils, cuda)
│ ├── apps/ # Demo or production apps (e.g., Streamlit)
│ └── ...
├── pytorch/
│ ├── examples/ # PyTorch example scripts
│ ├── src/ # PyTorch source (models, training utils)
│ ├── extensions/ # Custom CUDA kernels/extensions
│ └── ...
├── common/
│ ├── utils/ # Shared utility scripts
│ ├── datasets/ # Shared data loaders or references
│ └── docs/ # Additional documentation
├── benchmarks/ # Performance comparisons across frameworks
├── tests/ # Unit tests for TF and PT code
├── requirements.txt # Dependencies
└── README.md # Main documentation
```

**Tip**: You can keep separate `requirements-tf.txt` and `requirements-pt.txt` if you prefer splitting dependencies.

---

## 🔧 Prerequisites

- **Python** 3.8+
- **CUDA** 11.8+
- **TensorFlow** 2.14+
- **PyTorch** 2.2+
- **NVIDIA GPU** with compute capability 6.0+
- **(Optional)** C++ compiler for building custom CUDA extensions (e.g., PyTorch’s extension modules)

---

## 📦 Installation

1. **Clone this repository:**

```bash
git clone https://github.com/BjornMelin/deep-learning-evolution.git
cd deep-learning-evolution
```

2. **Set up a virtual environment:**

```bash
python -m venv venv
source venv/bin/activate # Linux/Mac
# or
.\venv\Scripts\activate # Windows
```

3. **Install dependencies:**

```bash
# Single unified requirements:
pip install -r requirements.txt

# OR separate for each framework:
pip install -r requirements-tf.txt
pip install -r requirements-pt.txt
```

4. **(Optional) Build custom CUDA extensions (for PyTorch examples):**

```bash
cd pytorch/extensions/cuda
python setup.py install
```

---

## 🚀 Quick Start

### TensorFlow Examples

```python
# TensorFlow: Basic CNN Example
from tensorflow.src.models import BasicCNN
from tensorflow.src.utils import DataLoader

data_loader = DataLoader('path/to/data')
model = BasicCNN()

model.train(data_loader, epochs=10)
model.evaluate(...)
model.save('saved_model')
```

### PyTorch Examples

```python
# PyTorch: Custom Transformer Example
from pytorch.src import models, training

model = models.TransformerWithCustomAttention(num_heads=8, d_model=512)
trainer = training.DistributedTrainer(model, gpu_list=[0,1])
trainer.train(dataset, epochs=10)
```

> For more demos, see the [notebooks](./tensorflow/notebooks) in `tensorflow/` or the [examples](./pytorch/examples) folder in `pytorch/`.

---

## 📚 Documentation

### Models

| Model | Framework | Description | Performance | Memory Efficiency |
|-----------------------|----------------|---------------------------------------------|--------------------|-------------------|
| `BasicCNN` | TensorFlow | Foundational CNN architecture | 94% on MNIST | High |
| `TransformerBlock` | TensorFlow | Custom transformer for NLP | BERT-comparable | Medium |
| `AdvancedGAN` | TensorFlow | Generative adversarial network | FID: 18.3 | High |
| `TransformerWithCustomAttention` | PyTorch | Advanced Transformer w/ fused attention | 95% on GLUE | High |
| `ViT` | PyTorch | Vision transformer with optimized attention | 92% on ImageNet | Medium |
| `CustomGAN` | PyTorch | Memory‑efficient adversarial network | FID: 12.4 | High |

---

### GPU Optimization

- **Custom CUDA kernels** for specialized operations
- **Memory‑efficient training** (gradient checkpointing, fused ops)
- **Multi‑GPU data parallelism** (PyTorch DDP, TF MirroredStrategy)
- **Distributed training** for large‑scale HPC
- **Kernel fusion techniques** to minimize overhead

---

### Benchmarks

| Model | Hardware | Framework | Training Time | Memory Usage | Throughput |
|-----------------------|---------|------------|---------------|-------------|-------------------|
| Custom BERT | A100 | PyTorch | 4.2 hrs | 16GB | 1250 samples/sec |
| ViT | V100 | PyTorch | 6.8 hrs | 24GB | 850 samples/sec |
| ResNet-152 | 3090 | PyTorch | 2.5 hrs | 12GB | 2100 samples/sec |
| BasicCNN (small) | 4GB GPU | TensorFlow | 2.3 hrs | ~2GB | 900 samples/sec |
| TransformerBlock | 8GB GPU | TensorFlow | 5.7 hrs | ~6GB | 600 samples/sec |

(Exact numbers are illustrative; refer to each framework’s benchmarks folder for detailed logs.)

---

## 🤝 Contributing

We welcome all contributions, from bug fixes to new model architectures and GPU kernels.

- [Contributing Guidelines](CONTRIBUTING.md)
- [Code of Conduct](CODE_OF_CONDUCT.md)
- [Development Guide](DEVELOPMENT.md)

1. **Fork** this repository
2. **Create** a new branch
3. **Commit** your changes
4. **Open** a Pull Request

---

## 📌 Versioning

We use [SemVer](http://semver.org/) for versioning. For available versions, check the [tags](https://github.com/BjornMelin/deep-learning-evolution/tags).

---

## ✍️ Author

**Bjorn Melin**
- GitHub: [@BjornMelin](https://github.com/BjornMelin)
- LinkedIn: [Bjorn Melin](https://linkedin.com/in/bjorn-melin)

Feel free to add additional contributors or maintainers here.

---

## 📝 Citation

If you find this project useful, please cite:

```bibtex
@misc{melin2024deeplearningevolution,
author = {Melin, Bjorn},
title = {Deep Learning Evolution: Unified TensorFlow & PyTorch Implementations},
year = {2024},
publisher = {GitHub},
url = {https://github.com/BjornMelin/deep-learning-evolution}
}
```

---

## 📄 License

This project is licensed under the **MIT License**. See the [LICENSE](LICENSE) file for details.

---

## 🙏 Acknowledgments

- **TensorFlow** team for extensive documentation and tooling.
- **PyTorch** contributors for ongoing innovation.
- **NVIDIA** for the CUDA toolkit.
- **Open Source Community** for valuable feedback and inspiration.

---

**Made with 🧠 and ⚡ by Bjorn Melin**