https://github.com/neural-bits/production-hub

Hands-on hub for learning techniques to optimize AI models and serve them efficiently in production.

machine-learning optimization production quantization

Last synced: about 1 year ago


Host: GitHub
URL: https://github.com/neural-bits/production-hub
Owner: neural-bits
Created: 2024-08-05T15:54:53.000Z (almost 2 years ago)
Default Branch: main
Last Pushed: 2024-09-11T13:10:51.000Z (almost 2 years ago)
Last Synced: 2025-04-23T13:22:59.866Z (about 1 year ago)
Topics: machine-learning, optimization, production, quantization
Language: Jupyter Notebook
Homepage:
Size: 43.2 MB
Stars: 6
Watchers: 0
Forks: 1
Open Issues: 0
Metadata Files:
- Readme: README.md

README

# Neural Bits Production Hub

This repository collects the code and articles from the Neural Bits Newsletter that showcase:

- how to optimize and quantize models for better performance
- efficient model serving in production environments at scale


## Categories

### Model Optimization

| ID | 📝 Article | 💻 Code | Details | Complexity | Tech Stack |
|--|---------|-----------------|---------|------------|----------------------|
| 001 | [Inference Engines Profiling](https://neuralbits.substack.com/p/3-inference-engines-for-optimal-throughput) | [Here](https://github.com/neural-bits/production-hub/tree/main/001-inference_engines) | Profile a CNN model across PyTorch, ONNX, TensorRT, and TorchCompile | 🟩🟩⬜ | Python, Jupyter |
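For readers who want a feel for what profiling a model across several engines involves before opening the notebook, here is a minimal, framework-agnostic sketch. It is not the repository's code: the `benchmark` and `compare` helpers and the two stand-in backends are hypothetical, and each real engine (PyTorch, ONNX, TensorRT, TorchCompile) would be wrapped in a zero-argument callable that runs one inference.

```python
import statistics
import time
from typing import Callable, Dict


def benchmark(fn: Callable[[], object], warmup: int = 5, iters: int = 50) -> Dict[str, float]:
    """Time a zero-argument callable and return latency statistics in milliseconds."""
    # Warm-up runs absorb one-off costs such as lazy initialisation or compilation.
    for _ in range(warmup):
        fn()

    samples = []
    for _ in range(iters):
        start = time.perf_counter()
        fn()
        # Note: for GPU backends the device must be synchronised before reading
        # the clock, otherwise only the kernel launch is timed.
        samples.append((time.perf_counter() - start) * 1000.0)

    samples.sort()
    mean_ms = statistics.fmean(samples)
    return {
        "mean_ms": mean_ms,
        "p50_ms": statistics.median(samples),
        "p95_ms": samples[min(len(samples) - 1, int(0.95 * len(samples)))],
        "throughput_per_s": 1000.0 / mean_ms if mean_ms > 0 else float("inf"),
    }


def compare(backends: Dict[str, Callable[[], object]], **kwargs) -> Dict[str, Dict[str, float]]:
    """Run the same benchmark over several named backends."""
    return {name: benchmark(fn, **kwargs) for name, fn in backends.items()}


if __name__ == "__main__":
    # Hypothetical stand-ins for real inference calls; replace each lambda
    # with a call into the engine being profiled.
    backends = {
        "backend_a": lambda: sum(i * i for i in range(20_000)),
        "backend_b": lambda: sum(i * i for i in range(5_000)),
    }
    for name, stats in compare(backends).items():
        print(name, {key: round(value, 3) for key, value in stats.items()})
```

Reporting a percentile alongside the mean is a deliberate choice here: tail latency often differs between engines even when average latency looks similar.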

### Model Deployment

| ID | 📝 Article | 💻 Code | Details | Complexity | Tech Stack |
|--|---------|------|---------|------------|----------------------|
| 002 | [Deploying DL models with NVIDIA Triton Inference Server]() | [Here](https://github.com/neural-bits/production-hub/tree/main/002-triton-server-cnn-deployment) | Full tutorial on how to set up and deploy ML models with Triton Inference Server | 🟩🟩🟩 | Python, Docker, Bash |
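As a rough illustration of what talking to a deployed model can look like, the sketch below builds a request body for the KServe v2 HTTP inference protocol, which Triton Inference Server implements. It is not taken from the tutorial: the helper names, the model name `cnn_model`, the tensor names `input__0` and `output__0`, and the `localhost:8000` address are all assumptions made for the example, and no request is actually sent.

```python
import json
from typing import Dict, List, Sequence


def build_infer_request(
    input_name: str,
    shape: Sequence[int],
    data: Sequence[float],
    datatype: str = "FP32",
    output_names: Sequence[str] = (),
) -> Dict[str, List[dict]]:
    """Build a KServe v2 inference request body for a single input tensor."""
    # The flattened data must contain exactly as many elements as the shape implies.
    expected = 1
    for dim in shape:
        expected *= dim
    if len(data) != expected:
        raise ValueError(f"shape {list(shape)} needs {expected} values, got {len(data)}")

    body: Dict[str, List[dict]] = {
        "inputs": [
            {
                "name": input_name,
                "shape": list(shape),
                "datatype": datatype,
                "data": list(data),
            }
        ]
    }
    # Requested outputs are optional; when omitted the server returns all outputs.
    if output_names:
        body["outputs"] = [{"name": name} for name in output_names]
    return body


def infer_url(base_url: str, model_name: str, model_version: str = "") -> str:
    """Return the v2 inference endpoint for a model, optionally pinned to a version."""
    path = f"/v2/models/{model_name}"
    if model_version:
        path += f"/versions/{model_version}"
    return base_url.rstrip("/") + path + "/infer"


if __name__ == "__main__":
    # Hypothetical model and tensor names; a real deployment defines its own.
    request_body = build_infer_request(
        "input__0", [1, 3, 2, 2], [0.0] * 12, output_names=["output__0"]
    )
    print(infer_url("http://localhost:8000", "cnn_model"))
    print(json.dumps(request_body)[:80] + "...")
```

The resulting body would be sent as JSON in a POST request to the printed URL; the actual tensor names, shapes, and datatypes come from the deployed model's configuration.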

### Quantization Techniques

| ID | Article | Code | Details | Complexity | Tech Stack |
|--|---------|------|---------|------------|----------------------|
