https://github.com/amiot99/gpu-inference-microservice

FastAPI-based microservice for GPU-accelerated image classification using ResNet18 and Docker

Host: GitHub
URL: https://github.com/amiot99/gpu-inference-microservice
Owner: amiot99
Created: 2025-04-30T09:08:39.000Z (about 1 year ago)
Default Branch: main
Last Pushed: 2025-05-08T00:03:39.000Z (about 1 year ago)
Last Synced: 2025-05-12T08:07:02.750Z (about 1 year ago)
Topics: docker, fastapi, gpu, image-classification, inference, machine-learning, microservice, pytorch, resnet
Language: Python
Homepage:
Size: 34.2 KB
Stars: 0
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md

README

GPU Inference Microservice with FastAPI + PyTorch

This project is a minimal, production-style microservice for GPU-accelerated AI/ML image classification.
It uses pre-trained models (ResNet18 and ResNet50) from PyTorch, wrapped in a FastAPI server, and containerized with Docker.
Additionally, it includes Kubernetes deployment configurations for easy scaling and orchestration.

What It Does

- Accepts image uploads via a `/predict` endpoint
- Runs inference using a GPU-enabled ResNet model (user can select either ResNet18 or ResNet50)
- Returns the predicted class label (e.g., “golden retriever”)
- Deployable via Docker and Kubernetes
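
The last two steps of the request flow (choosing a model, then turning raw model outputs into a label) can be sketched in plain Python. This is an illustration under assumptions, not the repository's code: `SUPPORTED_MODELS`, `LABELS`, `resolve_model_name`, and `top1_label` are hypothetical names, and the three-entry label list stands in for the 1000 ImageNet classes a real ResNet produces.

```python
import math

# Hypothetical values for illustration only. The real service would load the
# ImageNet class names and obtain logits from the ResNet forward pass.
SUPPORTED_MODELS = ("resnet18", "resnet50")
LABELS = ["tabby cat", "golden retriever", "school bus"]


def resolve_model_name(name: str) -> str:
    """Normalize and validate the model name requested by the client."""
    normalized = name.strip().lower()
    if normalized not in SUPPORTED_MODELS:
        raise ValueError(f"unsupported model {name!r}; choose one of {SUPPORTED_MODELS}")
    return normalized


def top1_label(logits: list[float], labels: list[str]) -> tuple[str, float]:
    """Return the most likely label and its softmax probability."""
    if len(logits) != len(labels):
        raise ValueError("logits and labels must have the same length")
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    best = max(range(len(logits)), key=logits.__getitem__)
    return labels[best], exps[best] / total
```

Calling `top1_label([0.1, 2.5, -1.0], LABELS)` picks the second entry, since it has the largest logit.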

Tech Stack
- PyTorch (ResNet18 and ResNet50, pretrained on ImageNet)
- FastAPI for the web server
- Docker for containerization
- Kubernetes for orchestration and scaling
- Uvicorn for serving the API

Run It Yourself:

1. Clone the repo
```bash
git clone https://github.com/amiot99/gpu-inference-microservice.git
cd gpu-inference-microservice
```

2. Build the Docker image (requires [Docker](https://www.docker.com/))
```
docker build -t gpu-inference-app .
```
3. Run the container
```
docker run -p 8000:8000 gpu-inference-app
```
4. Use the API

Visit http://localhost:8000/docs, upload an image to `/predict`, and read the predicted label from the response.
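The same request can be sent from a script instead of the interactive docs page. The sketch below uses only the standard library. It assumes the upload form field is named `file` and that the endpoint answers with JSON; neither is confirmed by this README, so check the `/docs` page for the actual schema. `build_multipart` and `predict` are hypothetical helper names.

```python
import json
import os
import uuid
import urllib.request


def build_multipart(field: str, filename: str, payload: bytes,
                    content_type: str = "image/jpeg") -> tuple[bytes, str]:
    """Encode one file as multipart/form-data; return (body, Content-Type header)."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + payload + tail, f"multipart/form-data; boundary={boundary}"


def predict(image_path: str, url: str = "http://localhost:8000/predict",
            field: str = "file") -> dict:
    """POST an image to the service and return the decoded JSON response."""
    with open(image_path, "rb") as fh:
        body, header = build_multipart(field, os.path.basename(image_path), fh.read())
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": header}, method="POST")
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)

# Example (requires the container from step 3 to be running):
#   print(predict("dog.jpg"))
```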

Running on GPU (Optional)

To enable GPU acceleration with NVIDIA GPUs, you'll need:
- NVIDIA GPU drivers
- NVIDIA Container Toolkit
- Docker with GPU support

Once set up, use this command to run the container using your GPU:
```
docker run --gpus all -p 8000:8000 gpu-inference-app
```
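To check whether the GPU is actually visible, a short Python check can be run inside the container. This is a minimal sketch that assumes the common PyTorch pattern of falling back to the CPU when CUDA is unavailable; the service's own device-selection code is not shown in this README, and `pick_device` is a hypothetical helper.

```python
def pick_device() -> str:
    """Return "cuda" when PyTorch can see a GPU, otherwise "cpu"."""
    try:
        import torch  # imported lazily so the check also runs where PyTorch is absent
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


print(f"Inference device: {pick_device()}")
```

If this reports `cpu` in a container started with `--gpus all`, the drivers or the NVIDIA Container Toolkit are the first things to check.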
Deploying with Kubernetes (Optional)
If you want to deploy the application using Kubernetes:

1. Ensure Minikube is installed:
[Install Minikube](https://minikube.sigs.k8s.io/docs/start/?arch=%2Fwindows%2Fx86-64%2Fstable%2F.exe+download)
2. Start Minikube:
```
minikube start --driver=docker
```
3. Build the Docker image inside Minikube:
```
eval $(minikube docker-env)
docker build -t gpu-inference-app:latest .
```
4. Apply the Kubernetes Manifests:
```
kubectl apply -f k8s/
```
5. Verify the Deployments:
```
kubectl get pods
kubectl get svc
```
6. Access the Application:
```
minikube service gpu-inference-service
```
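Pods can take a while to become ready after the manifests are applied, so a script that calls the API may want to wait first. The sketch below polls a URL until it answers with HTTP 200. `wait_for_service` is a hypothetical helper, and the URL in the usage comment is a placeholder: the real address can be printed with `minikube service gpu-inference-service --url`.

```python
import time
import urllib.error
import urllib.request


def wait_for_service(url: str, timeout_s: float = 60.0, interval_s: float = 2.0) -> bool:
    """Poll `url` until it answers with HTTP 200 or `timeout_s` elapses."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=interval_s) as response:
                if response.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            pass  # pod not ready yet or service not reachable; retry
        time.sleep(interval_s)
    return False

# Example (placeholder address, substitute the one Minikube reports):
#   wait_for_service("http://<minikube-ip>:<node-port>/docs")
```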

Troubleshooting:

Issue: WSL2 disk space not reclaimed after deleting Docker containers/images

If you notice that the disk space in WSL2 is not being reclaimed after deleting Docker containers and images, follow
these steps to manually compact the disk:

1. Shut down Docker and WSL2:
```
wsl --shutdown
```
2. Locate the VHDX File:
Navigate to the WSL2 virtual disk file. The default path is:
```
C:\Users\{username}\AppData\Local\Docker\wsl\disk\docker_data.vhdx
```
3. Compact the Disk:
Open PowerShell as Administrator and run:
```
diskpart
```
In the diskpart prompt, enter the following commands:
```
select vdisk file="C:\Users\{username}\AppData\Local\Docker\wsl\disk\docker_data.vhdx"
attach vdisk readonly
compact vdisk
detach vdisk
exit
```
4. Restart Docker Desktop
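
To confirm that compaction reclaimed space, the size of the VHDX file can be compared before step 1 and after step 4. A small sketch, assuming the default path from step 2; `format_size` and `vhdx_size` are hypothetical helpers.

```python
import os


def format_size(num_bytes: int) -> str:
    """Render a byte count in GiB with two decimals."""
    return f"{num_bytes / 2**30:.2f} GiB"


def vhdx_size(path: str) -> str:
    """Return the on-disk size of the virtual disk file as a readable string."""
    return format_size(os.path.getsize(path))

# Example on Windows, where %LOCALAPPDATA% expands to C:\Users\{username}\AppData\Local:
#   path = os.path.expandvars(r"%LOCALAPPDATA%\Docker\wsl\disk\docker_data.vhdx")
#   print(vhdx_size(path))
```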
