https://github.com/amiot99/gpu-inference-microservice
FastAPI-based microservice for GPU-accelerated image classification using ResNet18 and Docker
docker fastapi gpu image-classification inference machine-learning microservice pytorch resnet
- Host: GitHub
- URL: https://github.com/amiot99/gpu-inference-microservice
- Owner: amiot99
- Created: 2025-04-30T09:08:39.000Z (about 1 year ago)
- Default Branch: main
- Last Pushed: 2025-05-08T00:03:39.000Z (about 1 year ago)
- Last Synced: 2025-05-12T08:07:02.750Z (about 1 year ago)
- Topics: docker, fastapi, gpu, image-classification, inference, machine-learning, microservice, pytorch, resnet
- Language: Python
- Homepage:
- Size: 34.2 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
GPU Inference Microservice with FastAPI + PyTorch
This project is a minimal, production-style microservice for GPU-accelerated AI/ML image classification.
It uses pre-trained models (ResNet18 and ResNet50) from PyTorch, wrapped in a FastAPI server, and containerized with Docker.
Additionally, it includes Kubernetes deployment configurations for easy scaling and orchestration.
What It Does
- Accepts image uploads via a `/predict` endpoint
- Runs inference using a GPU-enabled ResNet model (user can select either ResNet18 or ResNet50)
- Returns the predicted class label (e.g., “golden retriever”)
- Deployable via Docker and Kubernetes
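The last inference step, turning the model's raw scores into a class label, can be sketched in plain Python. This is only an illustration of the usual softmax-plus-argmax approach, not the repository's actual code; the three-entry `LABELS` list and the helper names `softmax` and `top_label` are hypothetical stand-ins (a real ImageNet model produces 1,000 scores).

```python
import math

# Hypothetical label table; the real service would use the 1,000 ImageNet class names.
LABELS = ["tabby cat", "golden retriever", "sports car"]


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_label(logits, labels=LABELS):
    """Return the most likely label together with its probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs[best]


label, confidence = top_label([0.3, 4.1, 1.2])
print(label, round(confidence, 3))
```

With PyTorch the same idea is typically expressed with `torch.softmax` and `argmax` on the output tensor.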
Tech Stack
- PyTorch (ResNet18 and ResNet50, pretrained on ImageNet)
- FastAPI for the web server
- Docker for containerization
- Kubernetes for orchestration and scaling
- Uvicorn for serving the API
Run It Yourself:
1. Clone the repo
```bash
git clone https://github.com/amiot99/gpu-inference-microservice.git
cd gpu-inference-microservice
```
2. Build the Docker image (requires [Docker](https://www.docker.com/))
```
docker build -t gpu-inference-app .
```
3. Run the container
```
docker run -p 8000:8000 gpu-inference-app
```
4. Use the API
Visit http://localhost:8000/docs, upload an image, and get the predicted label.
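For scripted use, the same endpoint can be called from Python. The sketch below uses only the standard library and builds the multipart upload by hand. It assumes the upload field is named `file` and that the service listens on `localhost:8000`; neither is confirmed by this README, so check the `/docs` page for the real field name. `build_multipart` and `predict` are hypothetical helper names.

```python
import urllib.request
import uuid


def build_multipart(field, filename, data, content_type="image/jpeg"):
    """Encode one file as multipart/form-data; returns (body, Content-Type header)."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + data + tail, f"multipart/form-data; boundary={boundary}"


def predict(image_bytes, filename="image.jpg", url="http://localhost:8000/predict"):
    """POST an image to the service and return the raw response text."""
    # Assumption: the upload field is called "file".
    body, content_type = build_multipart("file", filename, image_bytes)
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": content_type}, method="POST"
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return response.read().decode("utf-8")
```

With the container running, reading an image file in binary mode and passing its bytes to `predict` should return the response body that contains the predicted label.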
Running on GPU (Optional)
To enable GPU acceleration with NVIDIA GPUs, you'll need:
* NVIDIA GPU drivers
* NVIDIA Container Toolkit
* Docker with GPU support
Once set up, use this command to run the container using your GPU:
```
docker run --gpus all -p 8000:8000 gpu-inference-app
```
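Inside the container, a PyTorch service usually decides between GPU and CPU at startup. The following is a minimal sketch of that check, not the repository's actual code; `pick_device` is a hypothetical helper, and the import is wrapped so the snippet also runs on machines without PyTorch.

```python
try:
    import torch  # optional here so the sketch also runs where PyTorch is absent
except ImportError:
    torch = None


def pick_device():
    """Return "cuda" when PyTorch can see a GPU, otherwise "cpu"."""
    if torch is not None and torch.cuda.is_available():
        return "cuda"
    return "cpu"


print(f"Inference would run on: {pick_device()}")
```

When the container is started without `--gpus all`, no GPU is exposed to it, so a check like this falls back to `cpu`.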
Deploying with Kubernetes (Optional)
If you want to deploy the application using Kubernetes:
1. Ensure Minikube is installed:
[Install Minikube](https://minikube.sigs.k8s.io/docs/start/?arch=%2Fwindows%2Fx86-64%2Fstable%2F.exe+download)
2. Start Minikube:
```
minikube start --driver=docker
```
3. Build the Docker image inside Minikube's Docker daemon:
```
eval $(minikube docker-env)
docker build -t gpu-inference-app:latest .
```
4. Apply the Kubernetes Manifests:
```
kubectl apply -f k8s/
```
5. Verify the Deployments:
```
kubectl get pods
kubectl get svc
```
6. Access the Application:
```
minikube service gpu-inference-service
```
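Pods can take a while to pull the image and load the model, so a small readiness check helps before sending requests. The sketch below polls a URL until it answers with HTTP 200. `wait_until_ready` is a hypothetical helper, and the `/docs` path is assumed to be served as in the Docker setup above.

```python
import time
import urllib.error
import urllib.request


def wait_until_ready(url, timeout=60.0, interval=2.0):
    """Poll `url` until it answers with HTTP 200 or `timeout` seconds pass."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=interval) as response:
                if response.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            pass  # service not reachable yet; retry after a short pause
        time.sleep(interval)
    return False
```

Running `minikube service gpu-inference-service --url` prints the service's base URL; append `/docs` to it and pass the result to `wait_until_ready`.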
Troubleshooting:
Issue: WSL2 disk space not reclaimed after deleting Docker containers/images
If WSL2 does not reclaim disk space after you delete Docker containers and images, follow these steps to compact the virtual disk manually:
1. Shut down Docker and WSL2:
```
wsl --shutdown
```
2. Locate the VHDX File:
Find the virtual disk file that Docker Desktop uses for WSL2. The default path is:
```
C:\Users\{username}\AppData\Local\Docker\wsl\disk\docker_data.vhdx
```
3. Compact the Disk:
Open PowerShell as Administrator and run:
```
diskpart
```
In the diskpart prompt, enter the following commands:
```
select vdisk file="C:\Users\{username}\AppData\Local\Docker\wsl\disk\docker_data.vhdx"
attach vdisk readonly
compact vdisk
detach vdisk
exit
```
4. Restart Docker Desktop
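The diskpart commands above can also be generated as a script file, which avoids retyping the path. This is a small sketch: `diskpart_script` is a hypothetical helper and `alice` is a placeholder user name to replace with your own.

```python
from pathlib import PureWindowsPath


def diskpart_script(vhdx_path):
    """Return the diskpart commands that compact the given virtual disk."""
    return "\n".join([
        f'select vdisk file="{vhdx_path}"',
        "attach vdisk readonly",
        "compact vdisk",
        "detach vdisk",
        "exit",
    ]) + "\n"


# Hypothetical user name; substitute your own Windows account.
vhdx = PureWindowsPath(r"C:\Users\alice\AppData\Local\Docker\wsl\disk\docker_data.vhdx")
print(diskpart_script(vhdx))
```

Save the output to a file such as `compact.txt` and run `diskpart /s compact.txt` from an Administrator prompt after shutting down WSL2.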