https://github.com/persteenolsen/fastapi-jwt-auth-dl-three
Python FastAPI ML Inference Service with ONNX Runtime and PyTorch-Trained Model for House Price Prediction using Ames Dataset (v6)
deep-learning fastapi jwt onnx python
- Host: GitHub
- URL: https://github.com/persteenolsen/fastapi-jwt-auth-dl-three
- Owner: persteenolsen
- Created: 2026-04-28T14:51:07.000Z (about 2 months ago)
- Default Branch: main
- Last Pushed: 2026-04-28T16:41:25.000Z (about 2 months ago)
- Last Synced: 2026-04-28T17:06:06.654Z (about 2 months ago)
- Topics: deep-learning, fastapi, jwt, onnx, python
- Language: Python
- Homepage: https://fastapi-jwt-auth-dl-three.vercel.app/docs
- Size: 218 KB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# v6 - House Price Prediction API (FastAPI + PyTorch + JWT + Ames Dataset)
Last updated
- 05-05-2026
A production-style machine learning backend system that predicts house prices using a PyTorch neural network trained on the Ames Housing dataset and served through a secure FastAPI API with JWT authentication.
This project demonstrates a full ML engineering pipeline: data preprocessing → feature engineering → model training → ONNX export → secure API inference.
# Things I learned
- The Ames Housing dataset is messy, noisy tabular data, so training a neural network on it was an interesting experience
- My v7, which uses Linear Regression, performs "better" on the Ames Housing dataset, but both models have their pros and cons

I gained experience with PyTorch and compared the v6 PyTorch neural network with the v7 Linear Regression model. PyTorch would be a good choice for massive datasets and image classification.
For details, see the Model Tuning section.
---
# Features
- End-to-end ML pipeline using real-world Ames Housing dataset
- Feature engineering (HouseAge, HasGarage derived features)
- Robust feature alignment between training and inference
- Input normalization for stable neural network training
- PyTorch neural network regression model
- Log-transformed target (log1p) for stable regression learning
- ONNX export for fast production inference
- Secure REST API using FastAPI
- JWT authentication (OAuth2 password flow)
- Named-feature JSON input (no manual ordering required)
- Consistent preprocessing across training and inference
- Serverless-ready deployment design (Vercel compatible)
---
# Tech Stack
- Python 3.12
- FastAPI
- PyTorch
- ONNX Runtime
- NumPy
- Pandas
- scikit-learn
- python-jose (JWT auth)
- Uvicorn
- joblib
---
# Project Structure
```
.
├── main.py              # FastAPI inference API (JWT + ONNX)
├── train.py             # Model training + ONNX export
├── features.py          # Shared feature definitions + transforms
├── AmesHousing.csv      # Dataset
├── model.onnx           # Exported inference model
├── mean.npy             # Feature normalization mean
├── std.npy              # Feature normalization std
├── requirements.txt
└── .env                 # Environment variables (not committed)
```
---
# Installation
```bash
git clone https://github.com/persteenolsen/fastapi-jwt-auth-dl-three.git
cd fastapi-jwt-auth-dl-three
python -m venv .venv
source .venv/bin/activate # Windows: .venv\Scripts\activate
pip install -r requirements/dev.txt
```
Verify setup:
```bash
python -c "import fastapi, numpy, onnxruntime; print('VENV OK')"
```
Expected output:
```
VENV OK
```
---
# Environment Variables (.env)
```
SECRET_KEY=your_secret_key
ALGORITHM=HS256
ACCESS_TOKEN_EXPIRE_MINUTES=60
ADMIN_USERNAME=admin
ADMIN_PASSWORD=password
```
---
# Training the Model
```bash
python train.py
```
Outputs:
- model.onnx
- mean.npy
- std.npy
---
## Model Tuning
During development, the model was tuned to improve stability and realism of predictions.
Key tuning changes:
- Removed a hidden layer
- Reduced the hidden layer size (16 → 4 neurons)
- Lowered learning rate (0.01 → 0.005)
- Increased training epochs (500 → 1000)
- Added weight decay (L2 regularization)
- Introduced early stopping for training stability
- Improved numerical stability in normalization
Result:
- Smooth, monotonic price curves
- Stable age depreciation behavior
- More consistent size scaling
- Reduced high-range prediction jumps
- Better generalization without overfitting
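
The early stopping mentioned above can be sketched independently of any framework. This is a minimal illustration, not the logic in `train.py`: the class name `EarlyStopping`, the `patience` and `min_delta` values, and the loss curve are all assumptions made for the example.

```python
class EarlyStopping:
    """Signal a stop when validation loss has not improved for `patience` epochs."""

    def __init__(self, patience: int = 20, min_delta: float = 1e-4):
        self.patience = patience      # hypothetical default, not taken from train.py
        self.min_delta = min_delta    # smallest decrease that counts as an improvement
        self.best_loss = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss: float) -> bool:
        """Record one epoch's validation loss; return True when training should stop."""
        if val_loss < self.best_loss - self.min_delta:
            self.best_loss = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


# Made-up loss curve that plateaus at 0.5 from the third epoch onward.
stopper = EarlyStopping(patience=3)
losses = [1.0, 0.6, 0.5, 0.5, 0.5, 0.5, 0.5]
stopped_at = None
for epoch, loss in enumerate(losses):
    if stopper.step(loss):
        stopped_at = epoch  # index of the epoch at which training would halt
        break
```

In a PyTorch training loop, `step()` would typically be called once per epoch with the validation loss, and weight decay is usually supplied through the optimizer's `weight_decay` argument.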
---
# Training Pipeline
- Load Ames Housing dataset
- Feature engineering:
  - HouseAge (derived)
  - HasGarage (binary feature)
- Select final feature set
- Normalize features (mean/std scaling)
- Train PyTorch neural network
- Optimize regression using log1p target
- Export model to ONNX format
- Save normalization parameters for inference
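
The preprocessing steps in this pipeline can be sketched with NumPy alone. The snippet below is a hedged illustration: the README does not say how `HouseAge` and `HasGarage` are computed, so deriving them from a fixed `REFERENCE_YEAR` and from `Garage_Cars > 0` is an assumption, as are the function names, the epsilon value, and the sample rows.

```python
import numpy as np

REFERENCE_YEAR = 2010  # assumption: ages are measured against a fixed year
EPS = 1e-8             # assumption: small constant guarding against zero std


def add_derived_features(year_built: np.ndarray, garage_cars: np.ndarray):
    """Return (HouseAge, HasGarage) arrays derived from the raw columns."""
    house_age = (REFERENCE_YEAR - year_built).astype(np.float32)
    has_garage = (garage_cars > 0).astype(np.float32)
    return house_age, has_garage


def fit_normalizer(X: np.ndarray):
    """Compute the per-column mean and std on the training matrix."""
    return X.mean(axis=0), X.std(axis=0)


def normalize(X: np.ndarray, mean: np.ndarray, std: np.ndarray) -> np.ndarray:
    """Apply mean/std scaling; the same statistics must be reused at inference."""
    return (X - mean) / (std + EPS)


# Tiny made-up sample, not real Ames rows.
year_built = np.array([2005, 1960, 1999])
garage_cars = np.array([2, 0, 1])
prices = np.array([215000.0, 105000.0, 172000.0])

house_age, has_garage = add_derived_features(year_built, garage_cars)
X = np.column_stack([house_age, has_garage]).astype(np.float32)
mean, std = fit_normalizer(X)
X_norm = normalize(X, mean, std)
y = np.log1p(prices)  # log-transformed regression target

# The statistics would then be persisted for the API, for example:
#   np.save("mean.npy", mean); np.save("std.npy", std)
```

Training on `log1p(price)` compresses the long right tail of house prices, which is why inference has to apply `expm1` before returning a value.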
---
# Run API Locally
```bash
uvicorn main:app --reload
```
Swagger UI:
```
http://127.0.0.1:8000/docs
```
---
# Authentication
## Login
POST `/login`
```
username=admin
password=password
```
Response:
```json
{
"access_token": "jwt_token_here",
"token_type": "bearer"
}
```
## Use Token
```
Authorization: Bearer <access_token>
```
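
To show what an HS256 bearer token carries, here is a standard-library sketch of signing and verifying one. It is not the project's code: the tech stack lists python-jose for JWT handling, and the helpers `create_token` and `verify_token`, the `sub` claim, and the error messages are hypothetical names chosen for illustration.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def _b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWTs require."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    """Restore the stripped padding before decoding."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(signing_input: str, secret: str) -> bytes:
    return hmac.new(secret.encode(), signing_input.encode(), hashlib.sha256).digest()


def create_token(subject: str, secret: str, expire_minutes: int,
                 now: Optional[float] = None) -> str:
    """Build header.payload.signature with an `exp` claim in the payload."""
    now = time.time() if now is None else now
    header = {"alg": "HS256", "typ": "JWT"}
    payload = {"sub": subject, "exp": int(now + expire_minutes * 60)}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode())
        for part in (header, payload)
    )
    return f"{signing_input}.{_b64url(_sign(signing_input, secret))}"


def verify_token(token: str, secret: str, now: Optional[float] = None) -> dict:
    """Check algorithm, signature, and expiry; return the payload if all pass."""
    now = time.time() if now is None else now
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("malformed token")
    header_b64, payload_b64, signature_b64 = parts
    if json.loads(_b64url_decode(header_b64)).get("alg") != "HS256":
        raise ValueError("unexpected algorithm")
    expected = _sign(f"{header_b64}.{payload_b64}", secret)
    if not hmac.compare_digest(expected, _b64url_decode(signature_b64)):
        raise ValueError("bad signature")
    payload = json.loads(_b64url_decode(payload_b64))
    if payload["exp"] <= now:
        raise ValueError("token expired")
    return payload
```

The arguments map onto the `.env` values above: `secret` corresponds to `SECRET_KEY` and `expire_minutes` to `ACCESS_TOKEN_EXPIRE_MINUTES`. In a real service a maintained JWT library is preferable to hand-rolled signing.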
---
# Prediction Endpoint
POST `/predict`
## Example requests
With `"Gr_Liv_Area"` = 1500:
```json
{
"Gr_Liv_Area": 1500,
"Overall_Qual": 7,
"Year_Built": 2005,
"Garage_Cars": 2,
"Full_Bath": 2,
"Bedroom_AbvGr": 3,
"Lot_Area": 8000
}
```
## Response
```json
{
"predicted_price": 216979.625
}
```
With `"Gr_Liv_Area"` = 1200, the predicted price is lower, as expected:
```json
{
"Gr_Liv_Area": 1200,
"Overall_Qual": 7,
"Year_Built": 2005,
"Garage_Cars": 2,
"Full_Bath": 2,
"Bedroom_AbvGr": 3,
"Lot_Area": 8000
}
```
## Response
```json
{
"predicted_price": 197978.359375
}
```
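
A client for the two endpoints could look like the following standard-library sketch. It assumes the API is running locally on the Uvicorn address shown earlier and that `/login` accepts form-encoded credentials, as the OAuth2 password flow normally does. The helper names are hypothetical.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "http://127.0.0.1:8000"  # local Uvicorn address from this README


def build_login_request(username: str, password: str) -> urllib.request.Request:
    """POST /login with form-encoded credentials (assumed OAuth2 password flow)."""
    body = urllib.parse.urlencode({"username": username, "password": password}).encode()
    return urllib.request.Request(
        f"{BASE_URL}/login",
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


def build_predict_request(token: str, features: dict) -> urllib.request.Request:
    """POST /predict with the named features as JSON and the JWT as a bearer token."""
    return urllib.request.Request(
        f"{BASE_URL}/predict",
        data=json.dumps(features).encode(),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request: urllib.request.Request) -> dict:
    """Send a prepared request and decode the JSON response body."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())


def predict_price(username: str, password: str, features: dict) -> float:
    """Log in, then call /predict; requires the API to be running."""
    token = send(build_login_request(username, password))["access_token"]
    return send(build_predict_request(token, features))["predicted_price"]
```

With the server running, `predict_price("admin", "password", {...})` with one of the JSON bodies above would return the `predicted_price` field of the response.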
---
## Clamping Predicted Price
To avoid unrealistic house price predictions, the output is clamped to a maximum of **$755,000**. After inference, the prediction is converted back from log space, and any price above the cap is set to the cap.
Example:
1. The model predicts a log-scale price.
2. The prediction is converted to dollars using `np.expm1()`.
3. If the price exceeds $755,000, it is clamped to that value.
This keeps predictions within a realistic range.
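
The three steps above amount to a few lines of NumPy. This is a sketch under the assumption that the raw model output is a single `log1p`-scaled value; the names `MAX_PRICE` and `to_price` are illustrative.

```python
import numpy as np

MAX_PRICE = 755_000.0  # cap stated in this README


def to_price(log_prediction: float) -> float:
    """Invert the log1p target transform, then cap the result at MAX_PRICE."""
    price = float(np.expm1(log_prediction))
    return min(price, MAX_PRICE)


typical = to_price(np.log1p(200_000.0))    # stays close to 200,000
capped = to_price(np.log1p(1_000_000.0))   # above the cap, so clamped
```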
---
# Key Design Feature
### Named-feature system (safe ML design)
- No manual feature ordering in API
- Centralized feature definition (`features.py`)
- Automatic transformation + alignment
- Prevents silent prediction errors
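
One way to get this behaviour is to let a single ordered list drive how a named JSON payload becomes a model input row. The sketch below is illustrative only: `FEATURE_ORDER` reuses the field names from the example requests, but the real list and its order live in `features.py`, and `to_feature_vector` is a hypothetical helper.

```python
import numpy as np

# Assumption: illustrative order; the project's real definition is in features.py.
FEATURE_ORDER = [
    "Gr_Liv_Area",
    "Overall_Qual",
    "Year_Built",
    "Garage_Cars",
    "Full_Bath",
    "Bedroom_AbvGr",
    "Lot_Area",
]


def to_feature_vector(payload: dict) -> np.ndarray:
    """Return a (1, n_features) float32 row ordered by FEATURE_ORDER."""
    missing = [name for name in FEATURE_ORDER if name not in payload]
    if missing:
        # Failing loudly avoids silently feeding a misaligned vector to the model.
        raise ValueError(f"missing features: {missing}")
    row = [float(payload[name]) for name in FEATURE_ORDER]
    return np.array([row], dtype=np.float32)


# Key order in the request does not matter; the output order is always the same.
scrambled = {
    "Lot_Area": 8000,
    "Year_Built": 2005,
    "Gr_Liv_Area": 1500,
    "Full_Bath": 2,
    "Overall_Qual": 7,
    "Bedroom_AbvGr": 3,
    "Garage_Cars": 2,
}
vector = to_feature_vector(scrambled)
```

Because both training and inference would read the same list, adding or reordering a feature is a change in one place rather than two.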
---
# Training vs Inference Consistency
## Training:
- defines feature space in `features.py`
- applies normalization (mean/std)
- trains PyTorch model
- exports ONNX + normalization params
## Inference:
- receives named JSON input
- applies same feature transform
- applies identical normalization
- runs ONNX model
- returns real-world prediction
---
# System Flow
1. User logs in → receives JWT
2. Sends house feature JSON
3. API:
   - verifies JWT
   - builds feature vector
   - applies normalization
   - runs ONNX inference
   - returns predicted price
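
The normalization and inference steps of this flow might be sketched as below. The function only relies on the `get_inputs()` and `run()` methods that an ONNX Runtime `InferenceSession` provides, so a small stand-in session is used here to keep the example runnable without `model.onnx`. The output shape, the epsilon, and the name `run_inference` are assumptions, and price clamping is left out for brevity.

```python
import numpy as np


def run_inference(session, features: np.ndarray, mean: np.ndarray, std: np.ndarray) -> float:
    """Normalize one feature row, run the session, and return a dollar price."""
    x = ((features - mean) / (std + 1e-8)).astype(np.float32)
    input_name = session.get_inputs()[0].name
    outputs = session.run(None, {input_name: x})
    # Assumption: the model emits a single log1p-scaled value per row.
    log_price = float(np.asarray(outputs[0]).reshape(-1)[0])
    return float(np.expm1(log_price))


# With the real artifacts this would be wired up roughly as:
#   import onnxruntime as ort
#   session = ort.InferenceSession("model.onnx")
#   mean, std = np.load("mean.npy"), np.load("std.npy")


class _FakeInput:
    name = "input"


class FakeSession:
    """Stand-in exposing the same two methods, always predicting log1p(200000)."""

    def get_inputs(self):
        return [_FakeInput()]

    def run(self, output_names, feeds):
        x = feeds["input"]
        return [np.full((x.shape[0], 1), np.log1p(200_000.0), dtype=np.float32)]


price = run_inference(
    FakeSession(),
    np.array([[1500.0, 7.0]]),
    mean=np.array([1400.0, 6.0]),
    std=np.array([500.0, 1.5]),
)
```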
---
# What this project demonstrates
- Production-style ML system architecture
- Feature engineering for tabular regression
- Train/inference consistency design
- Secure API authentication (JWT)
- ONNX-based model deployment
- Serverless-compatible ML inference design
---
# Author
Built as a machine learning engineering project demonstrating production-style ML system design using FastAPI, PyTorch, and ONNX for real-world deployment scenarios.