https://github.com/persteenolsen/fastapi-jwt-auth-dl-four
Python FastAPI ML Inference Service with ONNX Runtime and PyTorch-Trained Model for House Price Prediction using Ames Dataset focusing on Tests (v8)
deep-learning fastapi jwt onnx python pytorch tests
- Host: GitHub
- URL: https://github.com/persteenolsen/fastapi-jwt-auth-dl-four
- Owner: persteenolsen
- Created: 2026-05-05T14:08:47.000Z (about 2 months ago)
- Default Branch: main
- Last Pushed: 2026-05-26T10:25:08.000Z (about 1 month ago)
- Last Synced: 2026-05-26T12:27:42.267Z (about 1 month ago)
- Topics: deep-learning, fastapi, jwt, onnx, python, pytorch, tests
- Language: Python
- Homepage: https://fastapi-jwt-auth-dl-four.vercel.app/docs
- Size: 222 KB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
# v8 - House Price Prediction API (FastAPI + PyTorch + JWT + Ames Dataset + Tests + Vue 3 SPA)
Last updated
- 09-06-2026
A production-style machine learning backend system that predicts house prices using a PyTorch neural network trained on the Ames Housing dataset and served through a secure FastAPI API with JWT authentication.
This project demonstrates a full ML engineering pipeline: data preprocessing → feature engineering → model training → tests → ONNX export → secure API inference.
# Vue 3 frontend for the Web API
- [`The Vue 3 SPA at GitHub`](https://github.com/persteenolsen/vue-fastapi-jwt-auth-dl-four) - The Vue 3 SPA using JWT Authentication
# Tests
This project includes four test scripts that validate model predictions and overall model quality.
Note: To run a test from the `tests` folder, invoke it as a module from the project root:
python -m tests.testone
### testone.py - Manual PyTorch vs ONNX Comparison
Compares the PyTorch model predictions with the ONNX exported model on a set of hand-picked inputs.
Gr_Liv_Area=900 -> PyTorch: $180,157, ONNX: $180,157, diff=0.167
Gr_Liv_Area=1000 -> PyTorch: $184,697, ONNX: $184,697, diff=0.178
Gr_Liv_Area=1100 -> PyTorch: $189,351, ONNX: $189,351, diff=0.181
Gr_Liv_Area=1200 -> PyTorch: $194,122, ONNX: $194,122, diff=0.186
### testtwo.py - Predictions for Custom Inputs
Tests specific example inputs to verify ONNX model alignment with PyTorch outputs.
Example 1: PyTorch=$184,697, ONNX=$184,697, diff=0.178
Example 2: PyTorch=$257,907, ONNX=$257,907, diff=0.004
### testthree.py - Train/Test Loss Check
Evaluates the model's loss on the train and test sets to confirm proper fitting.
Train Loss: 0.0471
Test Loss: 0.0531
✅ Model fit looks good
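A check of this kind can be sketched as a comparison of the two losses. The snippet below is a minimal illustration and not the code in `testthree.py`: the function name `fit_looks_good` and the 25% relative-gap threshold are assumptions chosen for the example.

```python
def fit_looks_good(train_loss: float, test_loss: float, max_gap: float = 0.25) -> bool:
    """Return True when test loss is within `max_gap` (relative) of train loss.

    The 25% default is a hypothetical threshold, not taken from the project.
    """
    if train_loss <= 0:
        raise ValueError("train_loss must be positive")
    relative_gap = (test_loss - train_loss) / train_loss
    return relative_gap <= max_gap


# Losses reported above: the test loss is roughly 13% higher than the train loss.
print(fit_looks_good(0.0471, 0.0531))
```

A test loss far above the train loss would point to overfitting, while two high but similar losses would point to underfitting, which this simple gap check does not detect.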
### testfour.py - Top Prediction Errors Analysis
Identifies the largest errors in the test set and calculates overall prediction accuracy.
Top 5 largest prediction errors:
Predicted=$405,438, Actual=$118,500, Error=$286,938
Predicted=$342,997, Actual=$124,000, Error=$218,997
Predicted=$231,987, Actual=$35,311, Error=$196,676
Predicted=$248,036, Actual=$415,000, Error=$166,964
Predicted=$281,822, Actual=$138,887, Error=$142,935
Average percentage error: 17.71%
⚠️ 66 houses have >30% prediction error
These tests ensure model consistency, alignment between PyTorch and ONNX predictions, training quality, and edge-case performance.
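The error metrics reported by `testfour.py` can be illustrated on the five listed houses. This is a sketch under one assumption: that "percentage error" means the absolute error divided by the actual price. The helper name `percentage_errors` is hypothetical, and the numbers it prints describe only these five worst cases, not the full test set.

```python
def percentage_errors(predicted, actual):
    """Absolute error of each prediction as a percentage of the actual price."""
    return [abs(p - a) / a * 100.0 for p, a in zip(predicted, actual)]


# The five (predicted, actual) pairs listed under testfour.py above.
predicted = [405_438, 342_997, 231_987, 248_036, 281_822]
actual = [118_500, 124_000, 35_311, 415_000, 138_887]

errors = percentage_errors(predicted, actual)
average_error = sum(errors) / len(errors)
large_errors = sum(1 for e in errors if e > 30.0)

print(f"Average percentage error (these five only): {average_error:.2f}%")
print(f"{large_errors} of {len(errors)} exceed 30% error")
```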
# Refactoring of train.py
- `train.py` was refactored into classes and modules so that the new test scripts can reuse them
# Features
- End-to-end ML pipeline using real-world Ames Housing dataset
- Feature engineering (HouseAge, HasGarage derived features)
- Robust feature alignment between training and inference
- Input normalization for stable neural network training
- PyTorch neural network regression model
- Log-transformed target (log1p) for stable regression learning
- ONNX export for fast production inference
- Secure REST API using FastAPI
- JWT authentication (OAuth2 password flow)
- Named-feature JSON input (no manual ordering required)
- Consistent preprocessing across training and inference
- Serverless-ready deployment design (Vercel compatible)
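The two derived features named in the list can be sketched as follows. The exact definitions live in the project's `features.py` and are not shown here, so `add_derived_features` and the `reference_year` default are assumptions made for illustration.

```python
def add_derived_features(house: dict, reference_year: int = 2010) -> dict:
    """Return a copy of `house` with HouseAge and HasGarage added.

    `reference_year` is an assumed value for illustration; the project may
    derive age differently (for example from the year of sale).
    """
    enriched = dict(house)
    enriched["HouseAge"] = max(reference_year - house["Year_Built"], 0)
    enriched["HasGarage"] = 1 if house["Garage_Cars"] > 0 else 0
    return enriched


example = add_derived_features({"Year_Built": 2005, "Garage_Cars": 2})
print(example)
```

Applying the same function during training and inference is one way to keep the derived features consistent between the two.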
# Tech Stack
- Python 3.12
- FastAPI
- PyTorch
- ONNX Runtime
- NumPy
- Pandas
- scikit-learn
- python-jose (JWT auth)
- Uvicorn
- joblib
# Project Structure
```
.
├── main.py             # FastAPI inference API (JWT + ONNX)
├── train.py            # Model training + ONNX export
├── features.py         # Shared feature definitions + transforms
├── AmesHousing.csv     # Dataset
├── model.onnx          # Exported inference model
├── mean.npy            # Feature normalization mean
├── std.npy             # Feature normalization std
├── requirements.txt
├── .env                # Environment variables (not committed)
└── tests               # Tests
```
# Installation
git clone https://github.com/persteenolsen/fastapi-jwt-auth-dl-four.git
cd fastapi-jwt-auth-dl-four
python -m venv .venv
source .venv/bin/activate # Windows: .venv\Scripts\activate
pip install -r requirements/dev.txt
Verify setup:
python -c "import fastapi, numpy, onnxruntime; print('VENV OK')"
Expected output:
VENV OK
# Environment Variables (.env)
SECRET_KEY=your_secret_key
ALGORITHM=HS256
ACCESS_TOKEN_EXPIRE_MINUTES=60
ADMIN_USERNAME=admin
ADMIN_PASSWORD=password
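One way to read these variables at startup is sketched below. It assumes the values are already present in the process environment (however the `.env` file gets loaded), and the `Settings` class and `load_settings` function are hypothetical names, not part of the project.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    secret_key: str
    algorithm: str
    access_token_expire_minutes: int
    admin_username: str
    admin_password: str


def load_settings(env=os.environ) -> Settings:
    """Build Settings from a mapping of environment variables.

    SECRET_KEY and the admin credentials have no default, so a missing value
    raises KeyError at startup instead of running with an insecure fallback.
    """
    return Settings(
        secret_key=env["SECRET_KEY"],
        algorithm=env.get("ALGORITHM", "HS256"),
        access_token_expire_minutes=int(env.get("ACCESS_TOKEN_EXPIRE_MINUTES", "60")),
        admin_username=env["ADMIN_USERNAME"],
        admin_password=env["ADMIN_PASSWORD"],
    )


# Example with the placeholder values shown above, passed as a plain dict.
settings = load_settings({
    "SECRET_KEY": "your_secret_key",
    "ADMIN_USERNAME": "admin",
    "ADMIN_PASSWORD": "password",
})
print(settings.algorithm, settings.access_token_expire_minutes)
```

The placeholder secret and password are only suitable for local experiments; a deployed instance would need strong, private values.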
# Training the Model
python train.py
Outputs:
- model.onnx
- mean.npy
- std.npy
# Model Tuning
- Removed a hidden layer
- Reduced the hidden layer size (16 → 4 neurons)
- Lowered learning rate (0.01 → 0.006)
- Increased training epochs (500 → 1000)
- Added weight decay (L2 regularization)
- Introduced early stopping for training stability
- Improved numerical stability in normalization
Result:
- Smooth, monotonic price curves
- Stable age depreciation behavior
- More consistent size scaling
- Reduced high-range prediction jumps
- Better generalization without overfitting
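Early stopping, one of the tuning changes listed above, can be sketched independently of any framework. The class below is a generic patience-based version; the name `EarlyStopping`, the `patience` and `min_delta` parameters, and the loss values are all illustrative assumptions rather than the project's actual settings.

```python
class EarlyStopping:
    """Signal a stop when validation loss has not improved for `patience` epochs."""

    def __init__(self, patience: int = 20, min_delta: float = 0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best_loss = float("inf")
        self.epochs_without_improvement = 0

    def should_stop(self, val_loss: float) -> bool:
        if val_loss < self.best_loss - self.min_delta:
            # New best: remember it and reset the counter.
            self.best_loss = val_loss
            self.epochs_without_improvement = 0
        else:
            self.epochs_without_improvement += 1
        return self.epochs_without_improvement >= self.patience


# Made-up validation losses, for illustration only.
losses = [0.9, 0.5, 0.3, 0.31, 0.32, 0.33, 0.2]
stopper = EarlyStopping(patience=3)
stopped_at = None
for epoch, loss in enumerate(losses):
    if stopper.should_stop(loss):
        stopped_at = epoch
        break
print(stopped_at)
```

In a real training loop the model weights from the best epoch would usually be saved and restored when the stop is triggered.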
# Training Pipeline
- Load Ames Housing dataset
- Feature engineering: HouseAge (derived), HasGarage (binary feature)
- Select final feature set
- Normalize features (mean/std scaling)
- Train PyTorch neural network
- Optimize regression using log1p target
- Export model to ONNX format
- Save normalization parameters for inference
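The normalization and log-target steps of the pipeline above can be sketched in plain Python. This is not the project's `train.py`: the helper names, the toy rows, and the small `eps` added to the standard deviation (one common way to avoid division by zero) are assumptions for illustration.

```python
import math


def fit_normalizer(rows, eps=1e-8):
    """Compute per-column mean and std; eps guards against a zero std."""
    n = len(rows)
    columns = list(zip(*rows))
    mean = [sum(col) / n for col in columns]
    std = [
        math.sqrt(sum((x - m) ** 2 for x in col) / n) + eps
        for col, m in zip(columns, mean)
    ]
    return mean, std


def normalize(row, mean, std):
    """Scale one feature row with previously fitted statistics."""
    return [(x - m) / s for x, m, s in zip(row, mean, std)]


# Toy rows of (Gr_Liv_Area, Overall_Qual); values are illustrative only.
rows = [[1000.0, 5.0], [1500.0, 7.0], [2000.0, 9.0]]
prices = [150_000.0, 210_000.0, 320_000.0]

mean, std = fit_normalizer(rows)
scaled = [normalize(r, mean, std) for r in rows]
log_targets = [math.log1p(p) for p in prices]      # target used for training
recovered = [math.expm1(t) for t in log_targets]   # back to dollars at inference
print(scaled[1], log_targets[0])
```

The fitted `mean` and `std` play the role of the saved `mean.npy` and `std.npy`: inference has to reuse the training statistics rather than recompute them from the request.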
# Run API Locally
uvicorn main:app --reload
Swagger UI:
http://127.0.0.1:8000/docs
# Authentication
POST `/login`
username=admin
password=password
Response:
{
"access_token": "jwt_token_here",
"token_type": "bearer"
}
Use Token:
Authorization: Bearer <access_token>
# Prediction Endpoint
POST `/predict`
Example:
{
"Gr_Liv_Area": 1500,
"Overall_Qual": 7,
"Year_Built": 2005,
"Garage_Cars": 2,
"Full_Bath": 2,
"Bedroom_AbvGr": 3,
"Lot_Area": 8000
}
Response:
{
"predicted_price": 209169.703125
}
Another example:
{
"Gr_Liv_Area": 1200,
"Overall_Qual": 7,
"Year_Built": 2005,
"Garage_Cars": 2,
"Full_Bath": 2,
"Bedroom_AbvGr": 3,
"Lot_Area": 8000
}
Response:
{
"predicted_price": 194122.25
}
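A client call sequence for the two endpoints can be sketched with the standard library. It assumes the server runs at the local address from the "Run API Locally" section and that `/login` accepts form-encoded credentials, as is usual for the OAuth2 password flow; the helper names are hypothetical. The functions only build the requests, so nothing is sent until they are passed to `urllib.request.urlopen`.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "http://127.0.0.1:8000"  # local development address (assumption)


def build_login_request(username: str, password: str) -> urllib.request.Request:
    """Credentials go in a form-encoded body, not JSON (assumed password flow)."""
    body = urllib.parse.urlencode({"username": username, "password": password}).encode()
    return urllib.request.Request(f"{BASE_URL}/login", data=body, method="POST")


def build_predict_request(token: str, features: dict) -> urllib.request.Request:
    """Named features go in a JSON body; the JWT goes in the Authorization header."""
    return urllib.request.Request(
        f"{BASE_URL}/predict",
        data=json.dumps(features).encode(),
        headers={"Authorization": f"Bearer {token}", "Content-Type": "application/json"},
        method="POST",
    )


login_request = build_login_request("admin", "password")
predict_request = build_predict_request(
    "jwt_token_here",
    {"Gr_Liv_Area": 1200, "Overall_Qual": 7, "Year_Built": 2005, "Garage_Cars": 2,
     "Full_Bath": 2, "Bedroom_AbvGr": 3, "Lot_Area": 8000},
)
print(predict_request.full_url, predict_request.get_header("Authorization"))
```

With the server running, `json.load(urllib.request.urlopen(login_request))["access_token"]` would supply the real token to use in place of `jwt_token_here`.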
# Clamping Predicted Price
- Maximum value: $755,000
- The prediction is converted back from log scale (the inverse of log1p), then any value above the cap is clamped to it
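The post-processing step can be sketched in a few lines. Because the model is trained on a log1p target, `math.expm1` is its mathematical inverse; the function name `to_price` is a hypothetical helper, and only the cap value comes from the text above.

```python
import math

MAX_PRICE = 755_000.0  # cap stated above


def to_price(log_prediction: float, cap: float = MAX_PRICE) -> float:
    """Invert the log1p target transform, then clamp to the cap."""
    return min(math.expm1(log_prediction), cap)


print(to_price(math.log1p(194_122.25)))   # within range, returned essentially unchanged
print(to_price(math.log1p(900_000.0)))    # above the cap, clamped to MAX_PRICE
```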
# Key Design Feature
- Named-feature system (safe ML design)
- No manual feature ordering in API
- Centralized feature definition (`features.py`)
- Automatic transformation + alignment
- Prevents silent prediction errors
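The named-feature idea can be sketched as a single ordered list that both training and inference share. The list below mirrors the fields of the `/predict` example and is an assumption: the real list in `features.py` may differ, for instance by including derived features. `build_feature_vector` is a hypothetical helper.

```python
FEATURE_ORDER = [
    "Gr_Liv_Area", "Overall_Qual", "Year_Built", "Garage_Cars",
    "Full_Bath", "Bedroom_AbvGr", "Lot_Area",
]


def build_feature_vector(payload: dict, feature_order=FEATURE_ORDER) -> list:
    """Order named inputs by the shared feature list, rejecting bad keys."""
    missing = [name for name in feature_order if name not in payload]
    unknown = [name for name in payload if name not in feature_order]
    if missing or unknown:
        raise ValueError(f"missing={missing}, unknown={unknown}")
    return [float(payload[name]) for name in feature_order]


# Key order in the request does not matter; the vector order is always the same.
request_body = {
    "Lot_Area": 8000, "Gr_Liv_Area": 1500, "Year_Built": 2005, "Overall_Qual": 7,
    "Bedroom_AbvGr": 3, "Garage_Cars": 2, "Full_Bath": 2,
}
vector = build_feature_vector(request_body)
print(vector)
```

Failing on missing or unknown names, instead of guessing, is what turns a silent prediction error into a visible request error.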
# Training vs Inference Consistency
- Training: defines feature space, applies normalization, trains PyTorch, exports ONNX + normalization params
- Inference: receives JSON, applies same transforms, normalization, runs ONNX, returns prediction
# System Flow
1. User logs in → receives JWT
2. Sends house feature JSON
3. API verifies JWT, builds feature vector, applies normalization, runs ONNX inference, returns predicted price
# What this project demonstrates
- Production-style ML system architecture
- Feature engineering for tabular regression
- Train/inference consistency design
- Secure API authentication (JWT)
- ONNX-based model deployment
- Serverless-compatible ML inference design
# Author
Built as a machine learning engineering project demonstrating production-style ML system design using FastAPI, PyTorch, and ONNX for real-world deployment scenarios.