https://github.com/gana36/flight-price-estimation
Production-ready flight price prediction using ensemble models (RF + XGBoost + LightGBM) with AWS ECS deployment
https://github.com/gana36/flight-price-estimation
aws-ecs docker dvc ensemble-learning fastapi grafana mlflow mlops prometheus
Last synced: 3 months ago
JSON representation
Production-ready flight price prediction using ensemble models (RF + XGBoost + LightGBM) with AWS ECS deployment
- Host: GitHub
- URL: https://github.com/gana36/flight-price-estimation
- Owner: gana36
- Created: 2025-11-18T02:54:07.000Z (7 months ago)
- Default Branch: main
- Last Pushed: 2025-12-12T04:03:19.000Z (7 months ago)
- Last Synced: 2025-12-13T11:47:12.688Z (6 months ago)
- Topics: aws-ecs, docker, dvc, ensemble-learning, fastapi, grafana, mlflow, mlops, prometheus
- Language: Python
- Homepage:
- Size: 74.2 KB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
README
# Flight Price Prediction - MLOps Project
Production-ready MLOps pipeline for flight price prediction using ensemble models, with complete AWS cloud deployment capability.
## Model Performance
| Metric | Value |
|--------|-------|
| R² Score | **0.9838** |
| MAE | 1,559 INR |
| RMSE | 2,891 INR |
| MAPE | 12.07% |
## Tech Stack









- **ML**: Random Forest + XGBoost + LightGBM (Ensemble)
- **Pipeline**: DVC for data versioning, MLflow for experiment tracking
- **API**: FastAPI with Prometheus metrics
- **Infrastructure**: Docker, AWS ECS Fargate, ECR
- **Monitoring**: Grafana, Prometheus, PostgreSQL logging
## Quick Start
### 1. Setup Environment
```bash
# Clone and setup
git clone
cd FlightPricePrediction
# Create virtual environment
python -m venv venv
venv\Scripts\activate # Windows
# source venv/bin/activate # Linux/Mac
# Install dependencies
pip install -r requirements.txt
# Configure environment
cp .env.example .env
```
### 2. Start Local Services
```bash
cd infra
docker-compose up -d --build
```
This starts:
- **PostgreSQL**: localhost:5432
- **MLflow**: http://localhost:5000
- **Prometheus**: http://localhost:9090
- **Grafana**: http://localhost:3000 (admin/admin)
### 3. Train Model
```bash
# Using DVC pipeline (recommended)
dvc repro
# Or manually
python -m src.ml.data # Prepare data
python -m src.ml.train # Train model
python -m src.ml.evaluate # Evaluate
```
### 4. Promote to Production
```bash
python scripts/promote_model.py --version 1 --alias production
```
### 5. Test API
```bash
# Health check
curl http://localhost:8000/health
# Make prediction
curl -X POST http://localhost:8000/predict \
-H "Content-Type: application/json" \
-d '{
"airline": "Vistara",
"flight": "UK-123",
"source_city": "Delhi",
"departure_time": "Morning",
"stops": "zero",
"arrival_time": "Afternoon",
"destination_city": "Mumbai",
"class": "Economy",
"duration": 2.5,
"days_left": 15
}'
```
**API Documentation**: http://localhost:8000/docs
## AWS Deployment
### Prerequisites
- AWS CLI configured (`aws configure`)
- Docker installed
### Deploy to ECS Fargate
```bash
# 1. Create AWS resources
aws ecr create-repository --repository-name flightprice-app --region us-east-1
aws ecs create-cluster --cluster-name flightprice-cluster --region us-east-1
aws s3 mb s3://flightprice-mlops- --region us-east-1
# 2. Build and push Docker image
aws ecr get-login-password --region us-east-1 | docker login --username AWS --password-stdin .dkr.ecr.us-east-1.amazonaws.com
docker build -t flightprice-app:latest -f docker/Dockerfile.app .
docker tag flightprice-app:latest .dkr.ecr.us-east-1.amazonaws.com/flightprice-app:latest
docker push .dkr.ecr.us-east-1.amazonaws.com/flightprice-app:latest
# 3. Create IAM role for ECS
aws iam create-role --role-name ecsTaskExecutionRole --assume-role-policy-document '{"Version":"2012-10-17","Statement":[{"Effect":"Allow","Principal":{"Service":"ecs-tasks.amazonaws.com"},"Action":"sts:AssumeRole"}]}'
aws iam attach-role-policy --role-name ecsTaskExecutionRole --policy-arn arn:aws:iam::aws:policy/service-role/AmazonECSTaskExecutionRolePolicy
# 4. Register task definition and create service
aws ecs register-task-definition --cli-input-json file://infra/aws/ecs-task-definition-app-simple.json --region us-east-1
aws ecs create-service --cluster flightprice-cluster --service-name flightprice-app-service --task-definition flightprice-app-task --desired-count 1 --launch-type FARGATE --network-configuration "awsvpcConfiguration={subnets=[],securityGroups=[],assignPublicIp=ENABLED}" --region us-east-1
```
### Cleanup AWS Resources
```bash
aws ecs update-service --cluster flightprice-cluster --service flightprice-app-service --desired-count 0 --region us-east-1
aws ecs delete-service --cluster flightprice-cluster --service flightprice-app-service --region us-east-1
aws ecs delete-cluster --cluster flightprice-cluster --region us-east-1
aws ecr delete-repository --repository-name flightprice-app --force --region us-east-1
```
## Project Structure
```
FlightPricePrediction/
├── configs/ # YAML configurations
│ ├── base.yaml # Data & evaluation settings
│ └── training.yaml # Model hyperparameters
├── docker/ # Dockerfiles
├── infra/ # Infrastructure
│ ├── docker-compose.yaml # Local development
│ └── aws/ # ECS task definitions
├── src/
│ ├── app/api.py # FastAPI endpoints
│ ├── ml/
│ │ ├── data.py # Data preprocessing
│ │ ├── models.py # EnsembleModel class
│ │ ├── train.py # Training pipeline
│ │ └── evaluate.py # Model evaluation
│ ├── database/models.py # SQLAlchemy ORM
│ └── monitoring/ # Drift detection
├── scripts/ # Deployment scripts
├── tests/ # Pytest suite
├── dvc.yaml # DVC pipeline
└── requirements.txt
```
## API Endpoints
| Endpoint | Method | Description |
|----------|--------|-------------|
| `/health` | GET | Health check |
| `/predict` | POST | Make prediction |
| `/model_info` | GET | Current model info |
| `/reload` | POST | Hot-reload model |
| `/metrics` | GET | Prometheus metrics |
| `/docs` | GET | Swagger UI |
## Configuration
### Model Weights (configs/training.yaml)
```yaml
ensemble:
weights:
random_forest: 0.35
xgboost: 0.40
lightgbm: 0.25
```
### Evaluation Thresholds (configs/base.yaml)
```yaml
evaluation:
thresholds:
min_r2: 0.75
max_rmse: 5000
max_mape: 0.15
```
## Monitoring
### Prometheus Metrics
- `app_request_count` - Request counter by endpoint
- `app_prediction_count` - Total predictions
- `app_prediction_value` - Price distribution
- `app_prediction_latency_ms` - Inference latency
### Data Drift Detection
```bash
python -m src.monitoring.drift_detection --hours 24
```
Generates HTML reports in `reports/` directory.
## Testing
```bash
# Run all tests
pytest -v
# With coverage
pytest --cov=src --cov-report=html
```
## CI/CD
GitHub Actions workflow (`.github/workflows/ci-cd.yaml`):
1. **Test** - Run pytest suite
2. **Lint** - Code quality (black, ruff)
3. **Build** - Docker image to ECR
4. **Deploy** - Update ECS service
## Troubleshooting
**Model not loading**: Check MLflow is running at http://localhost:5000
**Database errors**: Verify PostgreSQL container is healthy with `docker ps`
**API errors**: Check logs with `docker logs flightprice-app`
## License
MIT License