An open API service indexing awesome lists of open source software.

https://github.com/srujayreddy/fetch-receipt-prediction

A machine learning solution for predicting monthly receipt scans in 2022 based on 2021 daily data.
https://github.com/srujayreddy/fetch-receipt-prediction

docker-container machine-learning-algorithms neural-networks python tensorflow

Last synced: 3 months ago
JSON representation

A machine learning solution for predicting monthly receipt scans in 2022 based on 2021 daily data.

Awesome Lists containing this project

README

          

# Fetch Receipt Prediction System

![Dashboard Screenshot](/app/static/dashboard-preview.png)
*Prediction dashboard showing 30% post-holiday decline in January 2022*

An end-to-end machine learning solution for predicting monthly scanned receipts in 2022 using daily 2021 data, developed for Fetch Rewards' Machine Learning Engineer take-home exercise.

## ๐Ÿš€ Quick Start (Docker)

```bash
# 1. Clone repository
git clone https://github.com/SrujayReddy/Fetch-Receipt-Prediction.git
cd Fetch-Receipt-Prediction

# 2. Build and run container
docker build -t fetch-app . && docker run -p 5001:5001 fetch-app

# 3. Access dashboard
open http://localhost:5001
```

## ๐Ÿ“‹ Project Overview

### Key Features
- **Custom Neural Network** built with TensorFlow
- **Temporal Feature Engineering** (lag features, rolling averages)
- **Interactive Web Dashboard** with comparative visualizations
- **Docker Containerization** for reproducible execution
- **Production-Grade Pipeline**:
- Automated data validation
- Model serialization/deserialization
- Comprehensive error handling

### Technical Highlights
- **Validation MAE**: 0.2607 (normalized units)
- **Training Time**: <2 minutes on CPU
- **Prediction Accuracy**: ยฑ5% of daily averages (observed during validation)
- **Monthly Variance**: <2% from expected business patterns
- **Data Normalization**: Z-score scaling with ยต=8,923,441, ฯƒ=287,654
- **Training Coverage**: 364/365 days of 2021 data

## ๐Ÿ“‚ Repository Structure

```
.
โ”œโ”€โ”€ data/ # Input data
โ”‚ โ””โ”€โ”€ daily_receipts.csv # 2021 daily receipt counts
โ”œโ”€โ”€ model/ # Machine learning components
โ”‚ โ”œโ”€โ”€ model_utils.py # Core ML logic
โ”‚ โ”œโ”€โ”€ train.py # Training pipeline
โ”‚ โ”œโ”€โ”€ predict.py # Prediction script
โ”‚ โ””โ”€โ”€ *.npy # Normalization parameters
โ”œโ”€โ”€ app/ # Web application
โ”‚ โ”œโ”€โ”€ app.py # Flask server
โ”‚ โ”œโ”€โ”€ static/ # CSS/JS assets
โ”‚ โ””โ”€โ”€ templates/ # HTML templates
โ”œโ”€โ”€ Dockerfile # Container configuration
โ”œโ”€โ”€ requirements.txt # Python dependencies
โ””โ”€โ”€ README.md # This documentation
```

## ๐Ÿง  Machine Learning Implementation

### Model Architecture
```python
tf.keras.Sequential([
tf.keras.layers.Dense(256, activation='relu', kernel_initializer='he_normal'),
tf.keras.layers.BatchNormalization(),
tf.keras.layers.Dropout(0.3),
tf.keras.layers.Dense(128, activation='relu',
kernel_regularizer=tf.keras.regularizers.l1_l2(0.01)),
tf.keras.layers.Dense(64, activation='relu'),
tf.keras.layers.Dense(1)
])
```

### Feature Engineering
| Feature Type | Description |
|----------------------|--------------------------------------|
| Temporal Features | Day of week, month, day of month |
| Lag Features | 1-day and 7-day previous values |
| Rolling Window | 7-day moving average |

## ๐Ÿ› ๏ธ Installation & Usage

### Docker Deployment (Recommended)
```bash
docker build -t fetch-app . # Build image
docker run -p 5001:5001 fetch-app # Start container
```

### Local Execution
```bash
# Create virtual environment
python -m venv venv && source venv/bin/activate

# Install dependencies
pip install -r requirements.txt

# Train model and generate predictions
cd model && python train.py && python predict.py

# Start web server
cd ../app && python app.py
```

## ๐Ÿ” Verification

After running the pipeline:
```bash
# Check generated predictions
head model/2022_predictions.csv

# Expected output:
2022-01-01,8949847.75
2022-01-02,8954872.98
...
```

## ๐Ÿ“Š Web Interface

Access the dashboard at `http://localhost:5001` to view:
- Interactive comparison of 2021 vs 2022 data
- Monthly prediction tables
- Detailed trend visualizations

## ๐Ÿšจ Troubleshooting

| Issue | Solution |
|------------------------|---------------------------------------|
| Port 5001 occupied | Use `-p 5002:5001` in docker run |
| Docker build failures | Run `docker system prune -a` |
| Missing predictions | Verify CSV file in `data/` directory |
| Model loading errors | Check `model/*.npy` files exist |

## โ“ FAQ

**Q: How do I modify the prediction period?**
A: Edit `start_date` and `end_date` in `model/predict.py`

**Q: Where are the model parameters stored?**
A: `model/receipt_model.h5` (model weights) and `*.npy` (normalization)

**Q: How is monthly aggregation calculated?**
A: Simple sum of daily predictions for each month

**Q: Why do predictions decrease through January?**
A: Model detected post-holiday pattern from 2021 training data

## ๐Ÿ“„ Documentation

| Component | File Path |
|------------------------|----------------------------|
| Core Model Logic | [model/model_utils.py](model/model_utils.py) |
| Training Pipeline | [model/train.py](model/train.py) |
| Web Interface | [app/app.py](app/app.py) |

---

*Developed with โค๏ธ for Fetch Rewards Machine Learning Engineer Position*
`