https://github.com/troutlytics/troutlytics-backend

Backend support to provide updated information about trout stocking in Washington state. This repository contains all the essential backend components of the project, including database management, Web API, and a web scraper.

Host: GitHub
URL: https://github.com/troutlytics/troutlytics-backend
Owner: troutlytics
License: apache-2.0
Created: 2022-05-23T16:11:36.000Z (over 3 years ago)
Default Branch: main
Last Pushed: 2025-05-29T16:26:34.000Z (8 months ago)
Last Synced: 2025-06-20T21:46:55.112Z (7 months ago)
Topics: contributions-welcome, contributors, data-visualization, docker, fastapi, fish, fishing, folium, lakes, maps, python, sqlalchemy, statistics, trout, washington, wdfw, webscraper, webscraping, website
Language: Python
Homepage:
Size: 7.63 MB
Stars: 0
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- Contributing: CONTRIBUTING.md
- License: LICENSE
- Code of conduct: CODE_OF_CONDUCT.md

# 🐟 Troutlytics Backend

[![Python application](https://github.com/troutlytics/troutlytics-backend/actions/workflows/python-app.yml/badge.svg)](https://github.com/troutlytics/troutlytics-backend/actions/workflows/python-app.yml)

## Description

**Troutlytics** is a data-driven Python application that scrapes and stores trout stocking data for Washington State lakes. It runs on a scheduled AWS Fargate task and stores results in an Aurora PostgreSQL database for use in dashboards, maps, and analysis tools.
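The scraping step can be illustrated with a minimal, standard-library sketch that turns an HTML table into a list of records. The column names and sample markup below are hypothetical, and the real `web_scraper/scraper.py` may use different libraries and a different page layout.

```python
from html.parser import HTMLParser


class TableRowParser(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr>."""

    def __init__(self):
        super().__init__()
        self.rows = []      # completed rows, each a list of cell strings
        self._row = None    # cells of the row being read, or None outside <tr>
        self._cell = None   # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that appears inside a table cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def parse_stocking_table(html):
    """Return one dict per data row, keyed by the header row's labels."""
    parser = TableRowParser()
    parser.feed(html)
    if not parser.rows:
        return []
    header, *body = parser.rows
    return [dict(zip(header, row)) for row in body]


# Hypothetical sample markup; the real page layout may differ.
SAMPLE_HTML = """
<table>
  <tr><th>Lake</th><th>Date</th><th>Species</th><th>Number</th></tr>
  <tr><td>Example Lake</td><td>2025-05-01</td><td>Rainbow</td><td>1,500</td></tr>
</table>
"""

records = parse_stocking_table(SAMPLE_HTML)
```

Each record keeps the cell text as strings. Converting counts and dates to typed values would be a separate step before storage.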

---

## 📦 Project Structure

```bash
.
├── api/                              # 🎯 Main application API
│   ├── __init__.py                   # API package initializer
│   ├── index.py                      # API entrypoint (routes/controllers)
│   ├── requirements.txt              # API dependencies
│   ├── dockerfiles/
│   │   ├── dev/Dockerfile            # Dev Dockerfile
│   │   └── prod/                     # Production Dockerfile (Lambda-ready)
│   │       ├── Dockerfile
│   │       └── lambda_entry_script.sh
│   └── README.md                     # API-specific usage docs
│
├── web_scraper/                      # 🕸️ Web scraping service
│   ├── __init__.py
│   ├── scraper.py                    # Main script for collecting trout/creel data
│   ├── Dockerfile                    # Docker setup for scraper
│   ├── Makefile                      # Shortcuts for common dev tasks
│   ├── requirements.txt
│   ├── README.md
│   └── tests/                        # 🔬 Pytest-based tests
│       ├── __init__.py
│       └── test_scraper.py
│
├── data/                             # 🗃️ Database models and storage
│   ├── __init__.py
│   ├── database.py                   # SQLAlchemy engine and session config
│   ├── models.py                     # ORM models for tables
│   ├── backup_data.sql               # SQL dump for backup or restore
│   ├── backup_data.txt               # Raw text backup
│   └── sqlite.db                     # Local development database
│
├── aws_config/                       # ☁️ AWS deployment and secrets setup
│   ├── configure-aws-credentials-latest.yml  # GitHub Actions for AWS login
│   └── fargate-rds-secrets.yaml      # Fargate setup with RDS and Secrets Manager
└── README.md                         # You are here 📘
```
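To give a feel for the local development database, here is a small sketch that stores and aggregates stocking reports in SQLite. It uses the standard-library `sqlite3` module rather than SQLAlchemy so that it runs without extra dependencies. The table name and columns are hypothetical; the actual schema lives in `data/models.py`.

```python
import sqlite3

# Hypothetical schema; the real tables are defined in data/models.py.
SCHEMA = """
CREATE TABLE IF NOT EXISTS stocking_report (
    id INTEGER PRIMARY KEY,
    lake TEXT NOT NULL,
    stocked_on TEXT NOT NULL,   -- ISO 8601 date
    species TEXT NOT NULL,
    fish_count INTEGER NOT NULL
)
"""


def open_database(path=":memory:"):
    """Open (or create) a SQLite database and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def add_reports(conn, reports):
    """Insert (lake, stocked_on, species, fish_count) tuples in one transaction."""
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT INTO stocking_report (lake, stocked_on, species, fish_count) "
            "VALUES (?, ?, ?, ?)",
            reports,
        )


def total_by_lake(conn):
    """Return (lake, total fish stocked) pairs, sorted by lake name."""
    cursor = conn.execute(
        "SELECT lake, SUM(fish_count) FROM stocking_report "
        "GROUP BY lake ORDER BY lake"
    )
    return cursor.fetchall()


conn = open_database()  # in-memory database for the example
add_reports(conn, [
    ("Example Lake", "2025-05-01", "Rainbow", 1500),
    ("Example Lake", "2025-05-15", "Rainbow", 500),
    ("Sample Pond", "2025-05-02", "Cutthroat", 300),
])
totals = total_by_lake(conn)
```

Passing a file path such as `data/sqlite.db` to `open_database` instead of the default would persist the rows between runs.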

---

## 🚀 Deployment Overview

AWS Infrastructure:

- Fargate runs the scraper every 24 hours via EventBridge.
- Secrets Manager securely stores DB credentials.
- Aurora PostgreSQL stores structured stocking data.
- CloudWatch Logs tracks runtime output for visibility.
- API Gateway and Lambda host the API.
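As an illustration of how the scraper or API might turn a Secrets Manager secret into a PostgreSQL connection URL, here is a minimal sketch. It assumes the secret is a JSON document with `username`, `password`, `host`, `port`, and `dbname` keys, which is the layout RDS-generated secrets commonly use. The key names, the helper `build_database_url`, and the example values are assumptions, not taken from this repository.

```python
import json
from urllib.parse import quote_plus


def build_database_url(secret_string):
    """Build a PostgreSQL URL from a JSON secret string.

    Assumes keys: username, password, host, dbname, and optionally port.
    """
    secret = json.loads(secret_string)
    # Percent-encode credentials so characters like '@' or '/' cannot break the URL.
    user = quote_plus(secret["username"])
    password = quote_plus(secret["password"])
    host = secret["host"]
    port = secret.get("port", 5432)  # default PostgreSQL port
    dbname = secret["dbname"]
    return f"postgresql://{user}:{password}@{host}:{port}/{dbname}"


# Hypothetical secret contents, for illustration only.
example_secret = json.dumps({
    "username": "troutlytics",
    "password": "p@ss/word",
    "host": "db.example.internal",
    "port": 5432,
    "dbname": "stocking",
})
url = build_database_url(example_secret)
```

In a deployed task, the secret string would typically come from the `SecretString` field returned by the boto3 Secrets Manager client's `get_secret_value` call, and the resulting URL could be passed to SQLAlchemy's `create_engine`.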

GitHub → ECR Workflow:

- Automatically builds and pushes the Docker image whenever `main` is updated.
- Uses a GitHub Actions OIDC role to push to ECR securely.

---

## 📋 Prerequisites

- An AWS account with appropriate permissions
- AWS CLI installed and configured with credentials for that account
- Docker installed (for local and prod builds)
- Python 3.11+

---

## 🧪 Run Locally

## 🚀 Docker Compose Commands Cheat Sheet

Run everything from the root of the repository.

---

## 🛠️ Cloud Setup

Deploy the CloudFormation Stack:

```bash
# Replace the account ID, VPC, subnet, and security group values with your own.
aws cloudformation deploy \
  --template-file aws_config/fargate-rds-secrets.yaml \
  --stack-name troutlytics-stack \
  --capabilities CAPABILITY_NAMED_IAM \
  --parameter-overrides \
    ECRImageUriScraper=123456789012.dkr.ecr.us-west-2.amazonaws.com/scraper:latest \
    ECRImageUriAPI=123456789012.dkr.ecr.us-west-2.amazonaws.com/api:latest \
    VpcId=vpc-xxxxxxxx \
    SubnetIds=subnet-aaaa,subnet-bbbb \
    SecurityGroupId=sg-xxxxxxxx
```
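For scripted deployments, the same command can be assembled in Python. The sketch below only builds and prints the argument list and does not call AWS. The helper `build_deploy_command` is hypothetical, and the template path and parameter values are placeholders to replace with your own.

```python
import shlex


def build_deploy_command(template_file, stack_name, parameters):
    """Return the argv list for `aws cloudformation deploy`."""
    command = [
        "aws", "cloudformation", "deploy",
        "--template-file", template_file,
        "--stack-name", stack_name,
        "--capabilities", "CAPABILITY_NAMED_IAM",
    ]
    if parameters:
        # Each override is passed as a single Key=Value argument.
        command.append("--parameter-overrides")
        command.extend(f"{key}={value}" for key, value in parameters.items())
    return command


command = build_deploy_command(
    template_file="path/to/template.yaml",  # placeholder path
    stack_name="troutlytics-stack",
    parameters={
        "VpcId": "vpc-xxxxxxxx",
        "SubnetIds": "subnet-aaaa,subnet-bbbb",
        "SecurityGroupId": "sg-xxxxxxxx",
    },
)
print(shlex.join(command))
```

To actually run the deployment, the list could be passed to `subprocess.run(command, check=True)` on a machine where the AWS CLI is installed and configured.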

---

## 🔐 GitHub → ECR Deploy (CI/CD)

To enable GitHub Actions auto-deploy:

1. Deploy the `github_oidc_ecr_access.yaml` CloudFormation template.
2. Add the IAM Role ARN from the stack outputs to your GitHub Actions secrets or workflows.
3. Push to `main`: the image is built and published to ECR automatically.

---

## 📈 Roadmap Ideas

- Add support for weather/streamflow overlays
- Enable historical trend analysis by lake
- Integrate public stocking alerts
- Expand scraper coverage to other regions or species

---

## 🧠 Credits

Created by @thomas-basham — U.S. Army veteran, full-stack developer, and passionate angler 🎣

---

## License

MIT
