https://github.com/troutlytics/troutlytics-backend
Backend support to provide updated information about trout stocking in Washington state. This repository contains all the essential backend components of the project, including database management, Web API, and a web scraper.
- Host: GitHub
- URL: https://github.com/troutlytics/troutlytics-backend
- Owner: troutlytics
- License: apache-2.0
- Created: 2022-05-23T16:11:36.000Z (over 3 years ago)
- Default Branch: main
- Last Pushed: 2025-05-29T16:26:34.000Z (7 months ago)
- Last Synced: 2025-06-20T21:46:55.112Z (7 months ago)
- Topics: contributions-welcome, contributors, data-visualization, docker, fastapi, fish, fishing, folium, lakes, maps, python, sqlalchemy, statistics, trout, washington, wdfw, webscraper, webscraping, website
- Language: Python
- Homepage:
- Size: 7.63 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- Contributing: CONTRIBUTING.md
- License: LICENSE
- Code of conduct: CODE_OF_CONDUCT.md
README
# Troutlytics Backend
[CI status](https://github.com/troutlytics/troutlytics-backend/actions/workflows/python-app.yml)
## Description
**Troutlytics** is a data-driven Python application that scrapes and stores trout stocking data for Washington State lakes. It runs on a scheduled AWS Fargate task and stores results in an Aurora PostgreSQL database for use in dashboards, maps, and analysis tools.
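To make the idea concrete, here is a minimal sketch of the scrape-and-parse flow. It is not the project's actual `scraper.py`; it assumes the WDFW trout plant report is a plain HTML table, and the URL, column order, and field names are illustrative guesses only.
```python
# Minimal sketch of the scraping idea (illustrative, not the real scraper.py).
# The URL, column order, and field names below are assumptions.
import requests
from bs4 import BeautifulSoup

WDFW_STOCKING_URL = "https://wdfw.wa.gov/fishing/reports/stocking/trout-plants"  # assumed report page


def fetch_stocking_rows(url: str = WDFW_STOCKING_URL) -> list[dict]:
    """Download the stocking report and return one dict per table row."""
    response = requests.get(url, timeout=30)
    response.raise_for_status()
    soup = BeautifulSoup(response.text, "html.parser")

    rows = []
    table = soup.find("table")
    if table is None:
        return rows
    for tr in table.find_all("tr")[1:]:  # skip the header row
        cells = [td.get_text(strip=True) for td in tr.find_all("td")]
        if len(cells) >= 4:
            rows.append(
                {
                    "stock_date": cells[0],
                    "lake": cells[1],
                    "species": cells[2],
                    "number_released": cells[3],
                }
            )
    return rows


if __name__ == "__main__":
    for row in fetch_stocking_rows()[:5]:
        print(row)
```
In the real service, rows like these would be written to the database on each scheduled run rather than printed.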
---
## Project Structure
```bash
.
├── api/                      # Main application API
│   ├── __init__.py           # API package initializer
│   ├── index.py              # API entrypoint (routes/controllers)
│   ├── requirements.txt      # API dependencies
│   ├── dockerfiles/
│   │   ├── dev/Dockerfile    # Dev Dockerfile
│   │   └── prod/             # Production Dockerfile (Lambda-ready)
│   │       ├── Dockerfile
│   │       └── lambda_entry_script.sh
│   └── README.md             # API-specific usage docs
├── web_scraper/              # Web scraping service
│   ├── __init__.py
│   ├── scraper.py            # Main script for collecting trout/creel data
│   ├── Dockerfile            # Docker setup for scraper
│   ├── Makefile              # Shortcuts for common dev tasks
│   ├── requirements.txt
│   ├── README.md
│   └── tests/                # Pytest-based tests
│       ├── __init__.py
│       └── test_scraper.py
├── data/                     # Database models and storage
│   ├── __init__.py
│   ├── database.py           # SQLAlchemy engine and session config
│   ├── models.py             # ORM models for tables
│   ├── backup_data.sql       # SQL dump for backup or restore
│   ├── backup_data.txt       # Raw text backup
│   └── sqlite.db             # Local development database
├── aws_config/               # AWS deployment and secrets setup
│   ├── configure-aws-credentials-latest.yml  # GitHub Actions for AWS login
│   └── fargate-rds-secrets.yaml              # Fargate setup with RDS and Secrets Manager
└── README.md                 # You are here
```
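For orientation, this is roughly what the `data/` layer sketched above could look like; the real `database.py` and `models.py` may differ, and the table and column names here are assumptions:
```python
# Rough sketch of the data/ layer (illustrative only).
from sqlalchemy import Column, Date, Integer, String, create_engine
from sqlalchemy.orm import declarative_base, sessionmaker

# Local development falls back to the bundled SQLite file; production would
# point this URL at Aurora PostgreSQL instead.
engine = create_engine("sqlite:///data/sqlite.db")
SessionLocal = sessionmaker(bind=engine)
Base = declarative_base()


class StockingEvent(Base):
    """One hatchery plant: which lake, what species, how many, and when."""

    __tablename__ = "stocking_events"  # assumed table name

    id = Column(Integer, primary_key=True)
    lake = Column(String, index=True)
    species = Column(String)
    number_released = Column(Integer)
    stock_date = Column(Date)


if __name__ == "__main__":
    Base.metadata.create_all(engine)  # create the table locally for a quick test
```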
---
## Deployment Overview
AWS Infrastructure:
- Fargate runs the scraper every 24 hours via EventBridge.
- Secrets Manager securely stores DB credentials.
- Aurora PostgreSQL stores structured stocking data.
- CloudWatch Logs tracks runtime output for visibility.
- API hosted with API Gateway and Lambda
GitHub → ECR Workflow:
- Automatically builds and pushes Docker images on updates to the main branch.
- Uses secure OIDC GitHub Actions role to push to ECR.
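Since the production Dockerfile is described as Lambda-ready and the API sits behind API Gateway, one common way to wire that up is an ASGI adapter such as Mangum. The snippet below is a hypothetical entrypoint, not necessarily how `api/index.py` or `lambda_entry_script.sh` actually do it.
```python
# Hypothetical Lambda entrypoint for the FastAPI app (illustrative only).
from fastapi import FastAPI
from mangum import Mangum

app = FastAPI(title="Troutlytics API")


@app.get("/health")
def health() -> dict:
    """Cheap liveness check for API Gateway or local smoke tests."""
    return {"status": "ok"}


# Lambda invokes this callable (e.g. configured as "index.handler"); Mangum
# translates API Gateway events into ASGI requests for the FastAPI app.
handler = Mangum(app)
```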
---
## Prerequisites
- An AWS account with appropriate permissions
- AWS CLI installed and configured
- Docker installed (for local and prod builds)
- Python 3.11+
---
## Run Locally
### Docker Compose Commands Cheat Sheet
All commands below are run from the repository root.
| Action | Command | Notes |
| :--------------------------- | :--------------------------------- | :-------------------------------------------- |
| **Build all services** | `docker compose build` | Build all images |
| **Start all services** | `docker compose up` | Start API(s), Scraper |
| **Start all + rebuild** | `docker compose up --build` | Force rebuild before starting |
| **Start dev API only** | `docker compose up api-dev` | Starts API Dev service |
| **Start prod API only** | `docker compose up api-prod` | Starts API Prod service |
| **Start scraper only** | `docker compose up web-scraper` | Starts Scraper |
| **Stop all services** | `docker compose down` | Stops and removes all containers and networks |
| **Rebuild dev API only** | `docker compose build api-dev` | Rebuild only the dev API image |
| **Rebuild prod API only** | `docker compose build api-prod` | Rebuild only the prod API image |
| **Rebuild scraper only** | `docker compose build web-scraper` | Rebuild only the scraper image |
| **View running containers** | `docker compose ps` | Show status of all services |
| **View logs (all services)** | `docker compose logs` | View logs for all services |
| **Follow logs live** | `docker compose logs -f` | Stream logs in real time |
| **Stop dev API** | `docker compose stop api-dev` | Stop only the dev API container |
| **Stop prod API** | `docker compose stop api-prod` | Stop only the prod API container |
| **Stop scraper** | `docker compose stop web-scraper` | Stop only the scraper container |
| **Restart all containers** | `docker compose restart` | Restart all running services |
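If you prefer to skip Docker while iterating on the API, a small launcher script works too. This assumes `api/index.py` exposes a FastAPI instance named `app` and that the API requirements are installed locally; adjust the import string if the app object is named differently.
```python
# run_dev.py — optional convenience script for running the API without Docker.
# Assumes api/index.py exposes a FastAPI instance named `app` (an assumption,
# not confirmed by this README).
import uvicorn

if __name__ == "__main__":
    uvicorn.run("api.index:app", host="127.0.0.1", port=8000, reload=True)
```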
---
## Cloud Setup
Deploy the CloudFormation Stack:
```bash
aws cloudformation deploy \
--template-file aws_config/configure-aws-credentials-latest.yml \
--stack-name troutlytics-stack \
--capabilities CAPABILITY_NAMED_IAM \
--parameter-overrides \
ECRImageUriScraper=123456789012.dkr.ecr.us-west-2.amazonaws.com/scraper:latest \
ECRImageUriAPI=123456789012.dkr.ecr.us-west-2.amazonaws.com/api:latest \
VpcId=vpc-xxxxxxxx \
SubnetIds=subnet-aaaa,subnet-bbbb \
SecurityGroupId=sg-xxxxxxxx
```
---
## GitHub → ECR Deploy (CI/CD)
To enable GitHub Actions auto-deploy:
1. Deploy the github_oidc_ecr_access.yaml CloudFormation template.
2. Add the output IAM Role ARN to your GitHub Actions secrets or workflows.
3. Push to main → your image builds and publishes to ECR automatically.
---
## Roadmap Ideas
- Add support for weather/streamflow overlays
- Enable historical trend analysis by lake
- Integrate public stocking alerts
- Expand scraper coverage to other regions or species
---
## Credits
Created by @thomas-basham, a U.S. Army veteran, full-stack developer, and passionate angler.
---
## License
MIT