{"id":49327090,"url":"https://github.com/wri/gfw-data-api","last_synced_at":"2026-04-26T20:32:53.028Z","repository":{"id":37485009,"uuid":"255779016","full_name":"wri/gfw-data-api","owner":"wri","description":"GFW Data API","archived":false,"fork":false,"pushed_at":"2026-03-11T16:56:05.000Z","size":6415,"stargazers_count":14,"open_issues_count":5,"forks_count":5,"subscribers_count":6,"default_branch":"master","last_synced_at":"2026-03-11T22:00:06.266Z","etag":null,"topics":["api-server","etl-pipeline","metadata-api"],"latest_commit_sha":null,"homepage":"https://data-api.globalforestwatch.org","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/wri.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2020-04-15T02:06:50.000Z","updated_at":"2026-03-02T22:19:34.000Z","dependencies_parsed_at":"2023-11-10T13:04:20.054Z","dependency_job_id":"4e40cb6a-50b1-4e84-af36-05aafc353f9d","html_url":"https://github.com/wri/gfw-data-api","commit_stats":{"total_commits":1724,"total_committers":17,"mean_commits":"101.41176470588235","dds":0.5324825986078887,"last_synced_commit":"3f33e77e0a6c6cda3083f96aabd15e852741e447"},"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/wri/gfw-data-api","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/wri%2Fgfw-data-api","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/wri%2Fgfw-data-api/tags","releases_url":"https://
repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/wri%2Fgfw-data-api/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/wri%2Fgfw-data-api/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/wri","download_url":"https://codeload.github.com/wri/gfw-data-api/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/wri%2Fgfw-data-api/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":32312388,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-26T19:15:34.056Z","status":"ssl_error","status_checked_at":"2026-04-26T19:15:15.467Z","response_time":129,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["api-server","etl-pipeline","metadata-api"],"created_at":"2026-04-26T20:32:52.941Z","updated_at":"2026-04-26T20:32:53.016Z","avatar_url":"https://github.com/wri.png","language":"Python","readme":"# GFW Data API\nHigh-performance Async REST API, in Python. FastAPI + GINO + Uvicorn (powered by PostgreSQL).\n\n## Get Started\n### Run Locally with Docker\n#### GitHub Container Registry (GHCR) Access Setup\n\nTo authenticate Docker with GitHub Container Registry (`ghcr.io`) for pulling/pushing images, follow these steps:\n\n##### 1. Create a GitHub Personal Access Token (PAT)\n\n1. 
Navigate to: GitHub → Settings → Developer Settings → Personal Access Tokens → Tokens (Classic)\n2. Click **Generate new token (Classic)**\n3. Configure:\n  - **Note**: `docker-ghcr-access` (descriptive name)\n  - **Expiration**: Set duration (or \"No expiration\" for CI/CD)\n  - **Scopes**:\n    - `read:packages` (required for pull)\n    - `write:packages` (required for push)\n4. Click **Generate token** and copy the token value\n\n##### 2. Authenticate with Docker\n\n```bash\necho \"YOUR_GHCR_TOKEN\" | docker login ghcr.io -u GITHUB_USERNAME --password-stdin\n```\n\n#### Proceed with Setup\n\n1. Clone this repository: `git clone https://github.com/wri/gfw-data-api.git`\n2. Run `./scripts/setup` from the root directory (install `uv` first, if necessary).\n3. Run locally using docker-compose: `./scripts/develop`\n\n### Developing\n* Activate the virtual environment installed with `scripts/setup`: `. .venv_uv/bin/activate`\n* Add a package as a project dependency, with minimum version: `uv add \"pydantic\u003e=2\"`\n* Re-lock one particular package, upgrading it to the latest version allowed by pins in pyproject.toml: `uv lock --upgrade-package \u003cpackage_name\u003e`\n* Re-lock all packages, upgrading those with newer versions (but obeying version pins in pyproject.toml): `uv lock --upgrade`\n* Generate a DB migration: `./scripts/migrate` (note that `app/settings/prestart.sh` will run migrations automatically when running `./scripts/develop`)\n* Run tests: `./scripts/test` and `./scripts/test_v2`\n  * `--no_build` - don't rebuild the containers\n  * `--moto-port=\u003cport_number\u003e` - explicitly sets the motoserver port (default `50000`)\n* Run specific tests: `./scripts/test tests/tasks/test_vector_source_assets.py::test_vector_source_asset`\n* Each development branch app instance gets its own isolated database in the AWS dev account, cloned from the `geostore` database. This database is named with the branch suffix (like `geostore_\u003cbranch_name\u003e`). 
If a PR includes a database migration, once the change is merged to higher environments, the `geostore` database must also be updated with the migration. This can be done by manually replacing the existing database with a copy of a cleaned-up version of the branch database (see the `./prestart.sh` script for the cloning command).\n* Debug memory usage of Batch jobs with memory_profiler:\n    1. Install memory_profiler in the job's Dockerfile\n    2. Modify the job's script to run with memory_profiler. For example: `pixetl \"${ARG_ARRAY[@]}\"` -\u003e `mprof run -M -C -T 1 --python /usr/local/app/gfw_pixetl/pixetl.py \"${ARG_ARRAY[@]}\"`\n    3. Use `scp` to copy memory_profiler's .dat files (found in /tmp by default) off the Batch instance while the instance is still up\n\n## Features\n### Core Dependencies\n* **FastAPI:** touts performance on par with NodeJS \u0026 Go + automatic Swagger + ReDoc generation.\n* **GINO:** built on SQLAlchemy core. Lightweight, simple, asynchronous ORM for PostgreSQL.\n* **Uvicorn:** Lightning-fast, asynchronous ASGI server.\n* **Optimized Dockerfile:** A Dockerfile optimized for ASGI applications, from https://github.com/tiangolo/uvicorn-gunicorn-docker.\n\n#### Additional Dependencies\n* **Pydantic:** Core to FastAPI. Define how data should be in pure, canonical Python; validate it with pydantic.\n* **Alembic:** Handles database migrations. Compatible with GINO.\n* **SQLAlchemy_Utils:** Provides essential handles \u0026 datatypes. Compatible with GINO.\n* **PostgreSQL:** Robust, fully-featured, scalable, open-source.\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fwri%2Fgfw-data-api","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fwri%2Fgfw-data-api","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fwri%2Fgfw-data-api/lists"}