{"id":46349154,"url":"https://github.com/cehbrecht/atlas-demo","last_synced_at":"2026-03-04T22:31:05.025Z","repository":{"id":328181416,"uuid":"1072492536","full_name":"cehbrecht/atlas-demo","owner":"cehbrecht","description":"Using DataLad for ATLAS data","archived":false,"fork":false,"pushed_at":"2025-12-12T12:53:00.000Z","size":179,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2025-12-12T17:48:37.237Z","etag":null,"topics":["atlas","compliance-checker","copernicus","datalad","git-annex"],"latest_commit_sha":null,"homepage":"https://radiantearth.github.io/stac-browser/#/external/https://raw.githubusercontent.com/cehbrecht/atlas-demo/main/catalogs/stac/catalog.json","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/cehbrecht.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGES.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2025-10-08T19:47:07.000Z","updated_at":"2025-12-12T12:53:05.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/cehbrecht/atlas-demo","commit_stats":null,"previous_names":["cehbrecht/atlas-demo"],"tags_count":1,"template":false,"template_full_name":null,"purl":"pkg:github/cehbrecht/atlas-demo","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cehbrecht%2Fatlas-demo","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cehbrecht%2Fatlas-demo/tags","releases_url":"https://repo
s.ecosyste.ms/api/v1/hosts/GitHub/repositories/cehbrecht%2Fatlas-demo/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cehbrecht%2Fatlas-demo/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/cehbrecht","download_url":"https://codeload.github.com/cehbrecht/atlas-demo/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cehbrecht%2Fatlas-demo/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":30096725,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-03-04T21:59:23.547Z","status":"ssl_error","status_checked_at":"2026-03-04T21:57:50.415Z","response_time":59,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["atlas","compliance-checker","copernicus","datalad","git-annex"],"created_at":"2026-03-04T22:31:03.316Z","updated_at":"2026-03-04T22:31:04.981Z","avatar_url":"https://github.com/cehbrecht.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Atlas 
Demo\n\n[![License](https://img.shields.io/github/license/cehbrecht/atlas-demo)](LICENSE)\n![Python](https://img.shields.io/badge/python-≥3.10-blue)\n![Conda](https://img.shields.io/badge/environment-conda--forge-green)\n[![Build](https://github.com/cehbrecht/atlas-demo/actions/workflows/update_catalog.yml/badge.svg)](https://github.com/cehbrecht/atlas-demo/actions/workflows/update_catalog.yml)\n[![STAC Browser](https://img.shields.io/badge/STAC-Browser-green)](https://radiantearth.github.io/stac-browser/#/external/https://raw.githubusercontent.com/cehbrecht/atlas-demo/main/catalogs/stac/catalog.json)\n![DataLad](https://img.shields.io/badge/managed%20by-DataLad-orange)\n[![DOI](https://zenodo.org/badge/DOI/10.5072/zenodo.0000000.svg)](https://zenodo.org/record/0000000)\n\n---\n\n## Overview\n\nThis repository demonstrates how to manage climate Atlas NetCDF data using:\n\n- **DataLad** for version control and lightweight data management  \n- **CF/IOOS compliance checks** for metadata validation  \n- **STAC catalogs** for structured metadata and discovery  \n\nAll workflows are reproducible locally and run automatically on GitHub Actions when metadata updates are detected.\n\n\u003e The Zenodo badge above is for showcase purposes.  
\n\u003e When you create a release and link the GitHub repository to Zenodo, a DOI will be automatically minted for each version.\n\n---\n\n## 🧰 Prepare Your System\n\nBefore running the workflow, ensure you have **DataLad** and **git-annex** installed.\n\n👉 See the **[DataLad Handbook – Installation Guide](https://handbook.datalad.org/en/latest/intro/installation.html)** for detailed instructions and platform-specific notes.\n\n### macOS\n\n```bash\n# Using Homebrew (recommended)\nbrew install datalad git-annex\n\n# Verify installation\ndatalad --version\ngit annex version\n```\n\n\u003e Alternatively, use Conda:\n\u003e ```bash\n\u003e conda install -c conda-forge datalad git-annex\n\u003e ```\n\n### Linux (Debian/Ubuntu)\n\n```bash\nsudo apt update\nsudo apt install datalad git-annex\n```\n\n### Linux (Fedora/RHEL)\n\n```bash\nsudo dnf install datalad git-annex\n```\n\nOnce installed, clone the dataset and you’re ready to go.\n\n---\n\n## Getting Started\n\n### 1. Clone the repository/dataset\n\n```bash\ndatalad clone https://github.com/cehbrecht/atlas-demo.git\ncd atlas-demo\n```\n\n### 2. Create and activate Conda environment\n\n```bash\nconda env create -f environment.yml\nconda activate atlas-demo\n```\n\n\u003e Optional: install DataLad via Conda if not installed system-wide  \n\u003e ```bash\n\u003e conda install -c conda-forge datalad git-annex\n\u003e ```\n\n---\n\n## Adding and Managing Data\n\nThe file **`atlas/atlas_urls.csv`** lists available datasets from an **external data source**.  
\nEach row defines a *remote URL* and a *local storage path* inside `atlas/data/`.\n\nExample snippet:\n\n```csv\nurl,path\nhttps://data.example.org/cmip6/cd_CMIP6_ssp126_yr_2015-2100_v02.nc,atlas/data/v02/CMIP6/ssp126/cd_CMIP6_ssp126_yr_2015-2100_v02.nc\nhttps://data.example.org/cerra/cd_CERRA_yr_1985-2021_v02.nc,atlas/data/v02/CERRA/cd_CERRA_yr_1985-2021_v02.nc\n```\n\nTo register these datasets in your local DataLad dataset (without downloading the actual files):\n\n```bash\nmake addurls\n```\n\nThis creates lightweight references in `atlas/data/` that can be retrieved later on demand:\n\n```bash\ndatalad get atlas/data/\u003cfile\u003e.nc\n```\n\n---\n\n## Adding New Local Data\n\nTo add NetCDF files that are already available locally:\n\n1. Copy the files into the appropriate folder under `atlas/data/`.\n2. Extract metadata and validate the files by running the workflow:\n\n```bash\nmake update\n```\n\nOr step-by-step:\n\n```bash\nmake metadata    # extract STAC metadata for all available NetCDF files\nmake checks      # run CF/IOOS compliance checks\nmake catalogs    # generate STAC catalog\n```\n\n3. Save the new files and generated metadata to DataLad:\n\n```bash\ndatalad save -m \"Add new NetCDF data and metadata\"\n```\n\n4. Push your changes to GitHub:\n\n```bash\ngit push\n```\n\n\u003e **Note:** Only STAC catalogs are rebuilt automatically on GitHub via Actions.  
\n\u003e Metadata extraction and CF checks must be run locally before committing.\n\n---\n\n## Cleaning Generated Files\n\nTo remove generated metadata and catalogs:\n\n```bash\nmake clean\n```\n\n---\n\n## GitHub Actions Workflow\n\n- Runs automatically on pushes or pull requests that affect `atlas/metadata/**`\n- Builds and commits updated STAC catalogs under `catalogs/stac/`\n- Skips execution if no metadata changes are detected\n\n---\n\n## DataLad Resources\n\n- **[Official Handbook](https://handbook.datalad.org/en/latest/)** – complete guide  \n- **[Quick Guide](https://handbook.datalad.org/en/latest/intro/quickstart.html)** – get started quickly  \n- **[Cheat Sheet](https://handbook.datalad.org/en/latest/_downloads/datalad-cheatsheet.pdf)** – handy commands reference  \n\n---\n\n## Usage Tips (DataLad)\n\n- **Get file content**: `datalad get \u003cfile_or_dir\u003e`  \n- **Unlock a file for editing**: `datalad unlock \u003cfile\u003e`  \n- **Drop local content**: `datalad drop \u003cfile_or_dir\u003e`  \n- **Add new files**: `datalad save -m \"Add new files\" \u003cfile_or_dir\u003e` (current DataLad releases use `save` in place of the former `add` command)  \n- **Add files from URLs**: `datalad addurls -d . 
--fast atlas/atlas_urls.csv '{url}' '{path}'`  \n- **Check dataset status**: `datalad status`  \n- **Save changes**: `datalad save -m \"commit message\"`  \n\n\u003e These commands let you work with large datasets without downloading all content.\n\n---\n\n## Directory Overview\n\n```\natlas-demo/\n├── atlas/                     # NetCDF data + metadata\n├── catalogs/                  # STAC catalogs\n├── scripts/                   # workflow scripts\n├── .github/workflows/         # GitHub Actions definitions\n├── environment.yml            # Conda environment\n├── Makefile                   # local workflow automation\n└── README.md\n```\n\n---\n\n## Quick Commands\n\n```bash\nmake help        # show help\nmake update      # run full local workflow\nmake metadata    # extract STAC metadata\nmake checks      # run CF/IOOS compliance checks\nmake catalogs    # generate STAC catalog\nmake clean       # remove generated files\nmake lint        # lint Python scripts with Ruff\n```\n\n---\n\n## Explore \u0026 Download via STAC Browser\n\nEach STAC Item in the catalog includes two assets:\n\n- **`datalad` asset** – points to the local DataLad-managed file  \n- **`http` asset** – direct HTTP download link (for this demo, built from a fixed prefix URL)  \n\nYou can browse the catalog directly in a STAC Browser:\n\n👉 **[Open in STAC Browser](https://radiantearth.github.io/stac-browser/#/external/https://raw.githubusercontent.com/cehbrecht/atlas-demo/main/catalogs/stac/catalog.json)**\n\nTo download a file via HTTP:\n\n1. Click on an Item in the STAC Browser.\n2. Select the `\"http\"` asset.\n3. 
Copy the URL, or download the file directly in your browser or with `wget`/`curl`.\n\n```bash\n# Example using curl\ncurl -O https://data.mips.climate.copernicus.eu/thredds/fileServer/esg_c3s-cica-atlas/v02/CMIP6/historical/cdbals_CMIP6_historical_yr_1850-2014_v02.nc\n```\n\n---\n\n## License\n\nThis project is licensed under the terms of the [Apache License 2.0](LICENSE).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcehbrecht%2Fatlas-demo","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fcehbrecht%2Fatlas-demo","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcehbrecht%2Fatlas-demo/lists"}