{"id":28516982,"url":"https://github.com/a5dur/pose-ckanext-metadata","last_synced_at":"2025-07-30T10:33:41.846Z","repository":{"id":297307949,"uuid":"996365152","full_name":"dathere/pose-ckanext-metadata","owner":"dathere","description":"Scripts for discovering, collecting, and cataloging metadata from CKAN extensions and instances worldwide.","archived":false,"fork":false,"pushed_at":"2025-07-28T11:06:56.000Z","size":132,"stargazers_count":2,"open_issues_count":0,"forks_count":2,"subscribers_count":0,"default_branch":"main","last_synced_at":"2025-07-28T13:11:19.374Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"https://catalog.civicdataecosystem.org/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/dathere.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-06-04T21:03:55.000Z","updated_at":"2025-07-28T11:07:00.000Z","dependencies_parsed_at":"2025-06-05T02:24:29.639Z","dependency_job_id":"5afcbd53-d786-42c2-ba7b-8128cc02e80b","html_url":"https://github.com/dathere/pose-ckanext-metadata","commit_stats":null,"previous_names":["a5dur/pose-ckanext-metadata"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/dathere/pose-ckanext-metadata","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dathere%2Fpose-ckanext-metadata","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dathere%2Fpose-ckanext-metadata/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dathere%2Fpose-ckanext-metadata
/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dathere%2Fpose-ckanext-metadata/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/dathere","download_url":"https://codeload.github.com/dathere/pose-ckanext-metadata/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dathere%2Fpose-ckanext-metadata/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":267850819,"owners_count":24154505,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-07-30T02:00:09.044Z","response_time":70,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-06-09T04:13:57.041Z","updated_at":"2025-07-30T10:33:41.837Z","avatar_url":"https://github.com/dathere.png","language":"Python","readme":"# CKAN Ecosystem Metadata Collection\n\nThis repository contains automation scripts for sourcing and cataloging metadata from the CKAN ecosystem, including extensions and instances worldwide. 
The collected data powers the [CKAN Ecosystem Catalog](https://catalog.civicdataecosystem.org/).\n\n## Repository Structure\n\n### Extension Metadata Scripts\n- `1get_URL.py` - Discovers CKAN extensions on GitHub\n- `2refresh.py` - Updates extension metadata\n- `3update_catalog.py` - Synchronizes data with the catalog\n- `4uploadDataset.py` - Uploads the file to the datasets page\n\n### CKAN Instance Data Collection (`sites-data-fetch/`)\n- `0.csv` - Base dataset of CKAN instances\n- `1-Name-Process.py` - Processes site names and converts titles to link-friendly identifiers\n- `2-CKANActionAPI copy.py` - Fetches instance data via the CKAN Action API\n- `3-siteType.py` - Categorizes site types\n- `4-Description.py` - Extracts site descriptions\n- `5-Use AI To deduct Location copy.py` - Infers geographic locations\n- `6-Geocode using OpenStreetMap Nominatim API.py` - Geocodes locations\n- `7-tstamp.py` - Adds timestamps to metadata\n\n## Automation\n\nThe repository includes a GitHub Actions workflow (`.github/workflows/update-ckan-metadata.yml`) that automatically fetches and updates extension metadata on a schedule.\n\u003cimg width=\"2749\" height=\"3840\" alt=\"Untitled diagram _ Mermaid Chart-2025-07-16-114648\" src=\"https://github.com/user-attachments/assets/169fb3ee-4685-4051-9a5e-90f202b32988\" /\u003e\n\n## Setup\n\n1. Install dependencies:\n   ```bash\n   pip install -r requirements.txt\n   ```\n\n2. Configure the CKAN API key\n\n3. Run the extension metadata collection:\n   ```bash\n   python 1get_URL.py\n   python 2refresh.py\n   python 3update_catalog.py\n   ```\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fa5dur%2Fpose-ckanext-metadata","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fa5dur%2Fpose-ckanext-metadata","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fa5dur%2Fpose-ckanext-metadata/lists"}