{"id":19553539,"url":"https://github.com/opengeos/aws-open-data","last_synced_at":"2025-07-11T07:04:06.920Z","repository":{"id":96384427,"uuid":"579784347","full_name":"opengeos/aws-open-data","owner":"opengeos","description":"A list of open datasets on AWS","archived":false,"fork":false,"pushed_at":"2025-07-09T05:02:21.000Z","size":7294,"stargazers_count":42,"open_issues_count":1,"forks_count":7,"subscribers_count":2,"default_branch":"master","last_synced_at":"2025-07-09T06:22:45.939Z","etag":null,"topics":["amazon-web-services","aws","data-science","deep-learning","geospatial","machine-learning","open-data"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/opengeos.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2022-12-18T22:39:14.000Z","updated_at":"2025-07-09T05:02:24.000Z","dependencies_parsed_at":"2023-10-11T07:04:50.044Z","dependency_job_id":"78656964-c33a-40c1-a0de-79b3abae539e","html_url":"https://github.com/opengeos/aws-open-data","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/opengeos/aws-open-data","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/opengeos%2Faws-open-data","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/opengeos%2Faws-open-data/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/opengeos%2Faws-open-data/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositori
es/opengeos%2Faws-open-data/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/opengeos","download_url":"https://codeload.github.com/opengeos/aws-open-data/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/opengeos%2Faws-open-data/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":264752531,"owners_count":23658650,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["amazon-web-services","aws","data-science","deep-learning","geospatial","machine-learning","open-data"],"created_at":"2024-11-11T04:23:41.635Z","updated_at":"2025-07-11T07:04:06.900Z","avatar_url":"https://github.com/opengeos.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# aws-open-data\n\n[![Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/giswqs/aws-open-data/blob/master/aws_open_datasets.ipynb)\n[![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/giswqs/aws-open-data/HEAD?labpath=aws_open_datasets.ipynb)\n[![License](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)\n\n## Introduction\n\nThe [AWS Open Data](https://registry.opendata.aws/) program hosts many publicly available datasets. This repo compiles the list of all datasets on AWS as a TSV file and as a JSON file, making them easier to find and use programmatically. 
The list is updated daily.\n\nA complete list of AWS open datasets as individual YAML files is available [here](https://github.com/awslabs/open-data-registry).\n\n## Usage\n\nThis repo provides the list of AWS open datasets in two formats:\n\n- Tab-separated values (TSV) file: [aws_open_datasets.tsv](https://github.com/giswqs/aws-open-data/blob/master/aws_open_datasets.tsv)\n- JSON file: [aws_open_datasets.json](https://github.com/giswqs/aws-open-data/blob/master/aws_open_datasets.json)\n\nThe TSV file can be read into a pandas DataFrame with the following code:\n\n```python\nimport pandas as pd\n\nurl = 'https://github.com/giswqs/aws-open-data/raw/master/aws_open_datasets.tsv'\ndf = pd.read_csv(url, sep='\\t')\ndf.head()\n```\n\n## Related Projects\n\n- A list of open datasets on AWS: [aws-open-data](https://github.com/giswqs/aws-open-data)\n- A list of open geospatial datasets on AWS: [aws-open-data-geo](https://github.com/giswqs/aws-open-data-geo)\n- A list of open geospatial datasets on AWS with a STAC endpoint: [aws-open-data-stac](https://github.com/giswqs/aws-open-data-stac)\n- A list of STAC endpoints from stacindex.org: [stac-index-catalogs](https://github.com/giswqs/stac-index-catalogs)\n- A list of geospatial datasets on Microsoft Planetary Computer: [Planetary-Computer-Catalog](https://github.com/giswqs/Planetary-Computer-Catalog)\n- A list of geospatial datasets on Google Earth Engine: [Earth-Engine-Catalog](https://github.com/giswqs/Earth-Engine-Catalog)\n- A list of geospatial datasets on NASA's Common Metadata Repository (CMR): [NASA-CMR-STAC](https://github.com/giswqs/NASA-CMR-STAC)\n- A list of geospatial data catalogs: 
[geospatial-data-catalogs](https://github.com/giswqs/geospatial-data-catalogs)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fopengeos%2Faws-open-data","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fopengeos%2Faws-open-data","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fopengeos%2Faws-open-data/lists"}