https://github.com/opengeos/aws-open-data
A list of open datasets on AWS
- Host: GitHub
- URL: https://github.com/opengeos/aws-open-data
- Owner: opengeos
- License: mit
- Created: 2022-12-18T22:39:14.000Z (almost 2 years ago)
- Default Branch: master
- Last Pushed: 2024-11-07T04:58:41.000Z (7 days ago)
- Last Synced: 2024-11-07T05:33:18.505Z (7 days ago)
- Topics: amazon-web-services, aws, data-science, deep-learning, geospatial, machine-learning, open-data
- Language: Python
- Homepage:
- Size: 4.73 MB
- Stars: 37
- Watchers: 2
- Forks: 5
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# aws-open-data
[![Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/giswqs/aws-open-data/blob/master/aws_open_datasets.ipynb)
[![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/giswqs/aws-open-data/HEAD?labpath=aws_open_datasets.ipynb)
[![License](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)

## Introduction
The [AWS Open Data](https://registry.opendata.aws/) program hosts many publicly available datasets. This repo compiles the list of all AWS open datasets as a TSV file and as a JSON file, making them easier to find and use programmatically. The list is updated daily.
A complete list of AWS open datasets as individual YAML files is available [here](https://github.com/awslabs/open-data-registry).
## Usage
This repo provides the list of AWS open datasets in two formats:
- Tab-separated values (TSV) file: [aws_open_datasets.tsv](https://github.com/giswqs/aws-open-data/blob/master/aws_open_datasets.tsv)
- JSON file: [aws_open_datasets.json](https://github.com/giswqs/aws-open-data/blob/master/aws_open_datasets.json)

The TSV file can be read into a pandas DataFrame with the following code:
```python
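import pandas as pd

# Raw URL of the TSV file in this repo
url = 'https://github.com/giswqs/aws-open-data/raw/master/aws_open_datasets.tsv'

# The file is tab-separated, so pass sep='\t'
df = pd.read_csv(url, sep='\t')

# Preview the first five rows
df.head()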
```

## Related Projects
- A list of open datasets on AWS: [aws-open-data](https://github.com/giswqs/aws-open-data)
- A list of open geospatial datasets on AWS: [aws-open-data-geo](https://github.com/giswqs/aws-open-data-geo)
- A list of open geospatial datasets on AWS with a STAC endpoint: [aws-open-data-stac](https://github.com/giswqs/aws-open-data-stac)
- A list of STAC endpoints from stacindex.org: [stac-index-catalogs](https://github.com/giswqs/stac-index-catalogs)
- A list of geospatial datasets on Microsoft Planetary Computer: [Planetary-Computer-Catalog](https://github.com/giswqs/Planetary-Computer-Catalog)
- A list of geospatial datasets on Google Earth Engine: [Earth-Engine-Catalog](https://github.com/giswqs/Earth-Engine-Catalog)
- A list of geospatial datasets on NASA's Common Metadata Repository (CMR): [NASA-CMR-STAC](https://github.com/giswqs/NASA-CMR-STAC)
- A list of geospatial data catalogs: [geospatial-data-catalogs](https://github.com/giswqs/geospatial-data-catalogs)
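
To illustrate how a topical subset (such as a geospatial one) might be pulled out of the full table, here is a minimal sketch that filters a DataFrame by keyword. The column names `Name` and `Tags`, the helper `filter_by_keyword`, and the sample rows are assumptions made for illustration; they are not confirmed from the TSV file, and this is not necessarily how the related repos above are built. Check `df.columns` on the real data first.

```python
import pandas as pd


def filter_by_keyword(df: pd.DataFrame, keyword: str, column: str = "Tags") -> pd.DataFrame:
    """Return rows whose `column` contains `keyword` (case-insensitive, literal match)."""
    if column not in df.columns:
        raise KeyError(f"Column {column!r} not found; available: {list(df.columns)}")
    # Treat missing values as empty strings so they never match
    values = df[column].fillna("").astype(str)
    mask = values.str.contains(keyword, case=False, regex=False)
    return df[mask]


# Hypothetical sample rows standing in for the real table (assumed schema)
sample = pd.DataFrame(
    {
        "Name": ["Dataset A", "Dataset B", "Dataset C"],
        "Tags": [
            "geospatial, satellite imagery",
            "genomic, life sciences",
            "Geospatial, elevation",
        ],
    }
)

geo = filter_by_keyword(sample, "geospatial")
print(geo["Name"].tolist())  # ['Dataset A', 'Dataset C']
```

With the real data, the same helper could be applied to the DataFrame loaded in the Usage section, for example `filter_by_keyword(df, "geospatial", column=...)`, substituting whichever column actually holds the tags.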