Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/opendatacube/odc-tools
ODC features that DEA is experimenting with or prototyping with the intention of being integrated into odc-core in the future
https://github.com/opendatacube/odc-tools
hacktoberfest opendatacube python3
Last synced: 4 months ago
JSON representation
ODC features that DEA is experimenting with or prototyping with the intention of being integrated into odc-core in the future
- Host: GitHub
- URL: https://github.com/opendatacube/odc-tools
- Owner: opendatacube
- License: apache-2.0
- Created: 2018-10-11T22:46:28.000Z (over 5 years ago)
- Default Branch: develop
- Last Pushed: 2024-03-04T22:53:56.000Z (4 months ago)
- Last Synced: 2024-03-04T23:50:53.502Z (4 months ago)
- Topics: hacktoberfest, opendatacube, python3
- Language: Python
- Homepage:
- Size: 4.46 MB
- Stars: 57
- Watchers: 20
- Forks: 31
- Open Issues: 38
-
Metadata Files:
- Readme: README.md
- License: LICENSE
- Security: SECURITY.md
Lists
- awesome-earthobservation-code - odc-tools - ODC features that DEA is experimenting with or prototyping with the intention of being integrated into odc-core in the future (Open Data Cube / Testing your code)
README
[![Test Status](https://github.com/opendatacube/odc-tools/actions/workflows/main.yml/badge.svg)](https://github.com/opendatacube/odc-tools/actions/workflows/main.yml)
[![codecov](https://codecov.io/gh/opendatacube/odc-tools/branch/develop/graph/badge.svg?token=PovpVLRFwn)](https://codecov.io/gh/opendatacube/odc-tools)DEA Prototype Code
==================This repository provides developmental [libraries](https://github.com/opendatacube/odc-tools/tree/develop/libs)
and [CLI tools](https://github.com/opendatacube/odc-tools/tree/develop/apps) for Open Datacube.- AWS S3 tools
- CLIs for using ODC data from AWS S3 and SQS
- Utilities for data visualizations in notebooks
- Experiments on optimising Rasterio usage on AWS S3Full list of libraries, and install instructions:
- `odc.ui` tools for data visualization in notebook/lab
- `odc.io` common IO utilities, used by apps mainly
- `odc-cloud[ASYNC,AZURE,THREDDS]` cloud crawling support package
- `odc.aws` AWS/S3 utilities, used by apps mainly
- `odc.aio` faster concurrent fetching from S3 with async, used by apps `odc-cloud[ASYNC]`
- `odc.{thredds,azure}` internal libs for cloud IO `odc-cloud[THREDDS,AZURE]`## Promoted to their own repositories
- `odc.stats` large scale processing framework (Moved to [odc-stats](http://github.com/opendatacube/odc-stats))
- `odc.stac` STAC to ODC conversion tools (Moved to [odc-stac](https://github.com/opendatacube/odc-stac))
- `odc.dscache` experimental key-value store where `key=UUID`, `value=Dataset` (moved to [odc-dscache](https://github.com/opendatacube/odc-dscache))Installation
============Libraries and applications in this repository are published to PyPI, and can be installed \
with `pip` like so:```
pip install \
odc-ui \
odc-stac \
odc-stats \
odc-io \
odc-cloud[ASYNC] \
odc-dscache
```For Conda Users
---------------Some **odc-tools** are available via `conda` from the `conda-forge` channel.
```
conda install -c conda-forge odc-apps-dc-tools odc-io odc-cloud```
Cloud Tools
===========Installation
------------Cloud tools depend on the `aiobotocore` package, which depends on specific
versions of `botocore`. Another package we use, `boto3`, also depends on
specific versions of `botocore`. As a result, having both `aiobotocore` and
`boto3` in one environment can be a bit tricky. The way to solve this
is to install `aiobotocore[awscli,boto3]` before anything else, which will install
compatible versions of `boto3` and `awscli` into the environment.```
pip install -U "aiobotocore[awscli,boto3]==1.3.3"
# OR for conda setups
conda install "aiobotocore==1.3.3" boto3 awscli
```1. For cloud (AWS only)
```
pip install odc-apps-cloud
```
2. For cloud (GCP, THREDDS and AWS)
```
pip install odc-apps-cloud[GCP,THREDDS]
```
2. For `dc-index-from-tar` (indexing to datacube from tar archive)
```
pip install odc-apps-dc-tools
```Apps
----1. `s3-find` list S3 bucket with wildcard
2. `s3-to-tar` fetch documents from S3 and dump them to a tar archive
3. `gs-to-tar` search GS for documents and dump them to a tar archive
4. `dc-index-from-tar` read yaml documents from a tar archive and add them to datacubeExample:
```bash
#!/bin/bashs3_src='s3://dea-public-data/L2/sentinel-2-nrt/**/*.yaml'
s3-find "${s3_src}" | \
s3-to-tar | \
dc-index-from-tar --env s2 --ignore-lineage
```Fastest way to list regularly placed files is to use fixed depth listing:
```bash
#!/bin/bash# only works when your metadata is same depth and has fixed file name
s3_src='s3://dea-public-data/L2/sentinel-2-nrt/S2MSIARD/*/*/ARD-METADATA.yaml's3-find --skip-check "${s3_src}" | \
s3-to-tar | \
dc-index-from-tar --env s2 --ignore-lineage
```When using Google Storage:
```bash
#!/bin/bash# Google Storage support
gs-to-tar --bucket data.deadev.com --prefix mangrove_cover
dc-index-from-tar --protocol gs --env mangroves --ignore-lineage metadata.tar.gz
```Local Development
=================The following steps are used in the GitHub Actions workflow `main.yml`
```bash
# build environment from file
mamba env create -f tests/test-env.yml# this environment name is defined in tests/test-env.yml file
conda activate odc-tools-tests# install additional packages
./scripts/dev-install.sh --no-deps# setup database for testing
./scripts/setup-test-db.sh# run test
echo "Running Tests"
pytest --cov=. \
--cov-report=html \
--cov-report=xml:coverage.xml \
--timeout=30 \
libs apps# Optional, to delete the environment
conda env remove -n odc-tools-tests
```Use `conda env update -f ` to install all needed dependencies for
`odc-tools` libraries and apps.Conda `environment.yaml` (click to expand)
```yaml
channels:
- conda-forge
dependencies:
# Datacube
- datacube>=1.8.5# odc.dscache
- python-lmdb
- zstandard# odc.ui
- ipywidgets
- ipyleaflet
- tqdm# odc-apps-dc-tools
- pystac>=1
- pystac-client>=0.2.0
- azure-storage-blob
- fsspec
- lxml # needed for thredds-crawler# odc.{aio,aws}: aiobotocore/boto3
# pin aiobotocore for easier resolution of dependencies
- aiobotocore==1.3.3
- boto3# eodatasets3 (used by odc-stats)
- boltons
- ciso8601
- python-rapidjson
- requests-cache
- ruamel.yaml
- structlog
- url-normalize# for dev
- pylint
- autopep8
- flake8
- isort
- black
- mypy# For tests
- pytest
- pytest-httpserver
- pytest-cov
- pytest-timeout
- moto
- deepdiff- pip>=20
- pip:
# odc.apps.dc-tools
- thredds-crawler# odc.stats
- eodatasets3# tests
- pytest-depends# odc.ui
- jupyter-ui-poll# odc-tools libs
- odc-stac
- odc-ui
- odc-dscache
- odc-stats# odc-tools CLI apps
- odc-apps-cloud
- odc-apps-dc-tools
```Release Process
===============1. Manually edit `{lib,app}/{pkg}/odc/{pkg}/_version.py` file to increase version number
2. Merge changes to the `develop` branch via a Pull Request
3. Fast-forward the `pypi/publish` branch to match `develop`
4. Push to GitHubSteps 3 and 4 can be done by an authorized user with
`./scripts/sync-publish-branch.sh` script.Publishing to [PyPi](https://pypi.org/) happens automatically when changes are
pushed to the protected `pypi/publish` branch. Only members of [Open Datacube
Admins](https://github.com/orgs/opendatacube/teams/admins) group have the
permission to push to this branch.