Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/ECMWFCode4Earth/ml_drought
Machine learning to better predict and understand drought. Moving to github.com/ml-clim
- Host: GitHub
- URL: https://github.com/ECMWFCode4Earth/ml_drought
- Owner: ECMWFCode4Earth
- Created: 2019-05-01T16:05:56.000Z (over 5 years ago)
- Default Branch: master
- Last Pushed: 2022-05-18T18:41:07.000Z (over 2 years ago)
- Last Synced: 2024-11-21T03:15:01.946Z (about 2 months ago)
- Topics: 2019, copernicus, machine-learning
- Language: Jupyter Notebook
- Homepage: https://ml-clim.github.io/drought-prediction/
- Size: 309 MB
- Stars: 93
- Watchers: 8
- Forks: 18
- Open Issues: 42
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
- open-sustainable-technology - ml_drought - A Machine Learning Pipeline to Predict Vegetation Health. (Natural Resources / Soil and Land)
README
[![Build Status](https://travis-ci.com/esowc/ml_drought.svg?branch=master)](https://travis-ci.com/esowc/ml_drought)
[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/esowc/ml_drought/blob/master/notebooks/docs/Pipeline.ipynb)
# A Machine Learning Pipeline for Climate Science
This repository is an end-to-end pipeline for the creation, intercomparison and evaluation of machine learning methods in climate science.
The pipeline carries out a number of tasks to create a unified data format for training and testing machine learning methods.
These tasks are split into the different classes defined in the `src` folder and explained further below.
NOTE: some basic working knowledge of Python is required to use this pipeline, although it is not too onerous
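
As a rough illustration of what "tasks split into classes" can look like, the sketch below chains two toy stages that pass a shared dictionary along. Every name here (`Stage`, `ToyExporter`, `ToyPreprocessor`, `run_pipeline`, `Record`) is hypothetical and chosen for illustration; none of it is taken from the repository's `src` folder.

```python
from abc import ABC, abstractmethod
from typing import Dict, List

# Stand-in for a unified data format: named series of floats.
Record = Dict[str, List[float]]


class Stage(ABC):
    """One pipeline task (hypothetical interface, not the repository's)."""

    @abstractmethod
    def run(self, data: Record) -> Record:
        ...


class ToyExporter(Stage):
    """Pretends to fetch raw values; they are hard-coded in this sketch."""

    def run(self, data: Record) -> Record:
        return {**data, "raw": [2.0, 4.0, 6.0]}


class ToyPreprocessor(Stage):
    """Rescales the raw values to the range [0, 1]."""

    def run(self, data: Record) -> Record:
        raw = data["raw"]
        lo, hi = min(raw), max(raw)
        return {**data, "scaled": [(v - lo) / (hi - lo) for v in raw]}


def run_pipeline(stages: List[Stage]) -> Record:
    """Pass one dictionary through every stage, in order."""
    data: Record = {}
    for stage in stages:
        data = stage.run(data)
    return data


result = run_pipeline([ToyExporter(), ToyPreprocessor()])
print(result["scaled"])
```

Keeping each task behind the same `run` method means stages can be swapped or tested in isolation, which is one plausible reason for organising a pipeline this way.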
There are three entrypoints to the pipeline:
* [run.py](run.py)
* [notebooks](notebooks/docs)
* [scripts](scripts/README.md)

A blog post describing the goals and design of the pipeline can be found
[here](https://medium.com/@gabrieltseng/a-machine-learning-pipeline-for-climate-research-ebf83b2b349a).

View the initial presentation of our pipeline [here](https://www.youtube.com/watch?v=QVFiGERCiYs).

[Anaconda](https://www.anaconda.com/download/#macos) running Python 3.7 is used as the package manager. To get set up
with an environment, install Anaconda from the link above, and (from this directory) run

```bash
conda env create -f environment.yml
```

This will create an environment named `esowc-drought` with all the necessary packages to run the code. To
activate this environment, run

```bash
conda activate esowc-drought
```

[Docker](https://www.docker.com/) can also be used to run this code. To do this, first
start Docker, either through [Docker Desktop](https://www.docker.com/products/docker-desktop)
or by configuring `docker-machine`:

```bash
# on macOS
brew install docker-machine docker

docker-machine create --driver virtualbox default
docker-machine env default
```

See [here](https://stackoverflow.com/a/33596140/9940782) for help on all machines or [here](https://stackoverflow.com/a/49719638/9940782)
for macOS.

Then build the Docker image:

```bash
docker build -t ml_drought .
```

Then use it to run a container, mounting the data folder into the container:

```bash
docker run -it \
--mount type=bind,source=,target=/ml_drought/data \
ml_drought /bin/bash
```

You will also need to create a `.cdsapirc` file with the following information:

```bash
url: https://cds.climate.copernicus.eu/api/v2
key:
verify: 1
```

This pipeline can be tested by running `pytest`. [flake8](http://flake8.pycqa.org) is used for linting.
We use [mypy](https://github.com/python/mypy) for type checking, which can be run with `mypy src` (this checks the `src` directory).
We use [black](https://black.readthedocs.io/en/stable/) for code formatting.
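
The four tools above can also be run in one pass. The following is a minimal sketch, not part of the repository: `pytest` and `mypy src` come from the text above, while the `flake8 src` and `black --check src` arguments, and the names `CHECKS`, `_run` and `run_checks`, are assumptions made for illustration.

```python
import subprocess
from typing import Callable, List, Sequence

# `pytest` and `mypy src` are the commands named in this README; the flake8
# and black arguments are assumptions and may need adjusting.
CHECKS: List[List[str]] = [
    ["pytest"],
    ["flake8", "src"],            # assumed target directory
    ["mypy", "src"],
    ["black", "--check", "src"],  # assumed: report only, do not rewrite files
]


def _run(cmd: Sequence[str]) -> int:
    """Run one command and return its exit code (127 if it is not installed)."""
    try:
        return subprocess.run(list(cmd)).returncode
    except FileNotFoundError:
        return 127


def run_checks(
    checks: Sequence[Sequence[str]] = CHECKS,
    runner: Callable[[Sequence[str]], int] = _run,
) -> List[str]:
    """Run every check in order and return the names of those that failed."""
    failed = []
    for cmd in checks:
        if runner(cmd) != 0:
            failed.append(cmd[0])
    return failed
```

Calling `run_checks()` from the repository root inside the `esowc-drought` environment returns an empty list when every tool passes. The `runner` argument exists only so the loop can be exercised without invoking the real tools.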
__Team:__ [@tommylees112](https://github.com/tommylees112), [@gabrieltseng](https://github.com/gabrieltseng)
For updates, follow [@tommylees112](https://twitter.com/tommylees112) on Twitter or look out for our blog posts!
- [Blog 1: Great News!](https://tommylees112.github.io/posts/2019/1/esowc_kick_off)
- [Blog 2: The Pipeline](https://medium.com/@gabrieltseng/a-machine-learning-pipeline-for-climate-research-ebf83b2b349a)
- [Blog 3: The Close of the Project!](https://tommylees112.github.io/posts/2019/2/esowc_final)

## Acknowledgements
This was a project completed as part of the ECMWF Summer of Weather Code [Challenge #12](https://github.com/esowc/challenges_2019/issues/14). The challenge was set up to use [ECMWF/Copernicus open datasets](https://cds.climate.copernicus.eu/#!/home) to evaluate machine learning techniques for the **prediction of droughts**.

Huge thanks to @ECMWF for making this project possible!