https://github.com/ECMWFCode4Earth/ml_drought

Machine learning to better predict and understand drought. Moving to github.com/ml-clim
2019 copernicus machine-learning

Host: GitHub
URL: https://github.com/ECMWFCode4Earth/ml_drought
Owner: ECMWFCode4Earth
Created: 2019-05-01T16:05:56.000Z (almost 6 years ago)
Default Branch: master
Last Pushed: 2022-05-18T18:41:07.000Z (almost 3 years ago)
Last Synced: 2024-11-21T03:15:01.946Z (5 months ago)
Topics: 2019, copernicus, machine-learning
Language: Jupyter Notebook
Homepage: https://ml-clim.github.io/drought-prediction/
Size: 309 MB
Stars: 93
Watchers: 8
Forks: 18
Open Issues: 42
Metadata Files:
- Readme: README.md

Awesome Lists containing this project

open-sustainable-technology - ml_drought - A Machine Learning Pipeline to Predict Vegetation Health. (Natural Resources / Soil and Land)

README

        
[![Build Status](https://travis-ci.com/esowc/ml_drought.svg?branch=master)](https://travis-ci.com/esowc/ml_drought)

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/esowc/ml_drought/blob/master/notebooks/docs/Pipeline.ipynb)

# A Machine Learning Pipeline for Climate Science

This repository is an end-to-end pipeline for the creation, intercomparison and evaluation of machine learning methods in climate science.

The pipeline carries out a number of tasks to create a unified-data format for training and testing machine learning methods.

These tasks are split across the classes defined in the `src` folder.



NOTE: Using this pipeline requires some basic working knowledge of Python, but nothing advanced.

## Using the Pipeline 

There are three entrypoints to the pipeline:

* [run.py](run.py)

* [notebooks](notebooks/docs)

* [scripts](scripts/README.md)

A blog post describing the goals and design of the pipeline can be found [here](https://medium.com/@gabrieltseng/a-machine-learning-pipeline-for-climate-research-ebf83b2b349a).

View the initial presentation of our pipeline [here](https://www.youtube.com/watch?v=QVFiGERCiYs).

## Setup 

[Anaconda](https://www.anaconda.com/download/#macos) running Python 3.7 is used as the package manager. To set up an environment, install Anaconda from the link above and, from this directory, run

```bash
conda env create -f environment.yml
```

This will create an environment named `esowc-drought` with all the packages needed to run the code. To activate this environment, run

```bash
conda activate esowc-drought
```
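As a quick sanity check after activation, the sketch below compares the active environment and interpreter against the values stated above. The helper name `environment_problems` is hypothetical and not part of this repository; the only external fact it relies on is that `conda activate` sets the `CONDA_DEFAULT_ENV` environment variable.

```python
import os
import sys

EXPECTED_ENV = "esowc-drought"  # environment name given in the text above
EXPECTED_PYTHON = (3, 7)        # Python version given in the text above


def environment_problems(env_name, python_version):
    """Return a list of readable problems; an empty list means the setup looks right."""
    problems = []
    if env_name != EXPECTED_ENV:
        problems.append(
            f"active conda environment is {env_name!r}, expected {EXPECTED_ENV!r}"
        )
    if tuple(python_version[:2]) != EXPECTED_PYTHON:
        found = ".".join(str(part) for part in python_version[:2])
        problems.append(f"Python {found} is running, expected 3.7")
    return problems


if __name__ == "__main__":
    # conda sets CONDA_DEFAULT_ENV to the name of the active environment.
    active = os.environ.get("CONDA_DEFAULT_ENV")
    for problem in environment_problems(active, sys.version_info):
        print("WARNING:", problem)
```

If the script prints nothing, the expected environment and Python version are active.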

[Docker](https://www.docker.com/) can also be used to run this code. To do this, first start Docker, either with [Docker Desktop](https://www.docker.com/products/docker-desktop) or by configuring `docker-machine`:

```bash
# on macOS
brew install docker-machine docker
docker-machine create --driver virtualbox default
docker-machine env default
```

See [here](https://stackoverflow.com/a/33596140/9940782) for help on all machines or [here](https://stackoverflow.com/a/49719638/9940782) for macOS.

Then build the docker image:

```bash
docker build -t ml_drought .
```

Then use it to run a container, mounting the data folder into the container (`source` must be set to the absolute path of your local data folder):

```bash
docker run -it \
--mount type=bind,source=,target=/ml_drought/data \
ml_drought /bin/bash
```
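Because a bind mount needs an absolute host path, it can help to build the command programmatically. The following is a minimal sketch, assuming the image name and mount target shown above; the function `docker_run_command` and the example path `data` are hypothetical and not part of this repository.

```python
import shlex
from pathlib import Path


def docker_run_command(data_dir, image="ml_drought"):
    """Build the `docker run` argument list for a given local data folder."""
    # Bind mounts need an absolute host path, so resolve relative input first.
    source = Path(data_dir).expanduser().resolve()
    return [
        "docker", "run", "-it",
        "--mount", f"type=bind,source={source},target=/ml_drought/data",
        image, "/bin/bash",
    ]


if __name__ == "__main__":
    # "data" is a placeholder; substitute the folder that holds your data.
    print(" ".join(shlex.quote(arg) for arg in docker_run_command("data")))
```

The printed line can be pasted into a shell, or the list can be passed directly to `subprocess.run`.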

You will also need to create a `.cdsapirc` file in your home directory with the following information, where `key` is your CDS API key:

```bash
url: https://cds.climate.copernicus.eu/api/v2
key: 
verify: 1
```
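The CDS API client reads this file itself, so no parsing code is needed to use the pipeline. Still, a small check can catch an empty `key` before a download fails. The sketch below assumes the simple `name: value` layout shown above; `parse_cdsapirc` and `missing_settings` are hypothetical helper names, not part of this repository or of the CDS API client.

```python
from pathlib import Path


def parse_cdsapirc(text):
    """Parse `name: value` lines into a dict, skipping lines without a colon."""
    settings = {}
    for line in text.splitlines():
        if ":" not in line:
            continue
        # Split on the first colon only, because the URL contains colons too.
        name, value = line.split(":", 1)
        settings[name.strip()] = value.strip()
    return settings


def missing_settings(settings, required=("url", "key")):
    """Return the required names that are absent or empty."""
    return [name for name in required if not settings.get(name)]


if __name__ == "__main__":
    rc_file = Path.home() / ".cdsapirc"
    if rc_file.exists():
        missing = missing_settings(parse_cdsapirc(rc_file.read_text()))
        print("missing settings:", missing if missing else "none")
    else:
        print(f"{rc_file} does not exist")
```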

### Testing  

This pipeline can be tested by running `pytest`. [flake8](http://flake8.pycqa.org) is used for linting.

We use [mypy](https://github.com/python/mypy) for type checking; run `mypy src` to check the `src` directory.

We use [black](https://black.readthedocs.io/en/stable/) for code formatting.
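The four tools can be chained in one helper, sketched below. Only `pytest` and `mypy src` come from the text above; running `flake8` with no arguments and `black --check .` are assumptions about how the other two tools would be invoked here. The name `run_checks` is hypothetical.

```python
import subprocess

# Commands run from the repository root. The flake8 and black invocations
# are assumptions; pytest and `mypy src` are taken from the text above.
CHECKS = [
    ["pytest"],
    ["flake8"],
    ["mypy", "src"],
    ["black", "--check", "."],
]


def run_checks(run=subprocess.call):
    """Run every check in order and return the commands that exited non-zero."""
    failed = []
    for command in CHECKS:
        if run(command) != 0:
            failed.append(command)
    return failed
```

Calling `run_checks()` from the repository root runs the real tools and returns an empty list when everything passes. The `run` parameter exists so that a stub can be passed in when the tools are not installed.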

__Team:__ [@tommylees112](https://github.com/tommylees112), [@gabrieltseng](https://github.com/gabrieltseng)

For updates, follow [@tommylees112](https://twitter.com/tommylees112) on Twitter or look out for our blog posts!

- [Blog 1: Great News!](https://tommylees112.github.io/posts/2019/1/esowc_kick_off)

- [Blog 2: The Pipeline](https://medium.com/@gabrieltseng/a-machine-learning-pipeline-for-climate-research-ebf83b2b349a)

- [Blog 3: The Close of the Project!](https://tommylees112.github.io/posts/2019/2/esowc_final)

## Acknowledgements 

This project was completed as part of the ECMWF Summer of Weather Code [Challenge #12](https://github.com/esowc/challenges_2019/issues/14). The challenge was set up to use [ECMWF/Copernicus open datasets](https://cds.climate.copernicus.eu/#!/home) to evaluate machine learning techniques for the **prediction of droughts**.

Huge thanks to @ECMWF for making this project possible!
