https://github.com/sentinel-hub/cv4a-iclr-2020-starter-notebooks
Starter notebooks using eo-learn for the CV4A workshop at ICLR 2020
- Host: GitHub
- URL: https://github.com/sentinel-hub/cv4a-iclr-2020-starter-notebooks
- Owner: sentinel-hub
- Created: 2020-03-02T08:12:07.000Z (about 5 years ago)
- Default Branch: master
- Last Pushed: 2020-03-10T16:10:02.000Z (about 5 years ago)
- Last Synced: 2025-03-27T15:52:16.414Z (about 1 month ago)
- Language: Jupyter Notebook
- Size: 2.95 MB
- Stars: 6
- Watchers: 4
- Forks: 4
- Open Issues: 2
Metadata Files:
- Readme: README.md
README
# cv4a-iclr-2020-starter-notebooks
Repository containing notebooks to get started on the [CV4A challenge at ICLR 2020](https://zindi.africa/competitions/iclr-workshop-challenge-2-radiant-earth-computer-vision-for-crop-recognition) using [`eo-learn`](https://github.com/sentinel-hub/eo-learn#eo-learn).
## Content
The [`cv4a-crop-challenge-to-eolearn`](./cv4a-crop-challenge-to-eolearn.ipynb) notebook converts the data provided as `.tiff` files into smaller `EOPatch` format files. This makes it easier to handle and visualise the data as rasters, and to apply processing pipelines.
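As a rough illustration of the "smaller files" idea, the sketch below splits a stacked raster time series into square spatial tiles using plain NumPy. It is not the notebook's code and does not use the `eo-learn` API: the function name `split_into_patches`, the tile size and the synthetic array are assumptions made for this example. Only the `(time, height, width, bands)` layout follows the convention `EOPatch` uses for its data features.

```python
import numpy as np


def split_into_patches(raster, patch_size):
    """Split a (time, height, width, bands) array into square spatial tiles.

    Returns a dict mapping (row, col) tile indices to array views.
    Tiles on the bottom and right edges may be smaller than patch_size.
    """
    _, height, width, _ = raster.shape
    patches = {}
    for i, top in enumerate(range(0, height, patch_size)):
        for j, left in enumerate(range(0, width, patch_size)):
            patches[(i, j)] = raster[:, top:top + patch_size, left:left + patch_size, :]
    return patches


# Synthetic stand-in for the challenge data: 3 dates, 10x7 pixels, 2 bands.
raster = np.arange(3 * 10 * 7 * 2, dtype=np.float32).reshape(3, 10, 7, 2)
patches = split_into_patches(raster, patch_size=4)

for index, patch in sorted(patches.items()):
    print(index, patch.shape)
```

Each tile keeps the full time and band dimensions, so a per-tile processing pipeline can run without loading the whole scene.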
The [`cv4a-process-and-train`](./cv4a-process-and-train.ipynb) notebook shows how to set up a processing pipeline on `EOPatch` objects, with steps such as cloud masking and feature interpolation. This way, different pre-processing methods can be tested quickly.
The pipeline shown in the notebook includes:
* conversion of cloud probabilities to cloud masks;
* dilation of cloud masks;
* `NDVI` computation;
* linear interpolation to fill missing values;
* utility tasks to get insights into data.
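The first four steps above can be sketched with plain NumPy on a small synthetic time series. This is an illustration of the operations only, not the notebook's `eo-learn` tasks: the function names, the `0.4` probability threshold, the 3x3 dilation window and the synthetic reflectance values are all assumptions made for this example.

```python
import numpy as np


def cloud_mask_from_probability(cloud_prob, threshold=0.4):
    """Boolean mask that is True where cloud probability exceeds the (assumed) threshold."""
    return cloud_prob > threshold


def dilate_mask(mask, iterations=1):
    """Grow a (time, height, width) boolean mask by one pixel per iteration (3x3 window)."""
    out = mask.copy()
    height, width = out.shape[1:]
    for _ in range(iterations):
        padded = np.pad(out, ((0, 0), (1, 1), (1, 1)), mode="constant", constant_values=False)
        grown = np.zeros_like(out)
        for dy in range(3):
            for dx in range(3):
                grown |= padded[:, dy:dy + height, dx:dx + width]
        out = grown
    return out


def compute_ndvi(red, nir, eps=1e-9):
    """NDVI = (NIR - red) / (NIR + red); eps avoids division by zero."""
    return (nir - red) / (nir + red + eps)


def interpolate_in_time(values, invalid):
    """Linearly interpolate invalid values along the time axis, pixel by pixel.

    np.interp holds the nearest valid value constant outside the valid range.
    Pixels with no valid observation at all are left unchanged.
    """
    t = np.arange(values.shape[0])
    filled = values.astype(float).copy()
    for y in range(values.shape[1]):
        for x in range(values.shape[2]):
            bad = invalid[:, y, x]
            if bad.any() and not bad.all():
                filled[bad, y, x] = np.interp(t[bad], t[~bad], values[~bad, y, x])
    return filled


# Synthetic data: 4 dates, 3x3 pixels, one cloudy pixel in the centre of date 1.
red = np.full((4, 3, 3), 0.1)
nir = np.broadcast_to(np.array([0.3, 0.4, 0.5, 0.6])[:, None, None], (4, 3, 3)).copy()
cloud_prob = np.zeros((4, 3, 3))
cloud_prob[1, 1, 1] = 0.9

mask = dilate_mask(cloud_mask_from_probability(cloud_prob))
ndvi = compute_ndvi(red, nir)
ndvi_filled = interpolate_in_time(ndvi, mask)
```

Dilation is applied before interpolation so that pixels at the fuzzy edge of a cloud are also replaced by values interpolated from neighbouring dates.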
Features are then aggregated and used to train/evaluate a machine learning model.
In this starter notebook, an untuned random forest classifier was trained on the temporal features, achieving a public score of **1.26628**. The [`SampleSubmission.csv`](./SampleSubmission.csv) template file is included for completeness.
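A minimal sketch of the aggregate-then-train step follows, assuming per-field mean NDVI time series as features and scikit-learn's `RandomForestClassifier` with near-default settings. The synthetic fields, the helper `aggregate_field_features` and the use of log loss as the evaluation metric are assumptions made for this example, so the printed score is not comparable to the public score quoted above.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import log_loss
from sklearn.model_selection import train_test_split


def aggregate_field_features(ndvi, field_ids):
    """Average NDVI over the pixels of each field.

    ndvi: (time, height, width) floats; field_ids: (height, width) ints, 0 = no field.
    Returns (ids, features) where features has shape (n_fields, time).
    """
    ids = np.unique(field_ids[field_ids > 0])
    features = np.stack([ndvi[:, field_ids == fid].mean(axis=1) for fid in ids])
    return ids, features


# Synthetic scene: 100 fields of 2x2 pixels, 8 dates, 3 hypothetical crop classes
# whose NDVI peaks at different dates.
rng = np.random.default_rng(0)
n_dates = 8
grid = np.arange(1, 101).reshape(10, 10)
field_ids = np.repeat(np.repeat(grid, 2, axis=0), 2, axis=1)  # (20, 20)
crop_of_pixel = field_ids % 3
t = np.arange(n_dates)[:, None, None]
peak = 2 + 2 * crop_of_pixel[None]
ndvi = np.exp(-0.5 * ((t - peak) / 1.5) ** 2) + rng.normal(0, 0.05, size=(n_dates, 20, 20))

ids, X = aggregate_field_features(ndvi, field_ids)
y = ids % 3

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)
model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)

proba = model.predict_proba(X_test)
accuracy = model.score(X_test, y_test)
print("held-out log loss:", log_loss(y_test, proba, labels=model.classes_))
```

Averaging over a field's pixels gives one sample per field, which keeps pixels of the same field from ending up in both the training and the evaluation split.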
## Requirements
The notebooks assume that the data has been downloaded according to the challenge instructions. Set the path to the data in the notebooks as `ROO_DATA_DIR`.
Installing `eo-learn` according to [instructions](https://github.com/sentinel-hub/eo-learn#pypi-distribution) should cover all dependencies used in the notebooks.
## Improvements
As already noted by the organisers and participants, the following additions should improve the performance and generalisation of the methods:
* dealing with class imbalance (e.g. over-sampling, under-sampling, SMOTE);
* feature analysis and engineering, adding domain-specific indices. You can check [this repo](https://github.com/sentinel-hub/custom-scripts#sentinel-2) for inspiration on vegetation indices derived from Sentinel-2 data;
* using ML methods that better characterise the temporal evolution of crops.
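As an example of the first suggestion, the sketch below implements plain random over-sampling with NumPy: minority-class samples are duplicated at random until every class matches the largest one. The function name `random_oversample` and the toy data are assumptions made for this example. SMOTE, which synthesises new samples instead of duplicating existing ones, is not shown here.

```python
import numpy as np


def random_oversample(X, y, seed=0):
    """Duplicate minority-class rows at random until all classes have equal counts."""
    rng = np.random.default_rng(seed)
    classes, counts = np.unique(y, return_counts=True)
    target = counts.max()
    indices = []
    for cls, count in zip(classes, counts):
        cls_idx = np.flatnonzero(y == cls)
        # Draw the missing samples with replacement from the existing ones.
        extra = rng.choice(cls_idx, size=target - count, replace=True)
        indices.append(np.concatenate([cls_idx, extra]))
    order = rng.permutation(np.concatenate(indices))
    return X[order], y[order]


# Toy data: 10 samples, 3 classes with counts 7, 2 and 1.
X = np.arange(20).reshape(10, 2)
y = np.array([0] * 7 + [1] * 2 + [2] * 1)

X_res, y_res = random_oversample(X, y)
print(np.bincount(y_res))
```

Over-sampling should be applied to the training split only, after the train/validation split, so that duplicated samples do not leak into the evaluation data.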
Good luck to all.