https://github.com/gher-uliege/DINCAE

DINCAE (Data-Interpolating Convolutional Auto-Encoder) is a neural network to reconstruct missing data in satellite observations.
https://github.com/gher-uliege/DINCAE

interpolation netcdf4 neural-network oceanography python remote-sensing satellite-observations tensorflow


Host: GitHub
URL: https://github.com/gher-uliege/DINCAE
Owner: gher-uliege
License: gpl-3.0
Created: 2019-06-21T10:33:01.000Z (over 5 years ago)
Default Branch: master
Last Pushed: 2024-04-18T11:30:30.000Z (9 months ago)
Last Synced: 2024-12-24T15:51:41.274Z (24 days ago)
Topics: interpolation, netcdf4, neural-network, oceanography, python, remote-sensing, satellite-observations, tensorflow
Language: Python
Homepage:
Size: 206 KB
Stars: 47
Watchers: 9
Forks: 19
Open Issues: 4
Metadata Files:
- Readme: README.md
- License: LICENSE.md

README

[![documentation latest](https://img.shields.io/badge/docs-latest-blue.svg)](https://gher-uliege.github.io/DINCAE/)

[![DOI](https://zenodo.org/badge/193079989.svg)](https://zenodo.org/badge/latestdoi/193079989)

[![codecov.io](http://codecov.io/github/gher-ulg/DINCAE/coverage.svg?branch=master)](http://codecov.io/github/gher-ulg/DINCAE?branch=master)

# DINCAE

DINCAE (Data-Interpolating Convolutional Auto-Encoder) is a neural network to reconstruct missing data in satellite observations. It is described in the following open access paper:

https://doi.org/10.5194/gmd-13-1609-2020

*Note that this code is no longer maintained and has been superseded by https://github.com/gher-ulg/DINCAE.jl*

## Installation

Python 3.6 or 3.7 with the modules:

* numpy (https://docs.scipy.org/doc/numpy/user/install.html)

* netCDF4 (https://unidata.github.io/netcdf4-python/netCDF4/index.html)

* TensorFlow 1.15 with GPU support (https://www.tensorflow.org/install)

Tested versions:

* Python 3.6.8

* netcdf4 1.4.2

* numpy 1.15.4

* TensorFlow 1.15 (DINCAE does not work with TensorFlow 2.0; TensorFlow 1.15 does not work on Python 3.8)

You can install those packages either with `pip3` or with `conda`.

## Documentation

The documentation is available at https://gher-uliege.github.io/DINCAE/.

## Input format

The input data should be in netCDF with the variables:

* `lon`: longitude (degrees East)

* `lat`: latitude (degrees North)

* `time`: time (days since 1900-01-01 00:00:00)

* `mask`: boolean mask where true means the data location is valid

* `SST` (or any other variable name): the data

This is an example of the output from `ncdump -h`:

```
netcdf avhrr_sub_add_clouds {
dimensions:
	time = UNLIMITED ; // (5266 currently)
	lat = 112 ;
	lon = 112 ;
variables:
	double lon(lon) ;
	double lat(lat) ;
	double time(time) ;
		time:units = "days since 1900-01-01 00:00:00" ;
	int mask(lat, lon) ;
	float SST(time, lat, lon) ;
		SST:_FillValue = -9999.f ;
}
```

Examples of how to create this file are available in the examples folder:

* [python example](https://github.com/gher-ulg/DINCAE/blob/master/examples/create_input_file.py)

* [julia example](https://github.com/gher-ulg/DINCAE/blob/master/examples/create_input_file.jl)
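
Before writing such a file, the time axis, the mask and the fill value have to be prepared in the form listed above. The following is a minimal sketch of that preparation using only `numpy` and the standard library; it is not taken from the linked examples. The helper names `to_days_since_epoch` and `prepare_fields` are hypothetical, and the rule used to build the mask (a grid point is valid if it is observed at least once) is an assumption made for illustration.

```python
from datetime import datetime, timedelta

import numpy as np

EPOCH = datetime(1900, 1, 1)  # reference date from the `time:units` attribute
FILL_VALUE = -9999.0          # matches `SST:_FillValue` in the header above


def to_days_since_epoch(dates):
    """Convert datetime objects to fractional days since 1900-01-01 00:00:00."""
    return np.array([(d - EPOCH) / timedelta(days=1) for d in dates], dtype="float64")


def prepare_fields(sst):
    """Return (mask, filled) for an array shaped (time, lat, lon) with NaN for missing data."""
    # Assumption for this sketch: a grid point is valid if it is observed at least once.
    mask = np.any(np.isfinite(sst), axis=0).astype("int32")
    # Missing values are replaced by the fill value declared in the file.
    filled = np.where(np.isfinite(sst), sst, FILL_VALUE).astype("float32")
    return mask, filled


dates = [datetime(2000, 1, 1), datetime(2000, 1, 2, 12)]
time = to_days_since_epoch(dates)

sst = np.full((2, 3, 3), np.nan)
sst[0, 1, 1] = 18.5  # a single valid observation
mask, filled = prepare_fields(sst)
```

The resulting `time`, `mask` and `filled` arrays have the types shown in the `ncdump -h` output (`double`, `int` and `float`) and can then be written with a netCDF library.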

## Running DINCAE

Copy the template file `run_DINCAE.py` and adapt the filename, the variable name and the output directory. Optional arguments for the reconstruction method can also be set, as described in the [documentation](https://gher-ulg.github.io/DINCAE/).

The code can be run as follows:

```bash
python3 run_DINCAE.py
```

The output NetCDF files contain the variables:

* `meandata`: the time average of the input data used to compute the anomalies

* `mean_rec` and `sigma_rec`: the mean and standard deviation of the Gaussian probability distribution function of the reconstruction.
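
Since the reconstruction at each grid point is described by a Gaussian, an uncertainty interval can be derived directly from `mean_rec` and `sigma_rec`. The sketch below is an illustration only: the function name `prediction_interval` and the sample values are hypothetical, and the arrays are assumed to have been read from an output file beforehand.

```python
from statistics import NormalDist

import numpy as np


def prediction_interval(mean_rec, sigma_rec, level=0.95):
    """Return (lower, upper) bounds of the central interval of a Gaussian."""
    # Quantile of the standard normal distribution for the requested level.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    mean_rec = np.asarray(mean_rec, dtype="float64")
    sigma_rec = np.asarray(sigma_rec, dtype="float64")
    return mean_rec - z * sigma_rec, mean_rec + z * sigma_rec


# Hypothetical values for two grid points.
lower, upper = prediction_interval([18.0, 20.0], [0.5, 1.0])
```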

In [Barth et al., 2020](https://doi.org/10.5194/gmd-13-1609-2020) the best results were obtained by averaging all the NetCDF files.
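
One way to average the reconstructions is sketched below. It assumes that the `mean_rec` field of every output file has already been loaded into an array of the same shape; the function name `average_reconstructions` is hypothetical, and how `sigma_rec` should be combined across files is not stated here, so the sketch leaves it out.

```python
import numpy as np


def average_reconstructions(fields):
    """Average a sequence of arrays of identical shape, ignoring missing values."""
    # Mask NaN/inf entries so that they do not contaminate the average.
    stacked = np.ma.stack([np.ma.masked_invalid(f) for f in fields])
    return stacked.mean(axis=0)


# Two hypothetical reconstructions of a 1 x 2 domain.
rec_a = np.array([[1.0, 2.0]])
rec_b = np.array([[3.0, np.nan]])
avg = average_reconstructions([rec_a, rec_b])
```

Grid points that are missing in every input remain masked in the result.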

## Reducing GPU memory

Convolutional neural networks can require a lot of GPU memory. The following changes can reduce GPU memory utilisation:

* reduce the mini-batch size

* use fewer layers (e.g. `enc_nfilter_internal` = [16,24,36] or [16,24])

* use fewer filters (reduce the values of the optional parameter `enc_nfilter_internal`)

* reduce `frac_dense_layer`, a parameter controlling the width of the dense layer in the bottleneck

* use a smaller domain or lower resolution
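
To see why these settings matter, the size of the encoder feature maps can be estimated roughly. The sketch below is an illustrative approximation, not a description of the actual DINCAE architecture: it assumes that each convolutional layer is followed by 2x2 pooling, that values are stored as 32-bit floats, and it ignores weights, gradients and the decoder. The function name `activation_megabytes` and the batch sizes are hypothetical; the 112 x 112 domain comes from the example file above.

```python
def activation_megabytes(batch_size, ny, nx, nfilters, bytes_per_value=4):
    """Rough size in MB of the encoder feature maps for one mini-batch."""
    total_values = 0
    for nf in nfilters:
        # Output of one convolutional layer: one map per filter and per sample.
        total_values += batch_size * ny * nx * nf
        # Assumption: 2x2 pooling halves the resolution after each layer.
        ny, nx = ny // 2, nx // 2
    return total_values * bytes_per_value / 1e6


# Effect of the mini-batch size (hypothetical values).
large_batch = activation_megabytes(50, 112, 112, [16, 24, 36])
small_batch = activation_megabytes(25, 112, 112, [16, 24, 36])

# Effect of the number of layers.
more_layers = activation_megabytes(50, 112, 112, [16, 24, 36])
fewer_layers = activation_megabytes(50, 112, 112, [16, 24])
```

In this simplified estimate the memory scales linearly with the mini-batch size and with the number of grid points, and the first layers dominate because they operate at the highest resolution.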

## Example results

[Link to animation](http://data-assimilation.net/upload/Alex/DINCAE/data-avg-DINCAE-AVHRR.gif)

More information about this result is given in the [linked paper](https://www.geosci-model-dev-discuss.net/gmd-2019-128/).