https://github.com/simonpf/chimp_smhi
CHIMP retrievals for SMHI.
- Host: GitHub
- URL: https://github.com/simonpf/chimp_smhi
- Owner: simonpf
- Created: 2024-01-18T14:38:04.000Z (over 2 years ago)
- Default Branch: main
- Last Pushed: 2025-03-25T13:40:48.000Z (about 1 year ago)
- Last Synced: 2026-04-18T22:00:07.280Z (about 2 months ago)
- Language: Python
- Size: 4.71 MB
- Stars: 2
- Watchers: 3
- Forks: 1
- Open Issues: 0
# CHIMP retrievals for SMHI
This repository provides preprocessing functionality and instructions for
running a CHIMP retrieval on SEVIRI observations in HRIT format.
## Installation
All software required for running a specific version of the retrievals is listed in the
`chimp_smhi_<version>.yml` file (where `<version>` can be `v0`, for instance), which defines a conda environment named
`chimp_smhi_<version>`. To install and activate it, run (here for version `v1`):
``` shellsession
conda env create -f chimp_smhi_v1.yml
conda activate chimp_smhi_v1
```
### Downloading the model file
The retrieval models can be downloaded from:
- Version 0: [https://rain.atmos.colostate.edu/gprof_nn/chimp/chimp_smhi_v0.pt](https://rain.atmos.colostate.edu/gprof_nn/chimp/chimp_smhi_v0.pt).
- Version 1: [https://rain.atmos.colostate.edu/gprof_nn/chimp/chimp_smhi_v1.pt](https://rain.atmos.colostate.edu/gprof_nn/chimp/chimp_smhi_v1.pt).
- Version 2: [https://rain.atmos.colostate.edu/gprof_nn/chimp/chimp_smhi_v2.pt](https://rain.atmos.colostate.edu/gprof_nn/chimp/chimp_smhi_v2.pt).
- Version 3: [https://huggingface.co/simonpf/chimp_smhi](https://huggingface.co/simonpf/chimp_smhi).
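
The version 0 to 2 model files follow a single URL pattern, so fetching them can be scripted. The following is a minimal sketch using only the Python standard library. The helper names `model_url` and `download_model` are illustrative and not part of this repository, and the version 3 models hosted on Hugging Face are not covered.

```python
from pathlib import Path
from urllib.request import urlretrieve

# Base URL taken from the download links listed above.
BASE_URL = "https://rain.atmos.colostate.edu/gprof_nn/chimp"


def model_url(version: str) -> str:
    """Return the download URL for a model version: 'v0', 'v1' or 'v2'."""
    if version not in ("v0", "v1", "v2"):
        raise ValueError(f"No direct download URL known for version {version!r}")
    return f"{BASE_URL}/chimp_smhi_{version}.pt"


def download_model(version: str, target_dir: str = ".") -> Path:
    """Download the model file into target_dir and return its local path."""
    url = model_url(version)
    # Keep the remote file name, e.g. chimp_smhi_v1.pt.
    target = Path(target_dir) / url.rsplit("/", 1)[-1]
    urlretrieve(url, target)
    return target
```

Calling `download_model("v1")` would then save `chimp_smhi_v1.pt` to the current directory.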
## Running retrievals
Running CHIMP retrievals on SEVIRI files in HRIT format involves two steps. First, the SEVIRI input files must be converted to the input data format expected by CHIMP. Second, the converted input files must be processed using the ``chimp`` command.
### Extracting the CHIMP input data
The ``hrit2chimp.py`` script implements a command line application to convert all SEVIRI files in a given input folder to corresponding CHIMP input files. It can be used as follows.
``` shellsession
python hrit2chimp.py
```
The script combines the observations from all SEVIRI channels and writes them into a single CHIMP input file. The SEVIRI input files are written to a ``seviri`` subfolder of the output folder, since `chimp` expects the inputs from all sensors to be organized into respective subfolders.
### Running CHIMP
Assuming ``hrit2chimp.py`` has been used to write the CHIMP input files to an output folder, the retrieval can be run using:
``` shellsession
chimp process -v seviri --device cpu
```
For the sequence-based models, the ``process`` command must also specify the number of input steps using the ``--sequence_length`` option.
``` shellsession
chimp process -v seviri --device cpu --sequence_length 16
```
> ***NOTE:*** The conda environment contains the CPU-only version of PyTorch. Therefore, retrievals can only be run on the CPU. Since ``chimp`` runs retrievals on the GPU by default, the ``--device cpu`` flag must be passed whenever ``chimp`` is invoked.
## Model versions
### ``chimp_smhi_v0``
- ResNeXt architecture with 5M parameters
- Trained on 1 year of collocations
- Scene size 128
### ``chimp_smhi_v1``
- EfficientNet-V2 architecture with 20M parameters
- Trained on 1 year of collocations
- Scene size 256
> **NOTE:** The ``chimp_smhi_v1`` models should be run with a tile size of 256.
### ``chimp_smhi_v2``
- EfficientNet-V2 2p1 architecture with ~40M parameters
- Trained on 2 years of collocations over Europe and the Nordics
- Scene size 256
> **NOTE:** The ``chimp_smhi_v2`` models should be run with a tile size of 256 and
a sequence length of 16.
### ``chimp_smhi_v3``
There are two ``chimp_smhi`` version 3 models. The ``chimp_smhi_v3`` model processes single inputs, while the ``chimp_smhi_v3_seq`` model processes multiple inputs.
> **NOTE:** The ``chimp_smhi_v3`` model should be run with a tile size of 256.
> **NOTE:** The ``chimp_smhi_v3_seq`` model should be run with a tile size of 256 and a sequence length of 16.
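
The recommendations in the notes above can be collected into a small lookup table, which may help when scripting retrievals for several model versions. This is only a sketch: the names `RECOMMENDED_SETTINGS`, `recommended_settings` and `sequence_args` are hypothetical, `chimp_smhi_v0` is left out because no recommendation is stated for it, and only the ``--sequence_length`` option shown earlier is turned into command-line arguments.

```python
# Recommended settings collected from the notes above. None means that the
# notes state no recommendation for that setting.
RECOMMENDED_SETTINGS = {
    "chimp_smhi_v1": {"tile_size": 256, "sequence_length": None},
    "chimp_smhi_v2": {"tile_size": 256, "sequence_length": 16},
    "chimp_smhi_v3": {"tile_size": 256, "sequence_length": None},
    "chimp_smhi_v3_seq": {"tile_size": 256, "sequence_length": 16},
}


def recommended_settings(model_name):
    """Return the recommended settings for a model name listed above."""
    try:
        return RECOMMENDED_SETTINGS[model_name]
    except KeyError:
        raise ValueError(f"No recommendation recorded for {model_name!r}") from None


def sequence_args(model_name):
    """Return the --sequence_length arguments for a model, or an empty list."""
    length = recommended_settings(model_name)["sequence_length"]
    if length is None:
        return []
    return ["--sequence_length", str(length)]
```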
## Results
The results are written as NetCDF4 datasets to the provided output directory.
Currently the only retrieved variable is ``dbz_mean``. Since ``chimp``
retrievals are probabilistic, the ``_mean`` suffix is added to the variable name
to highlight that it is the expected value of the retrieved posterior distribution.
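
As a rough sketch of how the results might be inspected, the snippet below lists the files in an output directory and reads ``dbz_mean`` from one of them with `xarray`. It assumes that the result files carry a `.nc` suffix and that `xarray` with a NetCDF4 backend is installed, neither of which is stated in this README. The function names are illustrative.

```python
from pathlib import Path


def find_result_files(output_dir):
    """Return all files ending in .nc (an assumed suffix) in output_dir, sorted by name."""
    return sorted(Path(output_dir).glob("*.nc"))


def load_dbz_mean(path):
    """Open one result file and return its dbz_mean variable loaded into memory."""
    # Imported lazily so that listing files works without xarray installed.
    import xarray as xr

    with xr.open_dataset(path) as dataset:
        return dataset["dbz_mean"].load()
```

For example, `load_dbz_mean(find_result_files("results")[0])` would return the retrieved mean reflectivity from the first file in a hypothetical `results` directory.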
## Example
The animation below compares the retrieved radar reflectivity for the different model versions.
