Generating RGB photos from RAW image files with PyNET
https://github.com/aiff22/pynet
- Host: GitHub
- URL: https://github.com/aiff22/pynet
- Owner: aiff22
- License: other
- Created: 2020-02-08T19:55:47.000Z (almost 5 years ago)
- Default Branch: master
- Last Pushed: 2021-12-17T14:00:02.000Z (about 3 years ago)
- Last Synced: 2025-01-07T19:12:09.048Z (14 days ago)
- Topics: camera, computer-vision, deep-learning, image-enhancement, image-processing, image-reconstruction, image-to-image-translation, isp, mobile, photography, photos, pynet, raw, raw-to-rgb
- Language: Python
- Homepage: http://www.vision.ee.ethz.ch/~ihnatova/pynet.html
- Size: 44.9 KB
- Stars: 326
- Watchers: 11
- Forks: 71
- Open Issues: 7
Metadata Files:
- Readme: README.md
- License: LICENSE.md
README
## Replacing Mobile Camera ISP with a Single Deep Learning Model
#### 1. Overview [[Paper]](https://arxiv.org/pdf/2002.05509.pdf) [[PyTorch Implementation]](https://github.com/aiff22/PyNET-PyTorch) [[Project Webpage]](http://people.ee.ethz.ch/~ihnatova/pynet.html)
This repository provides the implementation of the RAW-to-RGB mapping approach and the PyNET CNN presented in [this paper](https://arxiv.org/abs/2002.05509). The model is trained to convert **RAW Bayer data** obtained directly from a mobile camera sensor into photos captured with a professional Canon 5D DSLR camera, thus replacing the entire hand-crafted ISP camera pipeline. The provided pre-trained PyNET model can be used to generate full-resolution **12MP photos** from RAW (DNG) image files captured using the Sony Exmor IMX380 camera sensor. More visual results of this approach for the Huawei P20 and BlackBerry KeyOne smartphones can be found [here](http://people.ee.ethz.ch/~ihnatova/pynet.html#demo).
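As a rough illustration of what the RAW-to-RGB input looks like (not the exact preprocessing performed by this repository's scripts), the sketch below packs a single-channel Bayer mosaic into a four-channel, half-resolution array, a common input layout for learned ISP models; the RGGB channel order and the 10-bit normalization are assumptions.

```python
import numpy as np

def pack_bayer(raw):
    """Pack a (H, W) Bayer mosaic into a (H/2, W/2, 4) array.

    Assumes an RGGB pattern; the channel order actually expected by PyNET
    (see dng_to_png.py) may differ.
    """
    h, w = raw.shape
    packed = np.stack([
        raw[0:h:2, 0:w:2],  # R
        raw[0:h:2, 1:w:2],  # G (red rows)
        raw[1:h:2, 0:w:2],  # G (blue rows)
        raw[1:h:2, 1:w:2],  # B
    ], axis=-1)
    return packed.astype(np.float32) / 1023.0  # assuming 10-bit sensor values

# Example: a random mosaic becomes a 4-channel half-resolution input
mosaic = np.random.randint(0, 1024, size=(2976, 3968), dtype=np.uint16)
print(pack_bayer(mosaic).shape)  # (1488, 1984, 4)
```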
#### 2. Prerequisites
- Python with the scipy, numpy, imageio and pillow packages
- [TensorFlow 1.X](https://www.tensorflow.org/install/) + [CUDA cuDNN](https://developer.nvidia.com/cudnn)
- Nvidia GPU
#### 3. First steps
- Download the pre-trained [VGG-19 model](https://polybox.ethz.ch/index.php/s/7z5bHNg5r5a0g7k) [Mirror](https://drive.google.com/file/d/0BwOLOmqkYj-jMGRwaUR2UjhSNDQ/view?usp=sharing&resourcekey=0-Ff-0HUQsoKJxZ84trhsHpA) and put it into `vgg_pretrained/` folder.
- Download the pre-trained [PyNET model](https://drive.google.com/file/d/1txsJaCbeC-Tk53TPlvVk3IPpRw1Ro3BS/view?usp=sharing) and put it into `models/original/` folder.
- Download [Zurich RAW to RGB mapping dataset](http://people.ee.ethz.ch/~ihnatova/pynet.html#dataset) and extract it into `raw_images/` folder.
This folder should contain three subfolders: `train/`, `test/` and `full_resolution/`
*Please note that Google Drive has a quota limiting the number of downloads per day. To avoid it, you can login to your Google account and press "Add to My Drive" button instead of a direct download. Please check [this issue](https://github.com/aiff22/PyNET/issues/4) for more information.*
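Before training or testing, it can save time to check that the downloads above ended up in the expected locations. A minimal sketch, assuming the default paths listed in the sections below:

```python
import os

# Paths taken from the default parameters documented further down
expected = [
    "vgg_pretrained/imagenet-vgg-verydeep-19.mat",
    "models/original/",
    "raw_images/train/",
    "raw_images/test/",
    "raw_images/full_resolution/",
]

for path in expected:
    status = "ok" if os.path.exists(path) else "MISSING"
    print(f"{status:8s} {path}")
```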
#### 4. PyNET CNN
The PyNET architecture has an inverted pyramidal shape and processes the images at **five different scales** (levels). The model is trained sequentially, starting from the lowest, 5th level, which allows good reconstruction results to be achieved at smaller image resolutions. After the bottom level is pre-trained, the same procedure is applied to the next level, until training is performed at the original resolution. Since each higher level receives **upscaled high-quality features** from the lower part of the model, it mainly learns to reconstruct the missing low-level details and to refine the results. In this work, we additionally use one transposed convolutional layer (Level 0) on top of the model that upsamples the image to its target size.
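The coarse-to-fine idea can be sketched independently of the actual network: each level upsamples the reconstruction coming from the level below and adds its own refinement, so lower levels fix the global structure and higher levels add detail. The following is only a schematic numpy illustration, not the PyNET layers themselves:

```python
import numpy as np

def upsample2x(img):
    """Nearest-neighbour 2x upsampling (stand-in for a learned upsampling layer)."""
    return img.repeat(2, axis=0).repeat(2, axis=1)

def coarse_to_fine(packed_raw, refinements):
    """Schematic inverted-pyramid reconstruction.

    refinements[i] plays the role of level (5 - i): it refines the upsampled
    output of the level below. In PyNET these are trained CNN blocks; here
    they are arbitrary callables.
    """
    out = refinements[0](packed_raw[::16, ::16])  # level 5: most downscaled input
    for refine in refinements[1:]:                # levels 4, 3, 2, 1, 0
        out = refine(upsample2x(out))             # pass upscaled features upwards
    return out

# Example with identity "refinements": six levels, output at 2x the packed input
levels = [lambda x: x] * 6
packed = np.random.rand(224, 224, 4)
print(coarse_to_fine(packed, levels).shape)  # (448, 448, 4)
```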
#### 5. Training the model
The model is trained level by level, starting from the lowest (5th) one:
```bash
python train_model.py level=<level>
```

Obligatory parameters:
>```level```: **```5, 4, 3, 2, 1, 0```**
Optional parameters and their default values:
>```batch_size```: **```50```** - batch size [small values can lead to unstable training]
>```train_size```: **```30000```** - the number of training patches randomly loaded each 1000 iterations
>```eval_step```: **```1000```** - each ```eval_step``` iterations the accuracy is computed and the model is saved
>```learning_rate```: **```5e-5```** - learning rate
>```restore_iter```: **```None```** - iteration to restore (when not specified, the last saved model for PyNET's ```level+1``` is loaded)
>```num_train_iters```: **```5K, 5K, 20K, 20K, 35K, 100K (for levels 5 - 0)```** - the number of training iterations
>```vgg_dir```: **```vgg_pretrained/imagenet-vgg-verydeep-19.mat```** - path to the pre-trained VGG-19 network
>```dataset_dir```: **```raw_images/```** - path to the folder with the **Zurich RAW to RGB dataset**

Below we provide the commands used for training the model on an Nvidia Tesla V100 GPU with 16GB of RAM. When using GPUs with a smaller amount of memory, the batch size and the number of training iterations should be adjusted accordingly:
```bash
python train_model.py level=5 batch_size=50 num_train_iters=5000
python train_model.py level=4 batch_size=50 num_train_iters=5000
python train_model.py level=3 batch_size=48 num_train_iters=20000
python train_model.py level=2 batch_size=18 num_train_iters=20000
python train_model.py level=1 batch_size=12 num_train_iters=35000
python train_model.py level=0 batch_size=10 num_train_iters=100000
```
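To launch the whole schedule unattended, the commands above can be wrapped in a small driver script; this is only a convenience sketch around ```train_model.py```, reusing the same arguments shown above:

```python
import subprocess

# (level, batch_size, num_train_iters) as used on a 16GB Tesla V100; reduce
# these values for GPUs with less memory.
schedule = [
    (5, 50, 5000),
    (4, 50, 5000),
    (3, 48, 20000),
    (2, 18, 20000),
    (1, 12, 35000),
    (0, 10, 100000),
]

for level, batch_size, iters in schedule:
    subprocess.run(
        ["python", "train_model.py", f"level={level}",
         f"batch_size={batch_size}", f"num_train_iters={iters}"],
        check=True,  # abort the schedule if one level fails
    )
```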
#### 6. Test the provided pre-trained models on full-resolution RAW image files
```bash
python test_model.py level=0 orig=true
```

Optional parameters:
>```use_gpu```: **```true```**,**```false```** - run the model on GPU or CPU
>```dataset_dir```: **```raw_images/```** - path to the folder with **Zurich RAW to RGB dataset**
#### 7. Test the obtained model on full-resolution RAW image files
```bash
python test_model.py level=<level>
```

Obligatory parameters:
>```level```: **```5, 4, 3, 2, 1, 0```**
Optional parameters:
>```restore_iter```: **```None```** - iteration to restore (when not specified, the last saved model for the specified ```level``` is loaded)
>```use_gpu```: **```true```**,**```false```** - run the model on GPU or CPU
>```dataset_dir```: **```raw_images/```** - path to the folder with **Zurich RAW to RGB dataset**
#### 8. Folder structure
>```models/``` - logs and models that are saved during the training process
>```models/original/``` - the folder with the provided pre-trained PyNET model
>```raw_images/``` - the folder with Zurich RAW to RGB dataset
>```results/``` - visual results for small image patches that are saved while training
>```results/full-resolution/``` - visual results for full-resolution RAW image data saved during the testing
>```vgg_pretrained/``` - the folder with the pre-trained VGG-19 network
>```load_dataset.py``` - python script that loads training data
>```model.py``` - PyNET implementation (TensorFlow)
>```train_model.py``` - implementation of the training procedure
>```test_model.py``` - applying the pre-trained model to full-resolution test images
>```utils.py``` - auxiliary functions
>```vgg.py``` - loading the pre-trained VGG-19 network
#### 9. Bonus files
These files can be useful for further experiments with the model / dataset:
>```dng_to_png.py``` - convert raw DNG camera files to PyNET's input format
>```evaluate_accuracy.py``` - compute PSNR and MS-SSIM scores on the Zurich RAW-to-RGB dataset for your own model
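For reference, the PSNR part of such an evaluation reduces to the computation below (a generic numpy sketch, not the code from ```evaluate_accuracy.py```; MS-SSIM requires a dedicated implementation):

```python
import numpy as np

def psnr(reference, estimate, max_value=1.0):
    """Peak signal-to-noise ratio between two images scaled to [0, max_value]."""
    mse = np.mean((reference.astype(np.float64) - estimate.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")
    return 10.0 * np.log10(max_value ** 2 / mse)

# Example with a synthetic image pair
a = np.random.rand(128, 128, 3)
b = np.clip(a + np.random.normal(scale=0.05, size=a.shape), 0.0, 1.0)
print(f"PSNR: {psnr(a, b):.2f} dB")
```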
#### 10. License
Copyright (C) 2020 Andrey Ignatov. All rights reserved.
Licensed under the [CC BY-NC-SA 4.0 (Attribution-NonCommercial-ShareAlike 4.0 International)](https://creativecommons.org/licenses/by-nc-sa/4.0/legalcode).
The code is released for academic research use only.
#### 11. Citation
```
@article{ignatov2020replacing,
title={Replacing Mobile Camera ISP with a Single Deep Learning Model},
author={Ignatov, Andrey and Van Gool, Luc and Timofte, Radu},
journal={arXiv preprint arXiv:2002.05509},
year={2020}
}
```

#### 12. Any further questions?
```
Please contact Andrey Ignatov ([email protected]) for more information
```