https://github.com/shaoanlu/munit-keras
A keras (tensorflow) reimplementation of MUNIT: Multimodal Unsupervised Image-to-Image Translation https://arxiv.org/abs/1804.04732
- Host: GitHub
- URL: https://github.com/shaoanlu/munit-keras
- Owner: shaoanlu
- License: mit
- Created: 2018-04-19T08:01:16.000Z (over 7 years ago)
- Default Branch: master
- Last Pushed: 2019-02-18T08:11:51.000Z (over 6 years ago)
- Last Synced: 2025-05-12T16:44:23.937Z (5 months ago)
- Language: Jupyter Notebook
- Homepage:
- Size: 1.55 MB
- Stars: 67
- Watchers: 11
- Forks: 14
- Open Issues: 2
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# MUNIT-keras
A keras (tensorflow) reimplementation of MUNIT: Multimodal Unsupervised Image-to-Image Translation

### [Multimodal Unsupervised Image-to-Image Translation](https://arxiv.org/abs/1804.04732)
Xun Huang, Ming-Yu Liu, Serge Belongie, Jan Kautz

### Deviation from the official implementation
1. ~~Use [group normalization](https://arxiv.org/abs/1803.08494) instead of layer normalization in upscaling blocks.~~
    - The model using group norm (group=8) failed to reconstruct edge images of the edges2shoes dataset.
2. Use the [mixup](https://arxiv.org/abs/1710.09412) technique for training.
3. Input/output size defaults to 128x128.
4. Use only 3 res blocks (instead of 4) by default in the content encoder/decoder to reduce training time.
    - However, I'm concerned that this shrinks the receptive field and so degrades output quality.
5. Upscaling blocks use conv2d with `kernel_size` = 3 instead of 4.
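
For readers unfamiliar with mixup, the blend it performs can be sketched in a few lines. This is a minimal illustration of the formula from the mixup paper (a weight drawn from Beta(alpha, alpha), then a convex combination of two samples), not the code used in this repository. The function name `mixup_blend`, the default `alpha=0.2`, and the `rng` parameter are assumptions made for the example; how this repository applies mixup inside its training loop is not described here.

```python
import random


def mixup_blend(x1, x2, alpha=0.2, rng=random):
    """Blend two samples with a Beta(alpha, alpha)-distributed weight.

    Hypothetical helper: x1 and x2 can be floats or anything that
    supports scalar multiplication and addition (e.g. NumPy arrays).
    Returns the blended sample and the weight that was drawn.
    """
    lam = rng.betavariate(alpha, alpha)  # lam lies in [0, 1]
    mixed = lam * x1 + (1.0 - lam) * x2  # convex combination of the two samples
    return mixed, lam


# Usage: blend two scalar "samples" with a seeded generator.
rng = random.Random(0)
mixed, lam = mixup_blend(0.0, 1.0, alpha=0.2, rng=rng)
```

In the original mixup formulation the same weight `lam` is also applied to the two samples' targets, which is why the sketch returns it alongside the blended sample.
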
### Environment
- [Google Colab](https://colab.research.google.com/)
### Result
- **Edges2shoes** (config. 1)
    - **Cyclic reconstruction loss weight = 1** for the first 80k iters and 0.3 for the rest.
    - Input/Output size: 64x64.
    - Training iterations: ~130k.
    - Optimization: Use the [mixup](https://arxiv.org/abs/1710.09412) technique for the first 80k iters.
- **Edges2shoes** (config. 2)
    - **Cyclic reconstruction loss weight = 10**
    - Input/Output size: 64x64.
    - Training iterations: ~70k.
    - Optimization: Use the [mixup](https://arxiv.org/abs/1710.09412) technique for the entire training process.
- The model seemed to perform better on guided translation (more detail and clearer edges) with a high reconstruction loss weight, though this has not been confirmed.
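
The iteration-dependent settings of config. 1 can be written as a small schedule function. This is a hedged sketch rather than the notebook's actual code: the function name `config1_schedule`, the dictionary keys, and the choice to switch exactly at iteration 80,000 (rather than one step earlier or later) are assumptions made for illustration.

```python
def config1_schedule(iteration, switch_iter=80_000,
                     early_weight=1.0, late_weight=0.3):
    """Return the settings for a given training iteration (config. 1).

    Hypothetical helper: before `switch_iter` the cyclic reconstruction
    loss weight is `early_weight` and mixup is on; from `switch_iter`
    onward the weight is `late_weight` and mixup is off.
    """
    early = iteration < switch_iter
    return {
        "cyc_recon_weight": early_weight if early else late_weight,
        "use_mixup": early,
    }


# Usage: query the schedule at the start and near the end of training.
start = config1_schedule(0)        # weight 1.0, mixup on
late = config1_schedule(120_000)   # weight 0.3, mixup off
```

Config. 2 needs no schedule under this reading: its weight stays at 10 and mixup stays on for all iterations.
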
### Acknowledgement
Code heavily inspired by the [official MUNIT pytorch implementation](https://github.com/NVlabs/MUNIT). Also borrows code from [eridgd](https://github.com/eridgd/AdaIN-TF/blob/master/ops.py) and [tjwei](https://github.com/tjwei/GANotebooks).