https://github.com/kimrass/pix2pix
'Pix2Pix' (Isola et al., 2017) implementation from scratch in PyTorch
cgans cityscapes facades from-scratch pix2pix pytorch
- Host: GitHub
- URL: https://github.com/kimrass/pix2pix
- Owner: KimRass
- Created: 2023-07-20T15:05:00.000Z (almost 2 years ago)
- Default Branch: main
- Last Pushed: 2024-01-25T13:05:34.000Z (over 1 year ago)
- Last Synced: 2024-01-26T08:35:58.838Z (over 1 year ago)
- Topics: cgans, cityscapes, facades, from-scratch, pix2pix, pytorch
- Language: Python
- Homepage:
- Size: 352 MB
- Stars: 2
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
- [Image-to-Image Translation with Conditional Adversarial Networks](https://github.com/KimRass/Pix2Pix/blob/main/papers/Pix2Pix)
# 1. Pre-trained Models
| Training setup | Weights |
|-|-|
| Trained on Facades for 200 epochs | [pix2pix_facades.pth](https://drive.google.com/file/d/1sSro8prPTV5MddkFohaiIqdznreAnAyU/view?usp=sharing) |
| Trained on Google Maps for 400 epochs | [pix2pix_google_maps.pth](https://drive.google.com/file/d/1_mt4K-0Z2x1DxA0f2om9VaAEFamMfROU/view?usp=sharing) |

# 2. Sampling
- [Test set of Facades dataset](https://github.com/KimRass/pix2pix_from_scratch/blob/main/generated_images/facades_test_set/)
- [Test set of Google Maps dataset](https://github.com/KimRass/pix2pix_from_scratch/blob/main/generated_images/google_maps_test_set/)

# 3. Implementation Details
## 1) Image Mean and STD
- Computing the mean and standard deviation over the training set of the Facades dataset, separately for the input images and the output images, gives the following values.
```python
FACADES_INPUT_IMG_MEAN = (0.222, 0.299, 0.745)
FACADES_INPUT_IMG_STD = (0.346, 0.286, 0.336)
FACADES_OUTPUT_IMG_MEAN = (0.478, 0.453, 0.417)
FACADES_OUTPUT_IMG_STD = (0.243, 0.235, 0.236)
```
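The README does not say how these statistics were computed. A minimal sketch of one way to obtain per-channel values, assuming the images are available as `uint8` arrays of shape `(H, W, 3)` and that the statistics are pooled over all pixels; `channel_mean_std` is a hypothetical helper, not a function from this repository:

```python
import numpy as np


def channel_mean_std(images):
    """Per-channel mean and std on the [0, 1] scale, pooled over all pixels.

    `images` is any iterable of uint8 arrays shaped (H, W, 3) (an assumption).
    Sums are accumulated image by image, so the whole dataset never has to
    sit in memory at once.
    """
    n_pixels = 0
    total = np.zeros(3, dtype=np.float64)
    total_sq = np.zeros(3, dtype=np.float64)
    for img in images:
        x = img.astype(np.float64).reshape(-1, 3) / 255.0
        n_pixels += x.shape[0]
        total += x.sum(axis=0)
        total_sq += (x ** 2).sum(axis=0)
    mean = total / n_pixels
    # Var[x] = E[x^2] - E[x]^2; clip tiny negative values caused by rounding.
    var = np.maximum(total_sq / n_pixels - mean ** 2, 0.0)
    return mean, np.sqrt(var)
```

Calling it once on the input images and once on the output images of the training split would give two `(mean, std)` pairs like the constants above.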
- In contrast, with the following settings, every tensor fed to the model takes values in $[-1, 1]$ (assuming pixel values are first scaled to $[0, 1]$).
```python
FACADES_INPUT_IMG_MEAN = (0.5, 0.5, 0.5)
FACADES_INPUT_IMG_STD = (0.5, 0.5, 0.5)
FACADES_OUTPUT_IMG_MEAN = (0.5, 0.5, 0.5)
FACADES_OUTPUT_IMG_STD = (0.5, 0.5, 0.5)
```
- In experiments with both settings, training was faster with the latter than with the former.
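The difference between the two settings can be checked with plain arithmetic. The sketch below assumes pixel values already scaled to $[0, 1]$ and applies the usual `(x - mean) / std` normalization per channel; `normalize` and `denormalize` are illustrative helpers, not code from this repository:

```python
FACADES_INPUT_IMG_MEAN = (0.222, 0.299, 0.745)
FACADES_INPUT_IMG_STD = (0.346, 0.286, 0.336)
HALF = (0.5, 0.5, 0.5)


def normalize(pixel, mean, std):
    """Apply (x - mean) / std to each channel of one RGB pixel in [0, 1]."""
    return tuple((p - m) / s for p, m, s in zip(pixel, mean, std))


def denormalize(value, mean, std):
    """Invert `normalize`, e.g. to turn a model output back into an image."""
    return tuple(v * s + m for v, m, s in zip(value, mean, std))


black, white = (0.0, 0.0, 0.0), (1.0, 1.0, 1.0)

# With mean = std = 0.5 the extremes map exactly onto -1 and 1.
print(normalize(black, HALF, HALF), normalize(white, HALF, HALF))

# With the dataset statistics, some channels leave the [-1, 1] range.
print(normalize(black, FACADES_INPUT_IMG_MEAN, FACADES_INPUT_IMG_STD))
print(normalize(white, FACADES_INPUT_IMG_MEAN, FACADES_INPUT_IMG_STD))
```

With the dataset statistics the normalized values are not bounded by $[-1, 1]$, whereas the `0.5` setting keeps inputs and targets in a fixed, symmetric range.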
## 2) Architecture
- With `self.norm = nn.InstanceNorm2d(out_channels, affine=True, track_running_stats=False)`, the images generated by the model were blurry.
- After changing this to `self.norm = nn.InstanceNorm2d(out_channels, affine=False, track_running_stats=False)`, the blurriness disappeared.
- Note that in PyTorch, `nn.InstanceNorm2d` already uses `track_running_stats=False` (and `affine=False`) by default.
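To make the `affine` switch concrete, here is a minimal sketch of a downsampling block built around that `self.norm` line. The class name `ConvBlock`, the 4x4 stride-2 convolution, and the LeakyReLU slope are assumptions for illustration only; the repository's actual layers may differ:

```python
import torch
import torch.nn as nn


class ConvBlock(nn.Module):
    """Conv -> InstanceNorm -> LeakyReLU (hypothetical block, not the repo's)."""

    def __init__(self, in_channels, out_channels, affine=False):
        super().__init__()
        # Kernel size, stride, and padding are assumed values that halve H and W.
        self.conv = nn.Conv2d(in_channels, out_channels, kernel_size=4, stride=2, padding=1)
        # affine=False: no learnable per-channel scale and shift.
        self.norm = nn.InstanceNorm2d(out_channels, affine=affine, track_running_stats=False)
        self.act = nn.LeakyReLU(0.2)

    def forward(self, x):
        return self.act(self.norm(self.conv(x)))


block = ConvBlock(3, 8)                    # setting that avoided blurry outputs
affine_block = ConvBlock(3, 8, affine=True)
out = block(torch.randn(1, 3, 16, 16))     # -> shape (1, 8, 8, 8)
```

With `affine=False` the normalization layer has no `weight` or `bias` parameters at all, so the only difference between the two blocks is the learnable per-channel scale and shift.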