{"id":13815322,"url":"https://github.com/kaonashi-tyc/zi2zi","last_synced_at":"2025-05-15T15:07:23.882Z","repository":{"id":37602806,"uuid":"82346425","full_name":"kaonashi-tyc/zi2zi","owner":"kaonashi-tyc","description":"Learning Chinese Character style with conditional GAN","archived":false,"fork":false,"pushed_at":"2019-08-09T04:02:31.000Z","size":46887,"stargazers_count":2617,"open_issues_count":58,"forks_count":483,"subscribers_count":70,"default_branch":"master","last_synced_at":"2025-04-07T21:09:59.778Z","etag":null,"topics":["chinese-characters","deep-learning","deep-neural-networks","deeplearning","generative-adversarial-networks","machine-learning","pix2pix","style-transfer","tensorflow"],"latest_commit_sha":null,"homepage":"https://kaonashi-tyc.github.io/2017/04/06/zi2zi.html","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/kaonashi-tyc.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2017-02-17T23:18:04.000Z","updated_at":"2025-04-05T05:17:38.000Z","dependencies_parsed_at":"2022-07-12T16:33:11.422Z","dependency_job_id":null,"html_url":"https://github.com/kaonashi-tyc/zi2zi","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kaonashi-tyc%2Fzi2zi","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kaonashi-tyc%2Fzi2zi/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kaonashi-tyc%2Fzi2zi/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kaonashi-tyc%2Fzi2zi/manifests","owner_url":
"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/kaonashi-tyc","download_url":"https://codeload.github.com/kaonashi-tyc/zi2zi/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":254364270,"owners_count":22058878,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["chinese-characters","deep-learning","deep-neural-networks","deeplearning","generative-adversarial-networks","machine-learning","pix2pix","style-transfer","tensorflow"],"created_at":"2024-08-04T04:03:19.486Z","updated_at":"2025-05-15T15:07:18.873Z","avatar_url":"https://github.com/kaonashi-tyc.png","language":"Python","funding_links":[],"categories":["Python","Applications using GANs"],"sub_categories":["🔤 Font generation"],"readme":"# zi2zi: Master Chinese Calligraphy with Conditional Adversarial Networks\n\n\u003cp align=\"center\"\u003e\n  \u003cimg src=\"assets/intro.gif\" alt=\"animation\", style=\"width: 350px;\"/\u003e\n\u003c/p\u003e\n\n## Introduction\nLearning eastern asian language typefaces with GAN. 
zi2zi (字到字, meaning \"from character to character\") is an application and extension of the recently popular [pix2pix](https://github.com/phillipi/pix2pix) model to Chinese characters.\n\nDetails can be found in this [**blog post**](https://kaonashi-tyc.github.io/2017/04/06/zi2zi.html).\n\n## Network Structure\n### Original Model\n![alt network](assets/network.png)\n\nThe network structure is based on pix2pix, with the addition of category embedding and two other losses, category loss and constant loss, from [AC-GAN](https://arxiv.org/abs/1610.09585) and [DTN](https://arxiv.org/abs/1611.02200) respectively.\n\n### Updated Model with Label Shuffling\n\n![alt network](assets/network_v2.png)\n\nAfter sufficient training, **d_loss** drops to near zero and the model's performance plateaus. **Label shuffling** mitigates this problem by presenting new challenges to the model.\n\nSpecifically, within a given minibatch, for the same set of source characters, we generate two sets of target characters: one with the correct embedding labels, the other with shuffled labels. The shuffled set likely will not have corresponding target images from which to compute **L1\\_Loss**, but it is a good source for all other losses, forcing the model to generalize beyond the limited set of provided examples. Empirically, label shuffling improves the model's generalization on unseen data, with better details, and reduces the required number of characters.\n\nYou can enable label shuffling by setting the **flip_labels=1** option in the **train.py** script. 
It is recommended that you enable this after **d_loss** flatlines around zero, for further tuning.\n\n## Gallery\n### Compare with Ground Truth\n\n\u003cp align=\"center\"\u003e\n\u003cimg src=\"assets/compare3.png\" alt=\"compare\" width=\"600\"/\u003e\n\u003c/p\u003e\n\n### Brush Writing Fonts\n\u003cp align=\"center\"\u003e\n\u003cimg src=\"assets/cj_mix.png\" alt=\"brush\"  width=\"600\"/\u003e\n\u003c/p\u003e\n\n### Cursive Script (Requested by SNS audience)\n\u003cp align=\"center\"\u003e\n\u003cimg src=\"assets/cursive.png\" alt=\"cursive\"  width=\"600\"/\u003e\n\u003c/p\u003e\n\n\n### Mingchao Style (宋体/明朝体)\n\u003cp align=\"center\"\u003e\n\u003cimg src=\"assets/mingchao4.png\" alt=\"gaussian\"  width=\"600\"/\u003e\n\u003c/p\u003e\n\n### Korean\n\u003cp align=\"center\"\u003e\n\u003cimg src=\"assets/kr_mix_v2.png\" alt=\"korean\"  width=\"600\"/\u003e\n\u003c/p\u003e\n\n### Interpolation\n\u003cp align=\"center\"\u003e\n  \u003cimg src=\"assets/transition.png\" alt=\"animation\"  width=\"600\"/\u003e\n\u003c/p\u003e\n\n### Animation\n\u003cp align=\"center\"\u003e\n  \u003cimg src=\"assets/poem.gif\" alt=\"animation\"  width=\"250\"/\u003e\n  \u003cimg src=\"assets/ko_wiki.gif\" alt=\"animation\" width=\"250\"/\u003e\n\u003c/p\u003e\n\n\u003cp align=\"center\"\u003e\n  \u003cimg src=\"assets/reddit_bonus_humor_easter_egg.gif\" alt=\"easter egg\"  width=\"300\"/\u003e\n\u003c/p\u003e\n\n\n## How to Use\n### Step Zero\nDownload as many fonts as you please.\n### Requirements\n* Python 2.7\n* CUDA\n* cudnn\n* Tensorflow \u003e= 1.0.1\n* Pillow (PIL)\n* numpy \u003e= 1.12.1\n* scipy \u003e= 0.18.1\n* imageio\n\n### Preprocess\nTo avoid an I/O bottleneck, the data must be preprocessed: it is pickled into a binary format that is kept in memory during training.\n\nFirst, run the command below to generate the font images:\n\n```sh\npython font2img.py --src_font=src.ttf \\\n                   --dst_font=tgt.otf \\\n                   --charset=CN \\\n                   
--sample_count=1000 \\\n                   --sample_dir=dir \\\n                   --label=0 \\\n                   --filter=1 \\\n                   --shuffle=1\n```\nFour default charsets are offered: CN, CN_T (traditional), JP, and KR. You can also point **charset** to a one-line file, and images will be generated for the characters in it. The **filter** option is highly recommended: it pre-samples some characters and filters out all images that share the same hash, which usually indicates that the character is missing from the font. **label** is the index in the category embeddings that this font is associated with; it defaults to 0.\n\nAfter obtaining all images, run **package.py** to pickle the images and their corresponding labels into binary format:\n\n```sh\npython package.py --dir=image_directories \\\n                  --save_dir=binary_save_directory \\\n                  --split_ratio=[0,1]\n```\n\nAfter running this, you will find two objects, **train.obj** and **val.obj**, under the save_dir for training and validation, respectively.\n\n### Experiment Layout\n```sh\nexperiment/\n└── data\n    ├── train.obj\n    └── val.obj\n```\nCreate an **experiment** directory under the root of the project, and a **data** directory within it to hold the two binaries. Assuming a fixed directory layout enforces better data isolation, especially if you have multiple experiments running.\n\n### Train\nTo start training, run the following command:\n\n```sh\npython train.py --experiment_dir=experiment \\\n                --experiment_id=0 \\\n                --batch_size=16 \\\n                --lr=0.001 \\\n                --epoch=40 \\\n                --sample_steps=50 \\\n                --schedule=20 \\\n                --L1_penalty=100 \\\n                --Lconst_penalty=15\n```\n**schedule** is the number of epochs between learning-rate decays; at each decay the learning rate is halved. 
The train command will create the **sample**, **logs**, and **checkpoint** directories under **experiment_dir** if they do not exist; there you can check and manage the progress of your training.\n\n### Infer and Interpolate\nAfter training is done, run the command below to infer test data:\n\n```sh\npython infer.py --model_dir=checkpoint_dir/ \\\n                --batch_size=16 \\\n                --source_obj=binary_obj_path \\\n                --embedding_ids=label[s] of the font, separated by commas \\\n                --save_dir=save_dir/\n```\n\nYou can also do interpolation with this command:\n\n```sh\npython infer.py --model_dir=checkpoint_dir/ \\\n                --batch_size=10 \\\n                --source_obj=obj_path \\\n                --embedding_ids=label[s] of the font, separated by commas \\\n                --save_dir=frames/ \\\n                --output_gif=gif_path \\\n                --interpolate=1 \\\n                --steps=10 \\\n                --uroboros=1\n```\n\nIt runs through all pairs of fonts specified in **embedding_ids** and interpolates between each pair using the specified number of **steps**.\n\n### Pretrained Model\nA pretrained model, trained with 27 fonts, can be downloaded [here](https://drive.google.com/open?id=0Bz6mX0EGe2ZuNEFSNWpTQkxPM2c); only the generator is saved to reduce the model size. 
You can use the encoder in this pretrained model to accelerate the training process.\n\n## Acknowledgements\nCode derived and rehashed from:\n\n* [pix2pix-tensorflow](https://github.com/yenchenlin/pix2pix-tensorflow) by [yenchenlin](https://github.com/yenchenlin)\n* [Domain Transfer Network](https://github.com/yunjey/domain-transfer-network) by [yunjey](https://github.com/yunjey)\n* [ac-gan](https://github.com/buriburisuri/ac-gan) by [buriburisuri](https://github.com/buriburisuri)\n* [dc-gan](https://github.com/carpedm20/DCGAN-tensorflow) by [carpedm20](https://github.com/carpedm20)\n* [original pix2pix torch code](https://github.com/phillipi/pix2pix) by [phillipi](https://github.com/phillipi)\n\n## License\nApache 2.0\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkaonashi-tyc%2Fzi2zi","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fkaonashi-tyc%2Fzi2zi","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkaonashi-tyc%2Fzi2zi/lists"}