# ! Attention !
# This project is, unfortunately, no longer being worked on
### Please head over to the following much cooler project, which takes the idea of text-to-image generation to the next level:

DallE: [Original](https://github.com/openai/DALL-E) [PyTorch](https://github.com/lucidrains/DALLE-pytorch)

# :star: [NEW] :star:
# T2F - 2.0 Teaser (coming soon ...)
<p align="center">
<img src="https://raw.githubusercontent.com/akanimax/T2F/master/figures/T2F_2.0_teaser.jpeg" alt="2.0 Teaser">
</p>

## Please note that all the faces in the above samples are generated. T2F 2.0 will use MSG-GAN for the image generation module instead of ProGAN; see [BMSG-GAN](https://github.com/akanimax/BMSG-GAN) for more info about MSG-GAN. This update to the repository is coming soon :+1:.

# T2F
Text-to-Face generation using Deep Learning.
This project combines two recent architectures, <a href="https://arxiv.org/abs/1710.10916">StackGAN</a> and <a href="https://arxiv.org/abs/1710.10196">ProGAN</a>, for synthesizing faces from textual descriptions.<br>
It uses the <a href="https://arxiv.org/abs/1803.03827">Face2Text</a> dataset, which contains 400 facial images with a textual caption for each. The data can be obtained by contacting either the **RIVAL** group or the authors of the aforementioned paper.

<h3>Some Examples:</h3>
<img src="https://github.com/akanimax/T2F/blob/master/figures/result.jpeg" alt="Examples">

<h3>Architecture:</h3>
<img src="https://github.com/akanimax/T2F/blob/master/figures/architecture.jpg" alt="Architecture Diagram">
The textual description is encoded into a summary vector by an LSTM network. This summary vector, i.e. the <b>Embedding</b> <i>(psy_t)</i> shown in the diagram, is passed through the Conditioning Augmentation block (a single linear layer) to obtain the textual part of the GAN's latent vector, using a VAE-like reparameterization technique. The second part of the latent vector is random Gaussian noise. The resulting latent vector is fed to the generator of the GAN, while the embedding is fed to the final layer of the discriminator for conditional distribution matching. Training of the GAN progresses exactly as described in the ProGAN paper, i.e. layer by layer at increasing spatial resolutions, with each new layer introduced via the fade-in technique to avoid destroying previous learning.

## Running the code:
The code is present in the `implementation/` subdirectory. The implementation is done using the <a href="https://pytorch.org/">PyTorch</a> framework.
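The Conditioning Augmentation step described above can be sketched in a few lines. The following is an illustrative NumPy version only, not the repository's actual code; the shapes follow the sample configuration (`embedding_size: 128`, `ca_out_size: 178`, `latent_size: 256`) and all function and variable names are hypothetical.

```python
import numpy as np

def conditioning_augmentation(embedding, w, b, rng):
    """VAE-style reparameterization: a single linear layer maps the text
    embedding to (mu, log_sigma); the sample mu + sigma * eps becomes the
    textual part of the GAN latent vector."""
    out = embedding @ w + b              # the single linear layer
    mu, log_sigma = np.split(out, 2)     # first half: mean, second half: log-std
    eps = rng.standard_normal(mu.shape)  # random Gaussian noise
    c = mu + np.exp(log_sigma) * eps     # reparameterized textual latent
    return c, mu, log_sigma

# Build the full latent vector: textual part + pure-noise part.
rng = np.random.default_rng(0)
embedding = rng.standard_normal(128)            # psy_t from the LSTM encoder
w = rng.standard_normal((128, 2 * 178)) * 0.01  # ca_out_size = 178
b = np.zeros(2 * 178)
c, mu, log_sigma = conditioning_augmentation(embedding, w, b, rng)
noise = rng.standard_normal(256 - 178)          # latent_size = 256
latent = np.concatenate([c, noise])             # fed to the generator
```

In training, `mu` and `log_sigma` would additionally contribute a KL-divergence regularization term, as in StackGAN.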
So, before running this code, please install `PyTorch version 0.4.0`.

__Code organization:__ <br>
`configs`: configuration files for training the network (use any one, or create your own) <br>
`data_processing`: package containing the data-processing and loading modules <br>
`networks`: package containing the network implementations <br>
`processed_annotations`: directory storing the output of the `process_text_annotations.py` script <br>
`process_text_annotations.py`: processes the captions and stores the output in the `processed_annotations/` directory (no need to run this script; the pickle file is included in the repo) <br>
`train_network.py`: script for training the network <br>

__Sample configuration:__

    # All paths to different required data objects
    images_dir: "../data/LFW/lfw"
    processed_text_file: "processed_annotations/processed_text.pkl"
    log_dir: "training_runs/11/losses/"
    sample_dir: "training_runs/11/generated_samples/"
    save_dir: "training_runs/11/saved_models/"

    # Hyperparameters for the Model
    captions_length: 100
    img_dims:
      - 64
      - 64

    # LSTM hyperparameters
    embedding_size: 128
    hidden_size: 256
    num_layers: 3  # number of LSTM cells in the encoder network

    # Conditioning Augmentation hyperparameters
    ca_out_size: 178

    # ProGAN hyperparameters
    depth: 5
    latent_size: 256
    learning_rate: 0.001
    beta_1: 0
    beta_2: 0
    eps: 0.00000001
    drift: 0.001
    n_critic: 1

    # Training hyperparameters:
    epochs:
      - 160
      - 80
      - 40
      - 20
      - 10

    # % of epochs for fading in the new layer
    fade_in_percentage:
      - 85
      - 85
      - 85
      - 85
      - 85

    batch_sizes:
      - 16
      - 16
      - 16
      - 16
      - 16

    num_workers: 3
    feedback_factor: 7  # number of logs generated per epoch
    checkpoint_factor: 2  # save the models after this many epochs
    use_matching_aware_discriminator: True  # use the matching-aware discriminator

Use `requirements.txt` to install all the dependencies for the project:

    $ workon [your virtual environment]
    $ pip install -r requirements.txt

__Sample run:__

    $ mkdir training_runs
    $ mkdir training_runs/generated_samples training_runs/losses training_runs/saved_models
    $ python train_network.py --config=configs/11.comf

## Other links:
blog: https://medium.com/@animeshsk3/t2f-text-to-face-generation-using-deep-learning-b3b6ba5a5a93 <br>
training time-lapse video: https://www.youtube.com/watch?v=NO_l87rPDb8 <br>
ProGAN package (separate library): https://github.com/akanimax/pro_gan_pytorch

## TODO:
1. Create a simple `demo.py` for running inference on the trained models <br>
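The `epochs` and `fade_in_percentage` lists in the sample configuration drive the ProGAN-style fade-in: at each depth, the new layer's contribution ramps from 0 to 1 over the first 85% of that depth's training. A minimal sketch of that schedule, assuming a hypothetical helper (the batches-per-epoch value is made up for illustration):

```python
def fade_in_alpha(batches_done, total_batches, fade_in_percentage):
    """Blend factor for a newly introduced resolution layer: ramps
    linearly from 0 to 1 over the first `fade_in_percentage` percent
    of training at the current depth, then stays at 1."""
    fade_batches = total_batches * fade_in_percentage / 100.0
    return min(batches_done / fade_batches, 1.0)

# e.g. depth 0 of the sample config: 160 epochs at 85% fade-in,
# with an assumed 25 batches per epoch
total = 160 * 25
alphas = [fade_in_alpha(b, total, 85) for b in (0, total // 2, total)]

# The ProGAN-style blend of old and new layer outputs would then be:
#   out = alpha * new_layer(x) + (1 - alpha) * upsample(old_output)
```

This is why the fade-in avoids destroying previous learning: early on, the output is dominated by the already-trained lower-resolution pathway.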