{"id":13497852,"url":"https://github.com/openai/improved-diffusion","last_synced_at":"2025-05-14T23:04:22.312Z","repository":{"id":38323746,"uuid":"337207225","full_name":"openai/improved-diffusion","owner":"openai","description":"Release for Improved Denoising Diffusion Probabilistic Models","archived":false,"fork":false,"pushed_at":"2024-07-18T02:45:29.000Z","size":30,"stargazers_count":3503,"open_issues_count":111,"forks_count":506,"subscribers_count":120,"default_branch":"main","last_synced_at":"2025-04-11T10:17:07.984Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/openai.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2021-02-08T20:56:01.000Z","updated_at":"2025-04-11T08:19:01.000Z","dependencies_parsed_at":"2024-07-31T23:10:02.806Z","dependency_job_id":null,"html_url":"https://github.com/openai/improved-diffusion","commit_stats":{"total_commits":10,"total_committers":2,"mean_commits":5.0,"dds":0.09999999999999998,"last_synced_commit":"1bc7bbbdc414d83d4abf2ad8cc1446dc36c4e4d5"},"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/openai%2Fimproved-diffusion","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/openai%2Fimproved-diffusion/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/openai%2Fimproved-diffusion/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/openai%2Fimproved-diffusion/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/openai","download_url":"https://codeload.github.com/openai/improved-diffusion/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":254243358,"owners_count":22038046,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-07-31T20:00:43.009Z","updated_at":"2025-05-14T23:04:21.945Z","avatar_url":"https://github.com/openai.png","language":"Python","readme":"# improved-diffusion\n\nThis is the codebase for [Improved Denoising Diffusion Probabilistic Models](https://arxiv.org/abs/2102.09672).\n\n# Usage\n\nThis section of the README walks through how to train and sample from a model.\n\n## Installation\n\nClone this repository and navigate to it in your terminal. Then run:\n\n```\npip install -e .\n```\n\nThis should install the `improved_diffusion` python package that the scripts depend on.\n\n## Preparing Data\n\nThe training code reads images from a directory of image files. 
The images will automatically be scaled and center-cropped by the data-loading pipeline. Simply pass `--data_dir path/to/images` to the training script, and it will take care of the rest.

## Training

To train your model, you should first decide some hyperparameters. We will split our hyperparameters into three groups: model architecture, diffusion process, and training flags. Here are some reasonable defaults for a baseline:

```
MODEL_FLAGS="--image_size 64 --num_channels 128 --num_res_blocks 3"
DIFFUSION_FLAGS="--diffusion_steps 4000 --noise_schedule linear"
TRAIN_FLAGS="--lr 1e-4 --batch_size 128"
```

Here are some changes we experiment with, and how to set them in the flags:

 * **Learned sigmas:** add `--learn_sigma True` to `MODEL_FLAGS`.
 * **Cosine schedule:** change `--noise_schedule linear` to `--noise_schedule cosine`.
 * **Importance-sampled VLB:** add `--use_kl True` to `DIFFUSION_FLAGS` and add `--schedule_sampler loss-second-moment` to `TRAIN_FLAGS`.
 * **Class-conditional:** add `--class_cond True` to `MODEL_FLAGS`.

Once you have set up your hyperparameters, you can run an experiment like so:

```
python scripts/image_train.py --data_dir path/to/images $MODEL_FLAGS $DIFFUSION_FLAGS $TRAIN_FLAGS
```

You may also want to train in a distributed manner. In this case, run the same command with `mpiexec`:

```
mpiexec -n $NUM_GPUS python scripts/image_train.py --data_dir path/to/images $MODEL_FLAGS $DIFFUSION_FLAGS $TRAIN_FLAGS
```

When training in a distributed manner, you must manually divide the `--batch_size` argument by the number of ranks. In lieu of distributed training, you may use `--microbatch 16` (or `--microbatch 1` in extremely memory-constrained cases) to reduce memory usage.

The logs and saved models will be written to a logging directory determined by the `OPENAI_LOGDIR` environment variable. If it is not set, a temporary directory will be created in `/tmp`.

## Sampling

The above training script saves checkpoints to `.pt` files in the logging directory. These checkpoints will have names like `ema_0.9999_200000.pt` and `model200000.pt`. You will likely want to sample from the EMA models, since those produce much better samples.

Once you have a path to your model, you can generate a large batch of samples like so:

```
python scripts/image_sample.py --model_path /path/to/model.pt $MODEL_FLAGS $DIFFUSION_FLAGS
```

Again, this will save results to a logging directory. Samples are saved as a large `npz` file, where `arr_0` in the file is a large batch of samples.
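To inspect the saved samples quickly, something like the following works (a minimal sketch assuming `numpy` and Pillow are available; the `.npz` path is a placeholder for whatever `image_sample.py` wrote to your logging directory, and the uint8 NHWC layout is an assumption to verify against your file):

```python
import numpy as np
from PIL import Image  # assumes Pillow is installed

# Load the batch written by image_sample.py; "arr_0" holds the samples.
samples = np.load("/path/to/logdir/samples.npz")["arr_0"]
print(samples.shape, samples.dtype)  # expect uint8 images (assumed NHWC)

# Dump the first few samples as PNGs for a quick look.
for i, sample in enumerate(samples[:8]):
    Image.fromarray(sample).save(f"sample_{i:03d}.png")
```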
Just like for training, you can run `image_sample.py` through MPI to use multiple GPUs and machines.

You can change the number of sampling steps using the `--timestep_respacing` argument. For example, `--timestep_respacing 250` uses 250 steps to sample. Passing `--timestep_respacing ddim250` is similar, but uses the uniform stride from the [DDIM paper](https://arxiv.org/abs/2010.02502) rather than our stride.

To sample using [DDIM](https://arxiv.org/abs/2010.02502), pass `--use_ddim True`.

## Models and Hyperparameters

This section includes model checkpoints and run flags for the main models in the paper.

Note that the batch sizes are specified for single-GPU training, even though most of these runs will not naturally fit on a single GPU. To address this, either set `--microbatch` to a small value (e.g. 4) to train on one GPU, or run with MPI and divide `--batch_size` by the number of GPUs, as in the sketch below.
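The arithmetic is trivial but easy to get wrong in a launcher script; a hypothetical helper (not shipped with the scripts) makes the division explicit:

```python
def per_rank_batch_size(paper_batch_size: int, num_gpus: int) -> int:
    # Hypothetical helper: the runs below list single-GPU batch sizes, so an
    # `mpiexec -n num_gpus` launch should pass this per-rank value instead.
    assert paper_batch_size % num_gpus == 0, "batch size must divide evenly"
    return paper_batch_size // num_gpus

# e.g. the 270M class-conditional ImageNet-64 run below uses --batch_size 2048:
print(per_rank_batch_size(2048, 8))  # -> 256 per rank on 8 GPUs
```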
Unconditional ImageNet-64 with our `L_hybrid` objective and cosine noise schedule [[checkpoint](https://openaipublic.blob.core.windows.net/diffusion/march-2021/imagenet64_uncond_100M_1500K.pt)]:

```bash
MODEL_FLAGS="--image_size 64 --num_channels 128 --num_res_blocks 3 --learn_sigma True"
DIFFUSION_FLAGS="--diffusion_steps 4000 --noise_schedule cosine"
TRAIN_FLAGS="--lr 1e-4 --batch_size 128"
```

Unconditional CIFAR-10 with our `L_hybrid` objective and cosine noise schedule [[checkpoint](https://openaipublic.blob.core.windows.net/diffusion/march-2021/cifar10_uncond_50M_500K.pt)]:

```bash
MODEL_FLAGS="--image_size 32 --num_channels 128 --num_res_blocks 3 --learn_sigma True --dropout 0.3"
DIFFUSION_FLAGS="--diffusion_steps 4000 --noise_schedule cosine"
TRAIN_FLAGS="--lr 1e-4 --batch_size 128"
```

Class-conditional ImageNet-64 model (270M parameters, trained for 250K iterations) [[checkpoint](https://openaipublic.blob.core.windows.net/diffusion/march-2021/imagenet64_cond_270M_250K.pt)]:

```bash
MODEL_FLAGS="--image_size 64 --num_channels 192 --num_res_blocks 3 --learn_sigma True --class_cond True"
DIFFUSION_FLAGS="--diffusion_steps 4000 --noise_schedule cosine --rescale_learned_sigmas False --rescale_timesteps False"
TRAIN_FLAGS="--lr 3e-4 --batch_size 2048"
```

Upsampling 256x256 model (280M parameters, trained for 500K iterations) [[checkpoint](https://openaipublic.blob.core.windows.net/diffusion/march-2021/upsample_cond_500K.pt)]:

```bash
MODEL_FLAGS="--num_channels 192 --num_res_blocks 2 --learn_sigma True --class_cond True"
DIFFUSION_FLAGS="--diffusion_steps 4000 --noise_schedule linear --rescale_learned_sigmas False --rescale_timesteps False"
TRAIN_FLAGS="--lr 3e-4 --batch_size 256"
```

LSUN bedroom model (lr=1e-4) [[checkpoint](https://openaipublic.blob.core.windows.net/diffusion/march-2021/lsun_uncond_100M_1200K_bs128.pt)]:

```bash
MODEL_FLAGS="--image_size 256 --num_channels 128 --num_res_blocks 2 --num_heads 1 --learn_sigma True --use_scale_shift_norm False --attention_resolutions 16"
DIFFUSION_FLAGS="--diffusion_steps 1000 --noise_schedule linear --rescale_learned_sigmas False --rescale_timesteps False"
TRAIN_FLAGS="--lr 1e-4 --batch_size 128"
```

LSUN bedroom model (lr=2e-5) [[checkpoint](https://openaipublic.blob.core.windows.net/diffusion/march-2021/lsun_uncond_100M_2400K_bs64.pt)]:

```bash
MODEL_FLAGS="--image_size 256 --num_channels 128 --num_res_blocks 2 --num_heads 1 --learn_sigma True --use_scale_shift_norm False --attention_resolutions 16"
DIFFUSION_FLAGS="--diffusion_steps 1000 --noise_schedule linear --rescale_learned_sigmas False --rescale_timesteps False"
TRAIN_FLAGS="--lr 2e-5 --batch_size 128"
```

Unconditional ImageNet-64 with the `L_vlb` objective and cosine noise schedule [[checkpoint](https://openaipublic.blob.core.windows.net/diffusion/march-2021/imagenet64_uncond_vlb_100M_1500K.pt)]:

```bash
MODEL_FLAGS="--image_size 64 --num_channels 128 --num_res_blocks 3 --learn_sigma True"
DIFFUSION_FLAGS="--diffusion_steps 4000 --noise_schedule cosine --use_kl True"
TRAIN_FLAGS="--lr 1e-4 --batch_size 128 --schedule_sampler loss-second-moment"
```

Unconditional CIFAR-10 with the `L_vlb` objective and cosine noise schedule [[checkpoint](https://openaipublic.blob.core.windows.net/diffusion/march-2021/cifar10_uncond_vlb_50M_500K.pt)]:

```bash
MODEL_FLAGS="--image_size 32 --num_channels 128 --num_res_blocks 3 --learn_sigma True --dropout 0.3"
DIFFUSION_FLAGS="--diffusion_steps 4000 --noise_schedule cosine --use_kl True"
TRAIN_FLAGS="--lr 1e-4 --batch_size 128 --schedule_sampler loss-second-moment"
```
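For programmatic use, the training and sampling scripts build their model and diffusion objects through `improved_diffusion.script_util`. The sketch below shows the idea for the unconditional ImageNet-64 `L_hybrid` configuration above; it is a minimal sketch, not a documented API: the keyword names mirror the CLI flags and the helper signatures may differ between versions.

```python
# Minimal sketch (assumes this repo is installed via `pip install -e .`).
from improved_diffusion.script_util import (
    model_and_diffusion_defaults,
    create_model_and_diffusion,
)

# Start from the defaults dict and override the flags for this run.
args = model_and_diffusion_defaults()
args.update(
    image_size=64,
    num_channels=128,
    num_res_blocks=3,
    learn_sigma=True,
    diffusion_steps=4000,
    noise_schedule="cosine",
)
model, diffusion = create_model_and_diffusion(**args)
print(sum(p.numel() for p in model.parameters()), "parameters")
```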