# diffusion for beginners

- implementation of _diffusion schedulers_ with minimal code & as faithful to the original work as i could make it. most recent works reuse or adapt code from previous work and build on it, or transcribe code from another framework - which is great! but i found it hard to follow at times. this is an attempt at simplifying the great papers below. the trade-off is between stability and correctness vs. brevity and simplicity.

$$\large{\mathbf{{\color{green}feel\ free\ to\ contribute\ to\ the\ list\ below!}}}$$

- [x] [dpm-solver++(2m)](samplers/dpm_solver_plus_plus.py) (lu et al. 2022), dpm-solver++: fast solver for guided sampling of diffusion probabilistic models, https://arxiv.org/abs/2211.01095
- [x] [exponential integrator](samplers/exponential_integrator.py) (zhang et al. 2022), fast sampling of diffusion models with exponential integrator, https://arxiv.org/abs/2204.13902
- [x] [dpm-solver](samplers/dpm_solver.py) (lu et al. 2022), dpm-solver: a fast ode solver for diffusion probabilistic model sampling in around 10 steps, https://arxiv.org/abs/2206.00927
- [x] [heun](samplers/heun.py) (karras et al. 2022), elucidating the design space of diffusion-based generative models, https://arxiv.org/abs/2206.00364
- [x] [pndm](samplers/pndm.py) (liu et al. 2022), pseudo numerical methods for diffusion models on manifolds, https://arxiv.org/abs/2202.09778
- [x] [ddim](samplers/ddim.py) (song et al. 2020), denoising diffusion implicit models, https://arxiv.org/abs/2010.02502
- [x] [improved ddpm](samplers/improved_ddpm.py) (nichol and dhariwal 2021), improved denoising diffusion probabilistic models, https://arxiv.org/abs/2102.09672
- [x] [ddpm](samplers/ddpm.py) (ho et al. 2020), denoising diffusion probabilistic models, https://arxiv.org/abs/2006.11239


**prompt**: "a man eating an apple sitting on a bench"


<table>
 <tr>
    <td><img src='images/dpmsolverplusplus.jpg' height="256" width="256"></td>
    <td><img src='images/exponential_integrator.jpg' height="256" width="256"></td>
 </tr>
 <tr>
   <td><b style="font-size:20px">dpm-solver++</b></td>
   <td><b style="font-size:20px">exponential integrator</b></td>
 </tr>
</table>


<table>
 <tr>
    <td><img src='images/heun.jpg' height="256" width="256"></td>
    <td><img src='images/dpm_solver_2.jpg' height="256" width="256"></td>
 </tr>
 <tr>
   <td><b style="font-size:20px">heun</b></td>
   <td><b style="font-size:20px">dpm-solver</b></td>
 </tr>
</table>

<table>
 <tr>
    <td><img src='images/ddim.jpg' height="256" width="256"></td>
    <td><img src='images/pndm.jpg' height="256" width="256"></td>
 </tr>
 <tr>
   <td><b style="font-size:20px">ddim</b></td>
   <td><b style="font-size:20px">pndm</b></td>
 </tr>
</table>


<table>
 <tr>
    <td><img src="images/ddpm.jpg" height="256" width="256"></td>
    <td><img src='images/improved_ddpm.jpg' height="256" width="256"></td>
 </tr>
 <tr>
   <td><b style="font-size:20px">ddpm</b></td>
   <td><b style="font-size:20px">improved ddpm</b></td>
 </tr>
</table>

### * requirements *
while this repository is intended to be educational, if you wish to run and experiment, you'll need to obtain a [token from huggingface](https://huggingface.co/docs/hub/security-tokens) (and paste it into generate_sample.py), and install their excellent [diffusers library](https://github.com/huggingface/diffusers)


### * modification for heun sampler *
the heun sampler uses two neural function evaluations per step, and modifies the input as well as the sigma. i wanted to stay as faithful to the paper as possible, which necessitated changing the sampling code a little.
initiate the sampler as:
```python
sampler = HeunSampler(num_sample_steps=25, denoiser=pipe.unet, alpha_bar=pipe.scheduler.alphas_cumprod)
init_latents = torch.randn(batch_size, 4, 64, 64).to(device) * sampler.t0
```

and replace the inner loop of generate_sample.py with:
```python
for t in tqdm(sampler.timesteps):
    latents = sampler(latents, t, text_embeddings, guidance_scale)
```

similarly, for dpm-solver-2,

```python
sampler = DPMSampler(num_sample_steps=20, denoiser=pipe.unet)
init_latents = torch.randn(batch_size, 4, 64, 64).to(device) * sampler.lmbd(1)[1]
```

and, for the fast exponential integrator,

```python
sampler = ExponentialSampler(num_sample_steps=50, denoiser=pipe.unet)
init_latents = torch.randn(batch_size, 4, 64, 64).to(device)
```

and, for dpm-solver++ (2m),

```python
sampler = DPMPlusPlusSampler(denoiser=pipe.unet, num_sample_steps=20)
init_latents = torch.randn(batch_size, 4, 64, 64).to(device) * sampler.get_coeffs(sampler.t[0])[1]
```

## soft-diffusion

a sketch/draft of google's new paper, [soft diffusion: score matching for general corruptions](https://arxiv.org/abs/2209.05442), which achieves state-of-the-art results on the celeba-64 dataset.

details can be found [here](soft_diffusion)
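as a footnote on the sampler snippets: the heun sampler takes `pipe.scheduler.alphas_cumprod` as its `alpha_bar` argument. for intuition, here is a minimal pure-python sketch of what that sequence holds, using the linear beta schedule from the ddpm paper (ho et al. 2020). the function name and defaults are illustrative only, not this repo's code:

```python
def make_alpha_bar(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    # linear beta schedule as in ho et al. 2020; alpha_bar[t] is the
    # cumulative product of (1 - beta_s) for s = 0..t, i.e. how much of
    # the original signal survives after t noising steps
    betas = [
        beta_start + (beta_end - beta_start) * i / (num_steps - 1)
        for i in range(num_steps)
    ]
    alpha_bar, prod = [], 1.0
    for b in betas:
        prod *= 1.0 - b
        alpha_bar.append(prod)
    return alpha_bar

# near t = 0 almost all signal survives; near t = T almost none does
ab = make_alpha_bar()
```

the sequence decays monotonically from just under 1 toward 0, which is why early timesteps look like the clean image and late ones like pure noise.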
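likewise, for the simplest deterministic sampler in the list (ddim, song et al. 2020), the core update can be sketched in a few lines of numpy. `ddim_step` is a hypothetical name for illustration, not an api from this repo; it implements the eta = 0 update: predict x0 from the current noise estimate, then re-noise it to the previous noise level:

```python
import numpy as np

def ddim_step(x_t, eps, abar_t, abar_prev):
    # deterministic ddim update (eta = 0):
    # 1) invert the forward process to get a prediction of the clean sample
    x0_pred = (x_t - np.sqrt(1.0 - abar_t) * eps) / np.sqrt(abar_t)
    # 2) re-noise that prediction to the previous (smaller-t) noise level
    return np.sqrt(abar_prev) * x0_pred + np.sqrt(1.0 - abar_prev) * eps

# sanity check: if we step back using the *true* noise that built x_t,
# we must land exactly on the closed-form noised sample at the earlier level
rng = np.random.default_rng(0)
x0 = rng.standard_normal(4)
eps = rng.standard_normal(4)
abar_t, abar_prev = 0.5, 0.8
x_t = np.sqrt(abar_t) * x0 + np.sqrt(1.0 - abar_t) * eps
x_prev = ddim_step(x_t, eps, abar_t, abar_prev)
expected = np.sqrt(abar_prev) * x0 + np.sqrt(1.0 - abar_prev) * eps
```

in the real samplers, `eps` comes from the denoiser (`pipe.unet`) rather than being known, but the algebra of the step is the same.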