{"id":22675581,"url":"https://github.com/stability-ai/sd3.5","last_synced_at":"2025-05-14T23:00:20.726Z","repository":{"id":259248209,"uuid":"869842392","full_name":"Stability-AI/sd3.5","owner":"Stability-AI","description":null,"archived":false,"fork":false,"pushed_at":"2025-01-08T14:05:05.000Z","size":876,"stargazers_count":1116,"open_issues_count":21,"forks_count":97,"subscribers_count":16,"default_branch":"main","last_synced_at":"2025-04-08T17:16:16.549Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Stability-AI.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE-CODE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-10-09T01:40:18.000Z","updated_at":"2025-04-08T13:11:54.000Z","dependencies_parsed_at":"2024-11-19T00:27:09.566Z","dependency_job_id":"37572cdc-7724-4158-8e16-bf4636fcc064","html_url":"https://github.com/Stability-AI/sd3.5","commit_stats":null,"previous_names":["stability-ai/sd3.5"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Stability-AI%2Fsd3.5","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Stability-AI%2Fsd3.5/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Stability-AI%2Fsd3.5/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Stability-AI%2Fsd3.5/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Stability-AI","download_url":"https://codeload.github.com/Stability-AI/s
d3.5/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":254243353,"owners_count":22038044,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-12-09T17:57:40.507Z","updated_at":"2025-05-14T23:00:20.670Z","avatar_url":"https://github.com/Stability-AI.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Stable Diffusion 3.5\n\nInference-only tiny reference implementation of SD3.5 and SD3 - everything you need for simple inference using SD3.5/SD3, as well as the SD3.5 Large ControlNets, excluding the weights files.\n\nContains code for the text encoders (OpenAI CLIP-L/14, OpenCLIP bigG, Google T5-XXL) (these models are all public), the VAE Decoder (similar to previous SD models, but 16-channels and no postquantconv step), and the core MM-DiT (entirely new).\n\nNote: this repo is a reference library meant to assist partner organizations in implementing SD3.5/SD3. For alternate inference, use [Comfy](https://github.com/comfyanonymous/ComfyUI).\n\n## Updates\n\n- Nov 26, 2024 : Released ControlNets for SD3.5-Large.\n- Oct 29, 2024 : Released inference code for SD3.5-Medium.\n- Oct 24, 2024 : Updated code license to MIT License.\n- Oct 22, 2024 : Released inference code for SD3.5-Large, Large-Turbo. Also works on SD3-Medium.\n\n## Download\n\nDownload the following models from HuggingFace into `models` directory:\n1. 
[Stability AI SD3.5 Large](https://huggingface.co/stabilityai/stable-diffusion-3.5-large/blob/main/sd3.5_large.safetensors) or [Stability AI SD3.5 Large Turbo](https://huggingface.co/stabilityai/stable-diffusion-3.5-large-turbo/blob/main/sd3.5_large_turbo.safetensors) or [Stability AI SD3.5 Medium](https://huggingface.co/stabilityai/stable-diffusion-3.5-medium/blob/main/sd3.5_medium.safetensors)\n2. [OpenAI CLIP-L](https://huggingface.co/stabilityai/stable-diffusion-3.5-large/blob/main/text_encoders/clip_l.safetensors)\n3. [OpenCLIP bigG](https://huggingface.co/stabilityai/stable-diffusion-3.5-large/blob/main/text_encoders/clip_g.safetensors)\n4. [Google T5-XXL](https://huggingface.co/stabilityai/stable-diffusion-3.5-large/blob/main/text_encoders/t5xxl_fp16.safetensors)\n\nThis code also works for [Stability AI SD3 Medium](https://huggingface.co/stabilityai/stable-diffusion-3-medium/blob/main/sd3_medium.safetensors).\n\n### ControlNets\n\nOptionally, download [SD3.5 Large ControlNets](https://huggingface.co/stabilityai/stable-diffusion-3.5-controlnets):\n- [Blur ControlNet](https://huggingface.co/stabilityai/stable-diffusion-3.5-controlnets/resolve/main/blur_8b.safetensors)\n- [Canny ControlNet](https://huggingface.co/stabilityai/stable-diffusion-3.5-controlnets/resolve/main/canny_8b.safetensors)\n- [Depth ControlNet](https://huggingface.co/stabilityai/stable-diffusion-3.5-controlnets/resolve/main/depth_8b.safetensors)\n\n```py\nfrom huggingface_hub import hf_hub_download\nhf_hub_download(\"stabilityai/stable-diffusion-3.5-controlnets\", \"sd3.5_large_controlnet_blur.safetensors\", local_dir=\"models\")\nhf_hub_download(\"stabilityai/stable-diffusion-3.5-controlnets\", \"sd3.5_large_controlnet_canny.safetensors\", local_dir=\"models\")\nhf_hub_download(\"stabilityai/stable-diffusion-3.5-controlnets\", \"sd3.5_large_controlnet_depth.safetensors\", local_dir=\"models\")\n```\n\n## Install\n\n```sh\n# Note: on windows use \"python\" not \"python3\"\npython3 -s -m venv 
.sd3.5\nsource .sd3.5/bin/activate\n# or on windows: .sd3.5/Scripts/activate\npython3 -s -m pip install -r requirements.txt\n```\n\n## Run\n\n```sh\n# Generate a cat using SD3.5 Large model (at models/sd3.5_large.safetensors) with its default settings\npython3 sd3_infer.py --prompt \"cute wallpaper art of a cat\"\n# Or use a text file with a list of prompts, using SD3.5 Large\npython3 sd3_infer.py --prompt path/to/my_prompts.txt --model models/sd3.5_large.safetensors\n# Generate from prompt file using SD3.5 Large Turbo with its default settings\npython3 sd3_infer.py --prompt path/to/my_prompts.txt --model models/sd3.5_large_turbo.safetensors\n# Generate from prompt file using SD3.5 Medium with its default settings, at 2k resolution\npython3 sd3_infer.py --prompt path/to/my_prompts.txt --model models/sd3.5_medium.safetensors --width 1920 --height 1080\n# Generate from prompt file using SD3 Medium with its default settings\npython3 sd3_infer.py --prompt path/to/my_prompts.txt --model models/sd3_medium.safetensors\n```\n\nImages will be output to `outputs/\u003cMODEL\u003e/\u003cPROMPT\u003e_\u003cDATETIME\u003e_\u003cPOSTFIX\u003e` by default.\nTo add a postfix to the output directory, add `--postfix \u003cmy_postfix\u003e`. 
For example,\n```sh\npython3 sd3_infer.py --prompt path/to/my_prompts.txt --postfix \"steps100\" --steps 100\n```\n\nTo change the resolution of the generated image, add `--width \u003cWIDTH\u003e --height \u003cHEIGHT\u003e`.\n\nOptionally, use [Skip Layer Guidance](https://github.com/comfyanonymous/ComfyUI/pull/5404) for potentially better structure and anatomy coherency from SD3.5-Medium.\n```sh\npython3 sd3_infer.py --prompt path/to/my_prompts.txt --model models/sd3.5_medium.safetensors --skip_layer_cfg True\n```\n\n### ControlNets\n\nTo use SD3.5 Large ControlNets, additionally download your chosen ControlNet model from the [model repository](https://huggingface.co/stabilityai/stable-diffusion-3.5-controlnets), then run inference, like so:\n- Blur:\n```sh\npython sd3_infer.py --model models/sd3.5_large.safetensors --controlnet_ckpt models/sd3.5_large_controlnet_blur.safetensors --controlnet_cond_image inputs/blur.png --prompt \"generated ai art, a tiny, lost rubber ducky in an action shot close-up, surfing the humongous waves, inside the tube, in the style of Kelly Slater\"\n```\n- Canny:\n```sh\npython sd3_infer.py --model models/sd3.5_large.safetensors --controlnet_ckpt models/sd3.5_large_controlnet_canny.safetensors --controlnet_cond_image inputs/canny.png --prompt \"A Night time photo taken by Leica M11, portrait of a Japanese woman in a kimono, looking at the camera, Cherry blossoms\"\n```\n- Depth:\n```sh\npython sd3_infer.py --model models/sd3.5_large.safetensors --controlnet_ckpt models/sd3.5_large_controlnet_depth.safetensors --controlnet_cond_image inputs/depth.png --prompt \"photo of woman, presumably in her mid-thirties, striking a balanced yoga pose on a rocky outcrop during dusk or dawn. She wears a light gray t-shirt and dark leggings. 
Her pose is dynamic, with one leg extended backward and the other bent at the knee, holding the moon close to her hand.\"\n```\n\nFor details on preprocessing for each of the ControlNets, and examples, please review the [model card](https://huggingface.co/stabilityai/stable-diffusion-3.5-controlnets).\n\n## File Guide\n\n- `sd3_infer.py` - entry point, review this for basic usage of diffusion model\n- `sd3_impls.py` - contains the wrapper around the MMDiTX and the VAE\n- `other_impls.py` - contains the CLIP models, the T5 model, and some utilities\n- `mmditx.py` - contains the core of the MMDiT-X itself\n- folder `models` with the following files (download separately):\n    - `clip_l.safetensors` (OpenAI CLIP-L, same as SDXL/SD3, can grab a public copy)\n    - `clip_g.safetensors` (openclip bigG, same as SDXL/SD3, can grab a public copy)\n    - `t5xxl.safetensors` (google T5-v1.1-XXL, can grab a public copy)\n    - `sd3.5_large.safetensors` or `sd3.5_large_turbo.safetensors` or `sd3.5_medium.safetensors` (or `sd3_medium.safetensors`)\n\n## Code Origin\n\nThe code included here originates from:\n- Stability AI internal research code repository (MM-DiT)\n- Public Stability AI repositories (eg VAE)\n- Some unique code for this reference repo written by Alex Goodwin and Vikram Voleti for Stability AI\n- Some code from ComfyUI internal Stability implementation of SD3 (for some code corrections and handlers)\n- HuggingFace and upstream providers (for sections of CLIP/T5 code)\n\n## Legal\n\nCheck the LICENSE-CODE file.\n\n### Note\n\nSome code in `other_impls` originates from HuggingFace and is subject to [the HuggingFace Transformers Apache2 
License](https://github.com/huggingface/transformers/blob/main/LICENSE)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fstability-ai%2Fsd3.5","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fstability-ai%2Fsd3.5","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fstability-ai%2Fsd3.5/lists"}