https://github.com/teriks/dgenerate
dgenerate is a scriptable command line tool (and library) for generating images and animation sequences using stable diffusion and related techniques, with an accompanying GUI scripting environment.
- Host: GitHub
- URL: https://github.com/teriks/dgenerate
- Owner: Teriks
- License: bsd-3-clause
- Created: 2023-06-14T02:10:53.000Z (almost 3 years ago)
- Default Branch: master
- Last Pushed: 2025-06-20T06:47:49.000Z (12 months ago)
- Last Synced: 2025-06-20T07:39:22.457Z (12 months ago)
- Topics: ai, ai-upscale, command-line, command-line-tool, cross-platform, gui-application, image-editing, image-generation, image-processing, scriptable, stable-diffusion, upscaling, video-editing, video-generation, video-processing
- Language: Python
- Homepage: https://dgenerate.readthedocs.io
- Size: 157 MB
- Stars: 36
- Watchers: 5
- Forks: 1
- Open Issues: 17
Metadata Files:
- Readme: README.rst
- Funding: .github/FUNDING.yml
- License: LICENSE
.. _homebrew_1: https://brew.sh/
.. _optimum-quanto_library_1: https://github.com/huggingface/optimum-quanto
.. _vermeer_canny_edged.png_1: https://raw.githubusercontent.com/Teriks/dgenerate/v4.5.1/examples/media/vermeer_canny_edged.png
.. _spandrel_1: https://github.com/chaiNNer-org/spandrel
.. _ncnn_1: https://github.com/Tencent/ncnn
.. _Stable_Diffusion_Web_UI_1: https://github.com/AUTOMATIC1111/stable-diffusion-webui
.. _CivitAI_1: https://civitai.com/
.. _chaiNNer_1: https://github.com/chaiNNer-org/chaiNNer
.. |Documentation| image:: https://readthedocs.org/projects/dgenerate/badge/?version=v4.5.1
   :target: http://dgenerate.readthedocs.io/en/v4.5.1/

.. |Latest Release| image:: https://img.shields.io/github/v/release/Teriks/dgenerate
   :target: https://github.com/Teriks/dgenerate/releases/latest
   :alt: GitHub Latest Release

.. |Support Dgenerate| image:: https://img.shields.io/badge/Ko--fi-support%20dgenerate%20-hotpink?logo=kofi&logoColor=white
   :target: https://ko-fi.com/teriks
   :alt: ko-fi

Overview
========
**See here for v5.0.0 dev branch:** https://github.com/Teriks/dgenerate/tree/version_5.0.0
**See here for v5.0.0 nightlies:** https://github.com/Teriks/dgenerate/releases/tag/pre-release
----
|Documentation| |Latest Release| |Support Dgenerate|
``dgenerate`` is a cross-platform command line tool and library for generating images
and animation sequences using Stable Diffusion and related models.
Alongside the command line tool, this project features a syntax-highlighting
REPL `Console UI`_ for the dgenerate configuration / scripting language, which is built on
Tkinter to be lightweight and portable. This GUI serves as an interface to dgenerate running
in the background via the ``--shell`` option.
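As a rough illustration of this background-process arrangement, the sketch below shows how a controlling program might start ``dgenerate --shell`` and send it a config snippet over STDIN. It assumes ``dgenerate`` is on ``PATH``. The helper names ``frame_command``, ``launch_shell``, and ``send`` are hypothetical and are not part of dgenerate or its Console UI. The trailing blank line follows the ``--shell`` help text, which states that two newlines are needed to submit a command.

```python
import shutil
import subprocess
from typing import Optional


def frame_command(config_text: str) -> str:
    """Return config_text terminated by exactly two newlines.

    The --shell help text says two newlines are required to submit
    a command because of parsing lookahead.
    """
    return config_text.rstrip("\n") + "\n\n"


def launch_shell() -> Optional[subprocess.Popen]:
    """Start `dgenerate --shell` with a writable STDIN pipe.

    Returns None when dgenerate is not installed, so this sketch can
    be imported and exercised without it.
    """
    executable = shutil.which("dgenerate")
    if executable is None:
        return None
    return subprocess.Popen([executable, "--shell"], stdin=subprocess.PIPE, text=True)


def send(process: subprocess.Popen, config_text: str) -> None:
    """Write one framed config snippet to the background process."""
    process.stdin.write(frame_command(config_text))
    process.stdin.flush()
```

A caller would check the result of ``launch_shell()`` for ``None`` and then call ``send(process, ...)`` once per snippet. How the Console UI itself manages the background process is not shown here.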
You can use dgenerate to generate multiple images or animated outputs using multiple
combinations of diffusion input parameters in batch, so that the differences in
generated output can be compared / curated easily. This can be accomplished via a single command,
or through more advanced scripting with the built-in interpreted shell-like language if needed.
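The idea of batching over parameter combinations can be pictured with a short Python sketch. It is not dgenerate's code: it assumes that every value of each multi-valued option is combined with every value of the others, and the function name ``expand_batch`` and its dictionary keys are hypothetical.

```python
from itertools import product
from typing import Dict, List, Sequence


def expand_batch(
    prompts: Sequence[str],
    seeds: Sequence[int],
    guidance_scales: Sequence[float],
    inference_steps: Sequence[int],
) -> List[Dict[str, object]]:
    """Return one job description per combination of the given values."""
    return [
        {"prompt": p, "seed": s, "guidance_scale": g, "inference_steps": n}
        for p, s, g, n in product(prompts, seeds, guidance_scales, inference_steps)
    ]


jobs = expand_batch(
    prompts=["an astronaut riding a horse"],
    seeds=[1, 2],
    guidance_scales=[5.0, 10.0],
    inference_steps=[30, 40],
)
print(len(jobs))  # 1 * 2 * 2 * 2 = 8 combinations
```

Each extra value multiplies the number of generated outputs, which is worth keeping in mind when several options are given multiple values at once.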
Animated output can be produced by processing every frame of a video, GIF, WebP, or APNG through
various implementations of diffusion in img2img or inpainting mode, as well as with ControlNets and
control guidance images, in any combination thereof. MP4 (h264) video can be written without memory
constraints related to frame count. GIF, WebP, and PNG/APNG can be written WITH memory constraints,
i.e. all frames exist in memory at once before being written.
Video input of any runtime can be processed without memory constraints related to the video size.
Many video formats are supported through the use of PyAV (ffmpeg).
Animated image input such as GIF, APNG (extension must be .apng), and WebP can also be processed
WITH memory constraints, i.e. all frames exist in memory at once after an animated image is read.
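The difference between the two memory behaviours described above can be pictured with a small, generic sketch. It is not dgenerate's implementation and uses no real decoder or encoder: ``read_frames``, ``write_streaming``, and ``write_buffered`` are hypothetical names, and strings stand in for frames. The streaming writer corresponds to the case without frame-count memory constraints, and the buffered writer to the case where all frames exist in memory at once.

```python
from typing import Callable, Iterable, Iterator, List


def read_frames(count: int) -> Iterator[str]:
    """Stand-in frame source: yields one placeholder frame at a time."""
    for index in range(count):
        yield f"frame-{index}"


def write_streaming(frames: Iterable[str], sink: Callable[[str], None]) -> int:
    """Hand each frame to the sink as it arrives; one frame is held at a time."""
    written = 0
    for frame in frames:
        sink(frame)
        written += 1
    return written


def write_buffered(frames: Iterable[str], sink: Callable[[List[str]], None]) -> int:
    """Collect every frame first, then hand the whole list to the sink."""
    buffered = list(frames)  # memory use grows with the number of frames
    sink(buffered)
    return len(buffered)


streamed: List[str] = []
stream_count = write_streaming(read_frames(3), streamed.append)

whole_file: List[List[str]] = []
buffer_count = write_buffered(read_frames(3), whole_file.append)
```

Both writers see the same frames. Only the buffered one needs memory proportional to the length of the animation.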
PNG, JPEG, JPEG-2000, TGA (Targa), BMP, and PSD (Photoshop) are supported for static image inputs.
In addition to diffusion, dgenerate also supports the processing of any supported image, video, or
animated image using any of its built-in image processors, which include various edge detectors,
depth detectors, segment generation, normal map generation, pose detection, non-diffusion based
AI upscaling, and more. dgenerate's image processors may be used to pre-process image / video
input to diffusion, post-process diffusion output, or to process images and video directly.
dgenerate brings many major features of the HuggingFace ``diffusers`` library directly to the
command line in a very flexible way with a near one-to-one mapping, akin to ffmpeg, allowing
for creative uses as powerful as direct implementation in python with less effort and
environmental setup.
dgenerate is compatible with HuggingFace as well as typical CivitAI-hosted models.
Prompt weighting and many other useful generation features are supported.
dgenerate can be installed on Windows via a Windows Installer MSI containing a
frozen python environment, which makes setup straightforward and likely to "just work"
without any dependency issues. This installer can be found in the release artifact under each
release located on the `github releases page `_.
This software requires an NVIDIA GPU supporting CUDA 12.1+, an AMD GPU supporting ROCm (Linux only),
or macOS on Apple Silicon, and supports ``python>=3.10,<3.13``. CPU rendering is possible for
some operations but extraordinarily slow.
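The Python version constraint can be checked before installing. The snippet below is a minimal sketch of that check using only the standard library. The function name ``python_supported`` is hypothetical and not something dgenerate provides.

```python
import sys
from typing import Optional, Sequence

# Bounds taken from the stated requirement: python>=3.10,<3.13
SUPPORTED_MIN = (3, 10)
SUPPORTED_BELOW = (3, 13)


def python_supported(version: Optional[Sequence[int]] = None) -> bool:
    """Return True if the (major, minor) version lies in [3.10, 3.13)."""
    if version is None:
        version = sys.version_info
    major_minor = tuple(version[:2])
    return SUPPORTED_MIN <= major_minor < SUPPORTED_BELOW


if not python_supported():
    print("This interpreter is outside the supported range python>=3.10,<3.13")
```

Passing an explicit tuple such as ``python_supported((3, 12))`` allows checking a version other than the running interpreter.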
For library documentation, and a better README reading experience which
includes proper syntax highlighting for examples, and side panel navigation,
please visit `readthedocs `_.
----
* `Help Output`_
* `Diffusion Feature Table `_
* How to install

  * `Windows Install`_
  * `Linux or WSL Install`_
  * `Linux with ROCm (AMD Cards)`_
  * `MacOS Install (Apple Silicon Only)`_
  * `Google Colab Install`_

* Usage Manual

  * `Basic Usage`_
  * `Negative Prompt`_
  * `Multiple Prompts`_
  * `Image Seeds`_
  * `Inpainting`_
  * `Per Image Seed Resizing`_
  * `Animated Output`_
  * `Animation Slicing`_
  * `Inpainting Animations`_
  * `Deterministic Output`_
  * `Specifying a specific GPU for CUDA`_
  * `Specifying a Scheduler (sampler)`_
  * `Specifying a VAE`_
  * `VAE Tiling and Slicing`_
  * `Specifying a UNet`_
  * `Specifying a Transformer (SD3 and Flux)`_
  * `Specifying an SDXL Refiner`_
  * `Specifying a Stable Cascade Decoder`_
  * `Specifying LoRAs`_
  * `Specifying IP Adapters`_

    * `basic --image-seeds specification`_
    * `img2img --image-seeds specification`_
    * `inpainting --image-seeds specification`_
    * `quoting IP Adapter image URLs with plus symbols`_
    * `animated inputs & combinatorics`_

  * `Specifying Textual Inversions (embeddings)`_
  * `Specifying Control Nets`_

    * `Flux Control Net Union Mode`_

  * `Specifying T2I Adapters`_
  * `Specifying Text Encoders`_
  * `Prompt Weighting and Enhancement`_

    * `The compel prompt weighter`_
    * `The sd-embed prompt weighter`_

  * `Utilizing CivitAI links and Other Hosted Models`_
  * `Specifying Generation Batch Size`_
  * `Batching Input Images and Inpaint Masks`_
  * `Image Processors`_

    * `Image processor arguments`_
    * `Multiple control net images, and input image batching`_

  * `Sub Commands`_

    * `Sub Command: image-process`_
    * `Sub Command: civitai-links`_

  * `Upscaling`_

    * `Upscaling with Diffusion Upscaler Models`_
    * `Upscaling with chaiNNer Compatible Torch Upscaler Models`_
    * `Upscaling with NCNN Upscaler Models`_

  * `Writing and Running Configs`_

    * `Basic config syntax`_
    * `Built in template variables and functions`_
    * `Directives, and applying templating`_
    * `Setting template variables, in depth`_
    * `Setting environmental variables, in depth`_
    * `Globbing and path manipulation`_
    * `The \\print and \\echo directive`_
    * `The \\image_process directive`_
    * `The \\exec directive`_
    * `The \\download directive`_
    * `The download() template function`_
    * `The \\exit directive`_
    * `Running configs from the command line`_
    * `Config argument injection`_

  * `Writing Plugins`_

    * `Image processor plugins`_
    * `Config directive and template function plugins`_
    * `Sub-command plugins`_
    * `Prompt weighter plugins`_

  * `Console UI`_
  * `File Cache Control`_

Help Output
===========
.. code-block:: text
usage: dgenerate [-h] [-v] [--version] [--file | --shell | --no-stdin | --console]
[--plugin-modules PATH [PATH ...]] [--sub-command SUB_COMMAND]
[--sub-command-help [SUB_COMMAND ...]] [-ofm] [--templates-help [VARIABLE_NAME ...]]
[--directives-help [DIRECTIVE_NAME ...]] [--functions-help [FUNCTION_NAME ...]]
[-mt MODEL_TYPE] [-rev BRANCH] [-var VARIANT] [-sbf SUBFOLDER] [-atk TOKEN] [-bs INTEGER]
[-bgs SIZE] [-te TEXT_ENCODER_URIS [TEXT_ENCODER_URIS ...]]
[-te2 TEXT_ENCODER_URIS [TEXT_ENCODER_URIS ...]] [-un UNET_URI] [-un2 UNET_URI]
[-tf TRANSFORMER_URI] [-vae VAE_URI] [-vt] [-vs] [-lra LORA_URI [LORA_URI ...]]
[-lrfs LORA_FUSE_SCALE] [-ie IMAGE_ENCODER_URI] [-ipa IP_ADAPTER_URI [IP_ADAPTER_URI ...]]
[-ti URI [URI ...]] [-cn CONTROLNET_URI [CONTROLNET_URI ...] | -t2i T2I_ADAPTER_URI
[T2I_ADAPTER_URI ...]] [-sch SCHEDULER_URI [SCHEDULER_URI ...]] [-pag]
[-pags FLOAT [FLOAT ...]] [-pagas FLOAT [FLOAT ...]] [-rpag] [-rpags FLOAT [FLOAT ...]]
[-rpagas FLOAT [FLOAT ...]] [-mqo | -mco] [--s-cascade-decoder MODEL_URI] [-dqo] [-dco]
[--s-cascade-decoder-prompts PROMPT [PROMPT ...]]
[--s-cascade-decoder-inference-steps INTEGER [INTEGER ...]]
[--s-cascade-decoder-guidance-scales INTEGER [INTEGER ...]]
[--s-cascade-decoder-scheduler SCHEDULER_URI [SCHEDULER_URI ...]]
[--sdxl-refiner MODEL_URI] [-rqo] [-rco]
[--sdxl-refiner-scheduler SCHEDULER_URI [SCHEDULER_URI ...]] [--sdxl-refiner-edit]
[--sdxl-second-prompts PROMPT [PROMPT ...]] [--sdxl-t2i-adapter-factors FLOAT [FLOAT ...]]
[--sdxl-aesthetic-scores FLOAT [FLOAT ...]]
[--sdxl-crops-coords-top-left COORD [COORD ...]] [--sdxl-original-size SIZE [SIZE ...]]
[--sdxl-target-size SIZE [SIZE ...]] [--sdxl-negative-aesthetic-scores FLOAT [FLOAT ...]]
[--sdxl-negative-original-sizes SIZE [SIZE ...]]
[--sdxl-negative-target-sizes SIZE [SIZE ...]]
[--sdxl-negative-crops-coords-top-left COORD [COORD ...]]
[--sdxl-refiner-prompts PROMPT [PROMPT ...]]
[--sdxl-refiner-clip-skips INTEGER [INTEGER ...]]
[--sdxl-refiner-second-prompts PROMPT [PROMPT ...]]
[--sdxl-refiner-aesthetic-scores FLOAT [FLOAT ...]]
[--sdxl-refiner-crops-coords-top-left COORD [COORD ...]]
[--sdxl-refiner-original-sizes SIZE [SIZE ...]]
[--sdxl-refiner-target-sizes SIZE [SIZE ...]]
[--sdxl-refiner-negative-aesthetic-scores FLOAT [FLOAT ...]]
[--sdxl-refiner-negative-original-sizes SIZE [SIZE ...]]
[--sdxl-refiner-negative-target-sizes SIZE [SIZE ...]]
[--sdxl-refiner-negative-crops-coords-top-left COORD [COORD ...]] [-hnf FLOAT [FLOAT ...]]
[-ri INT [INT ...]] [-rg FLOAT [FLOAT ...]] [-rgr FLOAT [FLOAT ...]] [-sc] [-d DEVICE]
[-t DTYPE] [-s SIZE] [-na] [-o PATH] [-op PREFIX] [-ox] [-oc] [-om]
[-pw PROMPT_WEIGHTER_URI] [--prompt-weighter-help [PROMPT_WEIGHTER_NAMES ...]]
[-p PROMPT [PROMPT ...]] [--sd3-max-sequence-length INTEGER]
[--sd3-second-prompts PROMPT [PROMPT ...]] [--sd3-third-prompts PROMPT [PROMPT ...]]
[--flux-second-prompts PROMPT [PROMPT ...]] [--flux-max-sequence-length INTEGER]
[-cs INTEGER [INTEGER ...]] [-se SEED [SEED ...]] [-sei] [-gse COUNT] [-af FORMAT]
[-if FORMAT] [-nf] [-fs FRAME_NUMBER] [-fe FRAME_NUMBER] [-is SEED [SEED ...]]
[-sip PROCESSOR_URI [PROCESSOR_URI ...]] [-mip PROCESSOR_URI [PROCESSOR_URI ...]]
[-cip PROCESSOR_URI [PROCESSOR_URI ...]] [--image-processor-help [PROCESSOR_NAME ...]]
[-pp PROCESSOR_URI [PROCESSOR_URI ...]] [-iss FLOAT [FLOAT ...] | -uns INTEGER
[INTEGER ...]] [-gs FLOAT [FLOAT ...]] [-igs FLOAT [FLOAT ...]] [-gr FLOAT [FLOAT ...]]
[-ifs INTEGER [INTEGER ...]] [-mc EXPR [EXPR ...]] [-pmc EXPR [EXPR ...]]
[-umc EXPR [EXPR ...]] [-vmc EXPR [EXPR ...]] [-cmc EXPR [EXPR ...]] [-tmc EXPR [EXPR ...]]
[-iemc EXPR [EXPR ...]] [-amc EXPR [EXPR ...]] [-tfmc EXPR [EXPR ...]]
[-ipmc EXPR [EXPR ...]] [-ipcc EXPR [EXPR ...]]
model_path
Batch image generation and manipulation tool supporting Stable Diffusion and related techniques /
algorithms, with support for video and animated image processing.
positional arguments:
model_path Hugging Face model repository slug, Hugging Face blob link to a model file, path to
folder on disk, or path to a .pt, .pth, .bin, .ckpt, or .safetensors file.
--------------------------------------------------------------------------
options:
-h, --help show this help message and exit
-------------------------------
-v, --verbose Output information useful for debugging, such as pipeline call and model load
parameters.
-----------
--version Show dgenerate's version and exit
---------------------------------
--file Convenience argument for reading a configuration script from a file instead of using
a pipe. This is a meta argument which can not be used within a configuration script
and is only valid from the command line or during a popen invocation of dgenerate.
----------------------------------------------------------------------------------
--shell When reading configuration from STDIN (a pipe), read forever, even when
configuration errors occur. This allows dgenerate to run in the background and be
controlled by another process sending commands. Launching dgenerate with this option
and not piping it input will attach it to the terminal like a shell. Entering
configuration into this shell requires two newlines to submit a command due to
parsing lookahead. IE: two presses of the enter key. This is a meta argument which
can not be used within a configuration script and is only valid from the command
line or during a popen invocation of dgenerate.
-----------------------------------------------
--no-stdin Can be used to indicate to dgenerate that it will not receive any piped in input.
This is useful for running dgenerate via popen from Python or another application
using normal arguments, where it would otherwise try to read from STDIN and block
forever because it is not attached to a terminal. This is a meta argument which can
not be used within a configuration script and is only valid from the command line or
during a popen invocation of dgenerate.
---------------------------------------
--console Launch a terminal-like Tkinter GUI that interacts with an instance of dgenerate
running in the background. This allows you to interactively write dgenerate config
scripts as if dgenerate were a shell / REPL. This is a meta argument which can not
be used within a configuration script and is only valid from the command line or
during a popen invocation of dgenerate.
---------------------------------------
--plugin-modules PATH [PATH ...]
Specify one or more plugin module folder paths (folder containing __init__.py) or
Python .py file paths, or Python module names to load as plugins. Plugin modules can
currently implement image processors, config directives, config template functions,
prompt weighters, and sub-commands.
-----------------------------------
--sub-command SUB_COMMAND
Specify the name of a sub-command to invoke. dgenerate exposes some extra image
processing functionality through the use of sub-commands. Sub commands essentially
replace the entire set of accepted arguments with those of a sub-command which
implements additional functionality. See --sub-command-help for a list of sub-
commands and help.
------------------
--sub-command-help [SUB_COMMAND ...]
Use this option alone (or with --plugin-modules) and no model specification in order
to list available sub-command names. Calling a sub-command with "--sub-command name
--help" will produce argument help output for that sub-command. When used with
--plugin-modules, sub-commands implemented by the specified plugins will also be
listed.
-------
-ofm, --offline-mode Whether dgenerate should try to download Hugging Face models that do not exist in
the disk cache, or only use what is available in the cache. Referencing a model on
Hugging Face that has not been cached because it was not previously downloaded will
result in a failure when using this option.
-------------------------------------------
--templates-help [VARIABLE_NAME ...]
Print a list of template variables available in the interpreter environment used for
dgenerate config scripts, particularly the variables set after a dgenerate
invocation occurs. When used as a command line option, their values are not
presented, just their names and types. Specifying names will print type information
for those variable names.
-------------------------
--directives-help [DIRECTIVE_NAME ...]
Use this option alone (or with --plugin-modules) and no model specification in order
to list available config directive names. Providing names will print documentation
for the specified directive names. When used with --plugin-modules, directives
implemented by the specified plugins will also be listed.
---------------------------------------------------------
--functions-help [FUNCTION_NAME ...]
Use this option alone (or with --plugin-modules) and no model specification in order
to list available config template function names. Providing names will print
documentation for the specified function names. When used with --plugin-modules,
functions implemented by the specified plugins will also be listed.
-------------------------------------------------------------------
-mt MODEL_TYPE, --model-type MODEL_TYPE
Use when loading different model types. Currently supported: torch, torch-pix2pix,
torch-sdxl, torch-sdxl-pix2pix, torch-upscaler-x2, torch-upscaler-x4, torch-if,
torch-ifs, torch-ifs-img2img, torch-s-cascade, torch-sd3, torch-flux, or torch-flux-
fill. (default: torch)
----------------------
-rev BRANCH, --revision BRANCH
The model revision to use when loading from a Hugging Face repository, (The Git
branch / tag, default is "main")
--------------------------------
-var VARIANT, --variant VARIANT
If specified when loading from a Hugging Face repository or folder, load weights
from "variant" filename, e.g. "pytorch_model.<variant>.safetensors". Defaults to
automatic selection.
--------------------
-sbf SUBFOLDER, --subfolder SUBFOLDER
Main model subfolder. If specified when loading from a Hugging Face repository or
folder, load weights from the specified subfolder.
--------------------------------------------------
-atk TOKEN, --auth-token TOKEN
Hugging Face auth token. Required to download restricted repositories that have
access permissions granted to your Hugging Face account.
--------------------------------------------------------
-bs INTEGER, --batch-size INTEGER
The number of image variations to produce per set of individual diffusion parameters
in one rendering step simultaneously on a single GPU.
When generating animations with a --batch-size greater than one, a separate
animation (with the filename suffix "animation_N") will be written for each image
in the batch.
If --batch-grid-size is specified when producing an animation then the image grid is
used for the output frames.
During animation rendering each image in the batch will still be written to the
output directory alongside the produced animation as either suffixed files or image
grids depending on the options you choose. (Default: 1)
-------------------------------------------------------
-bgs SIZE, --batch-grid-size SIZE
Produce a single image containing a grid of images with the number of COLUMNSxROWS
given to this argument when --batch-size is greater than 1. If not specified with a
--batch-size greater than 1, images will be written individually with an image
number suffix (image_N) in the filename signifying which image in the batch they
are.
----
-te TEXT_ENCODER_URIS [TEXT_ENCODER_URIS ...], --text-encoders TEXT_ENCODER_URIS [TEXT_ENCODER_URIS ...]
Specify Text Encoders for the main model using URIs, main models may use one or more
text encoders depending on the --model-type value and other dgenerate arguments.
See: --text-encoders help for information about what text encoders are needed for
your invocation.
Examples: "CLIPTextModel;model=huggingface/text_encoder",
"CLIPTextModelWithProjection;model=huggingface/text_encoder;revision=main",
"T5EncoderModel;model=text_encoder_folder_on_disk".
For main models which require multiple text encoders, the + symbol may be used to
indicate that a default value should be used for a particular text encoder, for
example: --text-encoders + + huggingface/encoder3. Any trailing text encoders which
are not specified are given their default value.
The value "null" may be used to indicate that a specific text encoder should not be
loaded.
Blob links / single file loads are not supported for Text Encoders.
The "revision" argument specifies the model revision to use for the Text Encoder
when loading from Hugging Face repository, (The Git branch / tag, default is
"main").
The "variant" argument specifies the Text Encoder model variant. If "variant" is
specified when loading from a Hugging Face repository or folder, weights will be
loaded from "variant" filename, e.g. "pytorch_model.<variant>.safetensors". For this
argument, "variant" defaults to the value of --variant if it is not specified in the
URI.
The "subfolder" argument specifies the Text Encoder model subfolder, if specified when
loading from a Hugging Face repository or folder, weights will be loaded from the
specified subfolder.
The "dtype" argument specifies the Text Encoder model precision, it defaults to the
value of -t/--dtype and should be one of: auto, bfloat16, float16, or float32.
The "quantize" argument specifies whether or not to use optimum-quanto to quantize
the text encoder weights, and may be passed the values "qint2", "qint4", "qint8",
"qfloat8_e4m3fn", "qfloat8_e4m3fnuz", "qfloat8_e5m2", or "qfloat8" to specify the
quantization datatype, this can be utilized to run Flux models with much less GPU
memory.
If you wish to load weights directly from a path on disk, you must point this
argument at the folder they exist in, which should also contain the config.json file
for the Text Encoder. For example, a downloaded repository folder from Hugging Face.
------------------------------------------------------------------------------------
-te2 TEXT_ENCODER_URIS [TEXT_ENCODER_URIS ...], --text-encoders2 TEXT_ENCODER_URIS [TEXT_ENCODER_URIS ...]
--text-encoders but for the SDXL refiner or Stable Cascade decoder model.
-------------------------------------------------------------------------
-un UNET_URI, --unet UNET_URI
Specify a UNet using a URI.
Examples: "huggingface/unet", "huggingface/unet;revision=main",
"unet_folder_on_disk".
Blob links / single file loads are not supported for UNets.
The "revision" argument specifies the model revision to use for the UNet when
loading from Hugging Face repository, (The Git branch / tag, default is "main").
The "variant" argument specifies the UNet model variant. If "variant" is specified
when loading from a Hugging Face repository or folder, weights will be loaded from
"variant" filename, e.g. "pytorch_model.<variant>.safetensors". For this argument,
"variant" defaults to the value of --variant if it is not specified in the URI.
The "subfolder" argument specifies the UNet model subfolder, if specified when
loading from a Hugging Face repository or folder, weights will be loaded from the
specified subfolder.
The "dtype" argument specifies the UNet model precision, it defaults to the value of
-t/--dtype and should be one of: auto, bfloat16, float16, or float32.
If you wish to load weights directly from a path on disk, you must point this
argument at the folder they exist in, which should also contain the config.json file
for the UNet. For example, a downloaded repository folder from Hugging Face.
----------------------------------------------------------------------------
-un2 UNET_URI, --unet2 UNET_URI
Specify a second UNet, this is only valid when using SDXL or Stable Cascade model
types. This UNet will be used for the SDXL refiner, or Stable Cascade decoder model.
------------------------------------------------------------------------------------
-tf TRANSFORMER_URI, --transformer TRANSFORMER_URI
Specify a Stable Diffusion 3 or Flux Transformer model using a URI.
Examples: "huggingface/transformer", "huggingface/transformer;revision=main",
"transformer_folder_on_disk".
Blob links / single file loads are supported for SD3 Transformers.
The "revision" argument specifies the model revision to use for the Transformer when
loading from Hugging Face repository or blob link, (The Git branch / tag, default is
"main").
The "variant" argument specifies the Transformer model variant. If "variant" is
specified when loading from a Hugging Face repository or folder, weights will be
loaded from "variant" filename, e.g. "pytorch_model.<variant>.safetensors". For this
argument, "variant" defaults to the value of --variant if it is not specified in the
URI.
The "subfolder" argument specifies the Transformer model subfolder, if specified
when loading from a Hugging Face repository or folder, weights will be loaded from
the specified subfolder.
The "dtype" argument specifies the Transformer model precision, it defaults to the
value of -t/--dtype and should be one of: auto, bfloat16, float16, or float32.
The "quantize" argument specifies whether or not to use optimum-quanto to quantize
the transformer weights, and may be passed the values "qint2", "qint4", "qint8",
"qfloat8_e4m3fn", "qfloat8_e4m3fnuz", "qfloat8_e5m2", or "qfloat8" to specify the
quantization datatype, this can be utilized to run Flux models with much less GPU
memory.
If you wish to load a weights file directly from disk, the simplest way is:
--transformer "transformer.safetensors", or with a dtype
"transformer.safetensors;dtype=float16". All loading arguments except "dtype" and
"quantize" are unused in this case and may produce an error message if used.
If you wish to load a specific weight file from a Hugging Face repository, use the
blob link loading syntax: --transformer
"https://huggingface.co/UserName/repository-name/blob/main/transformer.safetensors",
the "revision" argument may be used with this syntax.
------------
-vae VAE_URI, --vae VAE_URI
Specify a VAE using a URI, the URI syntax is: "AutoEncoderClass;model=(Hugging Face
repository slug/blob link or file/folder path)".
Examples: "AutoencoderKL;model=vae.pt",
"AsymmetricAutoencoderKL;model=huggingface/vae",
"AutoencoderTiny;model=huggingface/vae",
"ConsistencyDecoderVAE;model=huggingface/vae".
The AutoencoderKL encoder class accepts Hugging Face repository slugs/blob links,
.pt, .pth, .bin, .ckpt, and .safetensors files.
Other encoders can only accept Hugging Face repository slugs/blob links, or a path
to a folder on disk with the model configuration and model file(s).
If an AutoencoderKL VAE model file exists at a URL which serves the file as a raw
download, you may provide an http/https link to it and it will be downloaded to
dgenerate's web cache.
Aside from the "model" argument, there are four other optional arguments that can be
specified, these are: "revision", "variant", "subfolder", "dtype".
They can be specified as so in any order, they are not positional:
"AutoencoderKL;model=huggingface/vae;revision=main;variant=fp16;subfolder=sub_folder;dtype=float16".
The "revision" argument specifies the model revision to use for the VAE when loading
from Hugging Face repository or blob link, (The Git branch / tag, default is
"main").
The "variant" argument specifies the VAE model variant. If "variant" is specified
when loading from a Hugging Face repository or folder, weights will be loaded from
"variant" filename, e.g. "pytorch_model.<variant>.safetensors". "variant" in the case
of --vae does not default to the value of --variant to prevent failures during
common use cases.
The "subfolder" argument specifies the VAE model subfolder, if specified when
loading from a Hugging Face repository or folder, weights will be loaded from the
specified subfolder.
The "dtype" argument specifies the VAE model precision, it defaults to the value of
-t/--dtype and should be one of: auto, bfloat16, float16, or float32.
If you wish to load a weights file directly from disk, the simplest way is: --vae
"AutoencoderKL;my_vae.safetensors", or with a dtype
"AutoencoderKL;my_vae.safetensors;dtype=float16". All loading arguments except
"dtype" are unused in this case and may produce an error message if used.
If you wish to load a specific weight file from a Hugging Face repository, use the
blob link loading syntax: --vae
"AutoencoderKL;https://huggingface.co/UserName/repository-name/blob/main/vae_model.safetensors",
the "revision" argument may be used with this syntax.
-------
-vt, --vae-tiling Enable VAE tiling. Assists in the generation of large images with lower memory
overhead. The VAE will split the input tensor into tiles to compute decoding and
encoding in several steps. This is useful for saving a large amount of memory and to
allow processing larger images. Note that if you are using --control-nets you may
still run into memory issues generating large images, or with --batch-size greater
than 1.
-------
-vs, --vae-slicing Enable VAE slicing. Assists in the generation of large images with lower memory
overhead. The VAE will split the input tensor in slices to compute decoding in
several steps. This is useful to save some memory, especially when --batch-size is
greater than 1. Note that if you are using --control-nets you may still run into
memory issues generating large images.
--------------------------------------
-lra LORA_URI [LORA_URI ...], --loras LORA_URI [LORA_URI ...]
Specify one or more LoRA models using URIs. These should be a Hugging Face
repository slug, path to model file on disk (for example, a .pt, .pth, .bin, .ckpt,
or .safetensors file), or model folder containing model files.
If a LoRA model file exists at a URL which serves the file as a raw download, you
may provide an http/https link to it and it will be downloaded to dgenerate's web
cache.
Hugging Face blob links are not supported, see "subfolder" and "weight-name" below
instead.
Optional arguments can be provided after a LoRA model specification, these are:
"scale", "revision", "subfolder", and "weight-name".
They can be specified as so in any order, they are not positional:
"huggingface/lora;scale=1.0;revision=main;subfolder=repo_subfolder;weight-name=lora.safetensors".
The "scale" argument indicates the scale factor of the LoRA.
The "revision" argument specifies the model revision to use for the LoRA when
loading from Hugging Face repository, (The Git branch / tag, default is "main").
The "subfolder" argument specifies the LoRA model subfolder, if specified when
loading from a Hugging Face repository or folder, weights will be loaded from the
specified subfolder.
The "weight-name" argument indicates the name of the weights file to be loaded when
loading from a Hugging Face repository or folder on disk.
If you wish to load a weights file directly from disk, the simplest way is: --loras
"my_lora.safetensors", or with a scale "my_lora.safetensors;scale=1.0", all other
loading arguments are unused in this case and may produce an error message if used.
-----------------------------------------------------------------------------------
-lrfs LORA_FUSE_SCALE, --lora-fuse-scale LORA_FUSE_SCALE
LoRA weights are merged into the main model at this scale. When specifying multiple
LoRA models, they are fused together into one set of weights using their individual
scale values, after which they are fused into the main model at this scale value.
(default: 1.0).
---------------
-ie IMAGE_ENCODER_URI, --image-encoder IMAGE_ENCODER_URI
Specify an Image Encoder using a URI.
Image Encoders are used with --ip-adapters models, and must be specified if none of
the loaded --ip-adapters contain one. An error will be produced in this situation,
which requires you to use this argument.
An image encoder can also be manually specified for Stable Cascade models.
Examples: "huggingface/image_encoder", "huggingface/image_encoder;revision=main",
"image_encoder_folder_on_disk".
Blob links / single file loads are not supported for Image Encoders.
The "revision" argument specifies the model revision to use for the Image Encoder
when loading from a Hugging Face repository (the Git branch / tag, default is
"main").
The "variant" argument specifies the Image Encoder model variant. If "variant" is
specified when loading from a Hugging Face repository or folder, weights will be
loaded from "variant" filename, e.g. "pytorch_model.<variant>.safetensors".
Similar to --vae, "variant" does not default to the value of --variant in order to
prevent errors with common use cases. If you specify multiple IP Adapters, they must
all have the same "variant" value or you will receive a usage error.
The "subfolder" argument specifies the Image Encoder model subfolder. If specified
when loading from a Hugging Face repository or folder, weights will be loaded from
the specified subfolder.
The "dtype" argument specifies the Image Encoder model precision, it defaults to the
value of -t/--dtype and should be one of: auto, bfloat16, float16, or float32.
If you wish to load weights directly from a path on disk, you must point this
argument at the folder they exist in, which should also contain the config.json file
for the Image Encoder. For example, a downloaded repository folder from Hugging
Face.
-----
-ipa IP_ADAPTER_URI [IP_ADAPTER_URI ...], --ip-adapters IP_ADAPTER_URI [IP_ADAPTER_URI ...]
Specify one or more IP Adapter models using URIs. These should be a Hugging Face
repository slug, path to model file on disk (for example, a .pt, .pth, .bin, .ckpt,
or .safetensors file), or model folder containing model files.
If an IP Adapter model file exists at a URL which serves the file as a raw download,
you may provide an http/https link to it and it will be downloaded to dgenerate's web
cache.
Hugging Face blob links are not supported, see "subfolder" and "weight-name" below
instead.
Optional arguments can be provided after an IP Adapter model specification, these
are: "scale", "revision", "subfolder", and "weight-name".
They can be specified as so in any order, they are not positional: "huggingface/ip-
adapter;scale=1.0;revision=main;subfolder=repo_subfolder;weight-
name=ip_adapter.safetensors".
The "scale" argument indicates the scale factor of the IP Adapter.
The "revision" argument specifies the model revision to use for the IP Adapter when
loading from Hugging Face repository, (The Git branch / tag, default is "main").
The "subfolder" argument specifies the IP Adapter model subfolder. If specified when
loading from a Hugging Face repository or folder, weights will be loaded from the
specified subfolder.
The "weight-name" argument indicates the name of the weights file to be loaded when
loading from a Hugging Face repository or folder on disk.
If you wish to load a weights file directly from disk, the simplest way is: --ip-
adapters "ip_adapter.safetensors", or with a scale
"ip_adapter.safetensors;scale=1.0", all other loading arguments are unused in this
case and may produce an error message if used.
----------------------------------------------
-ti URI [URI ...], --textual-inversions URI [URI ...]
Specify one or more Textual Inversion models using URIs. These should be a Hugging
Face repository slug, path to model file on disk (for example, a .pt, .pth, .bin,
.ckpt, or .safetensors file), or model folder containing model files.
If a Textual Inversion model file exists at a URL which serves the file as a raw
download, you may provide an http/https link to it and it will be downloaded to
dgenerate's web cache.
Hugging Face blob links are not supported, see "subfolder" and "weight-name" below
instead.
Optional arguments can be provided after the Textual Inversion model specification,
these are: "token", "revision", "subfolder", and "weight-name".
They can be specified as so in any order, they are not positional:
"huggingface/ti_model;revision=main;subfolder=repo_subfolder;weight-
name=ti_model.safetensors".
The "token" argument can be used to override the prompt token used for the textual
inversion prompt embedding. For normal Stable Diffusion the default token value is
provided by the model itself, but for Stable Diffusion XL and Flux the default token
value is equal to the model file name with no extension and all spaces replaced by
underscores.
The "revision" argument specifies the model revision to use for the Textual
Inversion model when loading from Hugging Face repository, (The Git branch / tag,
default is "main").
The "subfolder" argument specifies the Textual Inversion model subfolder. If
specified when loading from a Hugging Face repository or folder, weights will be
loaded from the specified subfolder.
The "weight-name" argument indicates the name of the weights file to be loaded when
loading from a Hugging Face repository or folder on disk.
If you wish to load a weights file directly from disk, the simplest way is:
--textual-inversions "my_ti_model.safetensors", all other loading arguments are
unused in this case and may produce an error message if used.
-------------------------------------------------------------
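The default "token" rule described above for Stable Diffusion XL and Flux (file name
without extension, spaces replaced by underscores) can be sketched as follows. The
function name default_ti_token is hypothetical and this is only an illustration of
the stated rule, not dgenerate's implementation.

```python
from pathlib import Path


def default_ti_token(model_path: str) -> str:
    """Derive a token from a model file name: drop the extension, replace spaces."""
    return Path(model_path).stem.replace(" ", "_")


print(default_ti_token("my ti model.safetensors"))  # my_ti_model
```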
-cn CONTROLNET_URI [CONTROLNET_URI ...], --control-nets CONTROLNET_URI [CONTROLNET_URI ...]
Specify one or more ControlNet models using URIs. This should be a Hugging Face
repository slug / blob link, path to model file on disk (for example, a .pt, .pth,
.bin, .ckpt, or .safetensors file), or model folder containing model files.
If a ControlNet model file exists at a URL which serves the file as a raw download,
you may provide an http/https link to it and it will be downloaded to dgenerate's web
cache.
Optional arguments can be provided after the ControlNet model specification, these
are: "scale", "start", "end", "mode", "revision", "variant", "subfolder", and "dtype".
They can be specified as so in any order, they are not positional: "huggingface/cont
rolnet;scale=1.0;start=0.0;end=1.0;revision=main;variant=fp16;subfolder=repo_subfold
er;dtype=float16".
The "scale" argument specifies the scaling factor applied to the ControlNet model,
the default value is 1.0.
The "start" argument specifies at what fraction of the total inference steps to
begin applying the ControlNet, defaults to 0.0, IE: the very beginning.
The "end" argument specifies at what fraction of the total inference steps to stop
applying the ControlNet, defaults to 1.0, IE: the very end.
The "mode" argument can be used when using --model-type torch-flux and ControlNet
Union to specify the ControlNet mode. Acceptable values are: "canny", "tile",
"depth", "blur", "pose", "gray", "lq". This value may also be an integer between 0
and 6, inclusive.
The "revision" argument specifies the model revision to use for the ControlNet model
when loading from Hugging Face repository, (The Git branch / tag, default is
"main").
The "variant" argument specifies the ControlNet model variant, if "variant" is
specified when loading from a Hugging Face repository or folder, weights will be
loaded from "variant" filename, e.g. "pytorch_model.<variant>.safetensors". "variant"
defaults to automatic selection. "variant" in the case of --control-nets does not
default to the value of --variant to prevent failures during common use cases.
The "subfolder" argument specifies the ControlNet model subfolder. If specified when
loading from a Hugging Face repository or folder, weights will be loaded from the
specified subfolder.
The "dtype" argument specifies the ControlNet model precision, it defaults to the
value of -t/--dtype and should be one of: auto, bfloat16, float16, or float32.
If you wish to load a weights file directly from disk, the simplest way is:
--control-nets "my_controlnet.safetensors" or --control-nets
"my_controlnet.safetensors;scale=1.0;dtype=float16", all other loading arguments
aside from "scale", "start", "end", and "dtype" are unused in this case and may
produce an error message if used.
If you wish to load a specific weight file from a Hugging Face repository, use the
blob link loading syntax: --control-nets
"https://huggingface.co/UserName/repository-name/blob/main/controlnet.safetensors",
the "revision" argument may be used with this syntax.
-----------------------------------------------------
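To make the "start" and "end" fractions above concrete, here is a minimal Python
sketch of how a fraction of the total inference steps could map to step indices.
The helper controlnet_active_steps is hypothetical, and the boundary handling (a
step counts as active only if it lies entirely inside the start/end window) is an
assumption rather than a statement about dgenerate's internals.

```python
def controlnet_active_steps(num_steps: int, start: float = 0.0, end: float = 1.0) -> list[int]:
    """Return the indices of inference steps during which the ControlNet is applied."""
    active = []
    for i in range(num_steps):
        # Step i covers the interval [i / num_steps, (i + 1) / num_steps].
        if i / num_steps >= start and (i + 1) / num_steps <= end:
            active.append(i)
    return active


print(controlnet_active_steps(10, start=0.2, end=0.8))  # [2, 3, 4, 5, 6, 7]
```

With the defaults (start=0.0, end=1.0) every step is active, which matches the
documented behavior of applying the ControlNet from the very beginning to the very
end.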
-t2i T2I_ADAPTER_URI [T2I_ADAPTER_URI ...], --t2i-adapters T2I_ADAPTER_URI [T2I_ADAPTER_URI ...]
Specify one or more T2IAdapter models using URIs. This should be a Hugging Face
repository slug / blob link, path to model file on disk (for example, a .pt, .pth,
.bin, .ckpt, or .safetensors file), or model folder containing model files.
If a T2IAdapter model file exists at a URL which serves the file as a raw download,
you may provide an http/https link to it and it will be downloaded to dgenerate's web
cache.
Optional arguments can be provided after the T2IAdapter model specification, these
are: "scale", "revision", "variant", "subfolder", and "dtype".
They can be specified as so in any order, they are not positional: "huggingface/t2ia
dapter;scale=1.0;revision=main;variant=fp16;subfolder=repo_subfolder;dtype=float16".
The "scale" argument specifies the scaling factor applied to the T2IAdapter model,
the default value is 1.0.
The "revision" argument specifies the model revision to use for the T2IAdapter model
when loading from Hugging Face repository, (The Git branch / tag, default is
"main").
The "variant" argument specifies the T2IAdapter model variant, if "variant" is
specified when loading from a Hugging Face repository or folder, weights will be
loaded from "variant" filename, e.g. "pytorch_model.<variant>.safetensors". "variant"
defaults to automatic selection. "variant" in the case of --t2i-adapters does not
default to the value of --variant to prevent failures during common use cases.
The "subfolder" argument specifies the T2IAdapter model subfolder. If specified when
loading from a Hugging Face repository or folder, weights will be loaded from the
specified subfolder.
The "dtype" argument specifies the T2IAdapter model precision, it defaults to the
value of -t/--dtype and should be one of: auto, bfloat16, float16, or float32.
If you wish to load a weights file directly from disk, the simplest way is:
--t2i-adapters "my_t2i_adapter.safetensors" or --t2i-adapters
"my_t2i_adapter.safetensors;scale=1.0;dtype=float16", all other loading arguments
aside from "scale" and "dtype" are unused in this case and may produce an error
message if used.
If you wish to load a specific weight file from a Hugging Face repository, use the
blob link loading syntax: --t2i-adapters
"https://huggingface.co/UserName/repository-name/blob/main/t2i_adapter.safetensors",
the "revision" argument may be used with this syntax.
-----------------------------------------------------
-sch SCHEDULER_URI [SCHEDULER_URI ...], --scheduler SCHEDULER_URI [SCHEDULER_URI ...], --schedulers SCHEDULER_URI [SCHEDULER_URI ...]
Specify a scheduler (sampler) by URI. Passing "help" to this argument will print the
compatible schedulers for a model without generating any images. Passing "helpargs"
will yield a help message with a list of overridable arguments for each scheduler
and their typical defaults. Arguments listed by "helpargs" can be overridden using
the URI syntax typical to other dgenerate URI arguments.
You may pass multiple scheduler URIs to this argument, each URI will be tried in
turn.
-----
-pag, --pag Use perturbed attention guidance? This is supported for --model-type torch, torch-
sdxl, and torch-sd3 for most use cases. This enables PAG for the main model using
default scale values.
---------------------
-pags FLOAT [FLOAT ...], --pag-scales FLOAT [FLOAT ...]
One or more perturbed attention guidance scales to try. Specifying values enables
PAG for the main model. (default: [3.0])
----------------------------------------
-pagas FLOAT [FLOAT ...], --pag-adaptive-scales FLOAT [FLOAT ...]
One or more adaptive perturbed attention guidance scales to try. Specifying values
enables PAG for the main model. (default: [0.0])
------------------------------------------------
-rpag, --sdxl-refiner-pag
Use perturbed attention guidance in the SDXL refiner? This is supported for --model-
type torch-sdxl for most use cases. This enables PAG for the SDXL refiner model
using default scale values.
---------------------------
-rpags FLOAT [FLOAT ...], --sdxl-refiner-pag-scales FLOAT [FLOAT ...]
One or more perturbed attention guidance scales to try with the SDXL refiner pass.
Specifying values enables PAG for the refiner. (default: [3.0])
---------------------------------------------------------------
-rpagas FLOAT [FLOAT ...], --sdxl-refiner-pag-adaptive-scales FLOAT [FLOAT ...]
One or more adaptive perturbed attention guidance scales to try with the SDXL
refiner pass. Specifying values enables PAG for the refiner. (default: [0.0])
-----------------------------------------------------------------------------
-mqo, --model-sequential-offload
Force sequential model offloading for the main pipeline, this may drastically reduce
memory consumption and allow large models to run when they would otherwise not fit
in your GPU's VRAM. Inference will be much slower. Mutually exclusive with
--model-cpu-offload
-------------------
-mco, --model-cpu-offload
Force model cpu offloading for the main pipeline, this may reduce memory consumption
and allow large models to run when they would otherwise not fit in your GPU's VRAM.
Inference will be slower. Mutually exclusive with --model-sequential-offload
----------------------------------------------------------------------------
--s-cascade-decoder MODEL_URI
Specify a Stable Cascade (torch-s-cascade) decoder model path using a URI. This
should be a Hugging Face repository slug / blob link, path to model file on disk
(for example, a .pt, .pth, .bin, .ckpt, or .safetensors file), or model folder
containing model files.
Optional arguments can be provided after the decoder model specification, these are:
"revision", "variant", "subfolder", and "dtype".
They can be specified as so in any order, they are not positional: "huggingface/deco
der_model;revision=main;variant=fp16;subfolder=repo_subfolder;dtype=float16".
The "revision" argument specifies the model revision to use for the decoder model
when loading from Hugging Face repository, (The Git branch / tag, default is
"main").
The "variant" argument specifies the decoder model variant and defaults to the value
of --variant. When "variant" is specified when loading from a Hugging Face
repository or folder, weights will be loaded from "variant" filename, e.g.
"pytorch_model.<variant>.safetensors".
The "subfolder" argument specifies the decoder model subfolder. If specified when
loading from a Hugging Face repository or folder, weights will be loaded from the
specified subfolder.
The "dtype" argument specifies the Stable Cascade decoder model precision, it
defaults to the value of -t/--dtype and should be one of: auto, bfloat16, float16,
or float32.
If you wish to load a weights file directly from disk, the simplest way is:
--s-cascade-decoder "my_decoder.safetensors" or --s-cascade-decoder
"my_decoder.safetensors;dtype=float16", all other loading arguments aside from
"dtype" are unused in this case and may produce an error message if used.
If you wish to load a specific weight file from a Hugging Face repository, use the
blob link loading syntax: --s-cascade-decoder
"https://huggingface.co/UserName/repository-name/blob/main/decoder.safetensors", the
"revision" argument may be used with this syntax.
-------------------------------------------------
-dqo, --s-cascade-decoder-sequential-offload
Force sequential model offloading for the Stable Cascade decoder pipeline, this may
drastically reduce memory consumption and allow large models to run when they would
otherwise not fit in your GPU's VRAM. Inference will be much slower. Mutually
exclusive with --s-cascade-decoder-cpu-offload
----------------------------------------------
-dco, --s-cascade-decoder-cpu-offload
Force model cpu offloading for the Stable Cascade decoder pipeline, this may reduce
memory consumption and allow large models to run when they would otherwise not fit
in your GPU's VRAM. Inference will be slower. Mutually exclusive with
--s-cascade-decoder-sequential-offload
--------------------------------------
--s-cascade-decoder-prompts PROMPT [PROMPT ...]
One or more prompts to try with the Stable Cascade decoder model, by default the
decoder model gets the primary prompt, this argument overrides that with a prompt of
your choosing. The negative prompt component can be specified with the same syntax
as --prompts
------------
--s-cascade-decoder-inference-steps INTEGER [INTEGER ...]
One or more inference steps values to try with the Stable Cascade decoder. (default:
[10])
-----
--s-cascade-decoder-guidance-scales INTEGER [INTEGER ...]
One or more guidance scale values to try with the Stable Cascade decoder. (default:
[0])
----
--s-cascade-decoder-scheduler SCHEDULER_URI [SCHEDULER_URI ...], --s-cascade-decoder-schedulers SCHEDULER_URI [SCHEDULER_URI ...]
Specify a scheduler (sampler) by URI for the Stable Cascade decoder pass. Operates
the exact same way as --scheduler including the "help" option. Passing 'helpargs'
will yield a help message with a list of overridable arguments for each scheduler
and their typical defaults. Defaults to the value of --scheduler.
You may pass multiple scheduler URIs to this argument, each URI will be tried in
turn.
-----
--sdxl-refiner MODEL_URI
Specify a Stable Diffusion XL (torch-sdxl) refiner model path using a URI. This
should be a Hugging Face repository slug / blob link, path to model file on disk
(for example, a .pt, .pth, .bin, .ckpt, or .safetensors file), or model folder
containing model files.
Optional arguments can be provided after the SDXL refiner model specification, these
are: "revision", "variant", "subfolder", and "dtype".
They can be specified as so in any order, they are not positional: "huggingface/refi
ner_model_xl;revision=main;variant=fp16;subfolder=repo_subfolder;dtype=float16".
The "revision" argument specifies the model revision to use for the refiner model
when loading from Hugging Face repository, (The Git branch / tag, default is
"main").
The "variant" argument specifies the SDXL refiner model variant and defaults to the
value of --variant. When "variant" is specified when loading from a Hugging Face
repository or folder, weights will be loaded from "variant" filename, e.g.
"pytorch_model.<variant>.safetensors".
The "subfolder" argument specifies the SDXL refiner model subfolder. If specified
when loading from a Hugging Face repository or folder, weights will be loaded from
the specified subfolder.
The "dtype" argument specifies the SDXL refiner model precision, it defaults to the
value of -t/--dtype and should be one of: auto, bfloat16, float16, or float32.
If you wish to load a weights file directly from disk, the simplest way is: --sdxl-
refiner "my_sdxl_refiner.safetensors" or --sdxl-refiner
"my_sdxl_refiner.safetensors;dtype=float16", all other loading arguments aside from
"dtype" are unused in this case and may produce an error message if used.
If you wish to load a specific weight file from a Hugging Face repository, use the
blob link loading syntax: --sdxl-refiner
"https://huggingface.co/UserName/repository-
name/blob/main/refiner_model.safetensors", the "revision" argument may be used with
this syntax.
------------
-rqo, --sdxl-refiner-sequential-offload
Force sequential model offloading for the SDXL refiner pipeline, this may
drastically reduce memory consumption and allow large models to run when they would
otherwise not fit in your GPU's VRAM. Inference will be much slower. Mutually
exclusive with --sdxl-refiner-cpu-offload
-----------------------------------------
-rco, --sdxl-refiner-cpu-offload
Force model cpu offloading for the SDXL refiner pipeline, this may reduce memory
consumption and allow large models to run when they would otherwise not fit in your
GPU's VRAM. Inference will be slower. Mutually exclusive with
--sdxl-refiner-sequential-offload
---------------------------------
--sdxl-refiner-scheduler SCHEDULER_URI [SCHEDULER_URI ...], --sdxl-refiner-schedulers SCHEDULER_URI [SCHEDULER_URI ...]
Specify a scheduler (sampler) by URI for the SDXL refiner pass. Operates the exact
same way as --scheduler including the "help" option. Passing 'helpargs' will yield a
help message with a list of overridable arguments for each scheduler and their
typical defaults. Defaults to the value of --scheduler.
You may pass multiple scheduler URIs to this argument, each URI will be tried in
turn.
-----
--sdxl-refiner-edit Force the SDXL refiner to operate in edit mode instead of cooperative denoising mode
as it would normally do for inpainting and ControlNet usage. The main model will
perform the full amount of inference steps requested by --inference-steps. The
output of the main model will be passed to the refiner model and processed with an
image seed strength in img2img mode determined by (1.0 - high-noise-fraction)
-----------------------------------------------------------------------------
--sdxl-second-prompts PROMPT [PROMPT ...]
One or more secondary prompts to try using SDXL's secondary text encoder. By default
the model is passed the primary prompt for this value, this option allows you to
choose a different prompt. The negative prompt component can be specified with the
same syntax as --prompts
------------------------
--sdxl-t2i-adapter-factors FLOAT [FLOAT ...]
One or more SDXL specific T2I adapter factors to try, this controls the amount of
time-steps for which a T2I adapter applies guidance to an image, this is a value
between 0.0 and 1.0. A value of 0.5 for example indicates that the T2I adapter is
only active for half the amount of time-steps it takes to completely render an
image.
------
--sdxl-aesthetic-scores FLOAT [FLOAT ...]
One or more Stable Diffusion XL (torch-sdxl) "aesthetic-score" micro-conditioning
parameters. Used to simulate an aesthetic score of the generated image by
influencing the positive text condition. Part of SDXL's micro-conditioning as
explained in section 2.2 of [https://huggingface.co/papers/2307.01952].
-----------------------------------------------------------------------
--sdxl-crops-coords-top-left COORD [COORD ...]
One or more Stable Diffusion XL (torch-sdxl) "crops-coords-top-left"
micro-conditioning parameters in the format "0,0". --sdxl-crops-coords-top-left can
be used to generate an image that appears to be "cropped" from the position
--sdxl-crops-coords-top-left downwards. Favorable, well-centered images are usually
achieved by setting --sdxl-crops-coords-top-left to "0,0". Part of SDXL's
micro-conditioning as explained in section 2.2 of
[https://huggingface.co/papers/2307.01952].
-------------------------------------------
--sdxl-original-size SIZE [SIZE ...], --sdxl-original-sizes SIZE [SIZE ...]
One or more Stable Diffusion XL (torch-sdxl) "original-size" micro-conditioning
parameters in the format (WIDTH)x(HEIGHT). If not the same as --sdxl-target-size the
image will appear to be down or up-sampled. --sdxl-original-size defaults to
--output-size or the size of any input images if not specified. Part of SDXL's
micro-conditioning as explained in section 2.2 of
[https://huggingface.co/papers/2307.01952]
------------------------------------------
--sdxl-target-size SIZE [SIZE ...], --sdxl-target-sizes SIZE [SIZE ...]
One or more Stable Diffusion XL (torch-sdxl) "target-size" micro-conditioning
parameters in the format (WIDTH)x(HEIGHT). For most cases, --sdxl-target-size should
be set to the desired height and width of the generated image. If not specified it
will default to --output-size or the size of any input images. Part of SDXL's micro-
conditioning as explained in section 2.2 of
[https://huggingface.co/papers/2307.01952]
------------------------------------------
--sdxl-negative-aesthetic-scores FLOAT [FLOAT ...]
One or more Stable Diffusion XL (torch-sdxl) "negative-aesthetic-score" micro-
conditioning parameters. Part of SDXL's micro-conditioning as explained in section
2.2 of [https://huggingface.co/papers/2307.01952]. Can be used to simulate an
aesthetic score of the generated image by influencing the negative text condition.
----------------------------------------------------------------------------------
--sdxl-negative-original-sizes SIZE [SIZE ...]
One or more Stable Diffusion XL (torch-sdxl) "negative-original-sizes" micro-
conditioning parameters. Negatively condition the generation process based on a
specific image resolution. Part of SDXL's micro-conditioning as explained in section
2.2 of [https://huggingface.co/papers/2307.01952]. For more information, refer to
this issue thread: https://github.com/huggingface/diffusers/issues/4208
-----------------------------------------------------------------------
--sdxl-negative-target-sizes SIZE [SIZE ...]
One or more Stable Diffusion XL (torch-sdxl) "negative-target-sizes"
micro-conditioning parameters. Negatively condition the generation process based on
a target image resolution. It should be the same as "--sdxl-target-size" for most
cases. Part of SDXL's micro-conditioning as explained in section 2.2 of
[https://huggingface.co/papers/2307.01952]. For more information, refer to this
issue thread: https://github.com/huggingface/diffusers/issues/4208.
-------------------------------------------------------------------
--sdxl-negative-crops-coords-top-left COORD [COORD ...]
One or more Stable Diffusion XL (torch-sdxl) "negative-crops-coords-top-left" micro-
conditioning parameters in the format "0,0". Negatively condition the generation
process based on specific crop coordinates. Part of SDXL's micro-conditioning as
explained in section 2.2 of [https://huggingface.co/papers/2307.01952]. For more
information, refer to this issue thread:
https://github.com/huggingface/diffusers/issues/4208.
-----------------------------------------------------
--sdxl-refiner-prompts PROMPT [PROMPT ...]
One or more prompts to try with the SDXL refiner model, by default the refiner model
gets the primary prompt, this argument overrides that with a prompt of your
choosing. The negative prompt component can be specified with the same syntax as
--prompts
---------
--sdxl-refiner-clip-skips INTEGER [INTEGER ...]
One or more clip skip override values to try for the SDXL refiner, which normally
uses the clip skip value for the main model when it is defined by --clip-skips.
-------------------------------------------------------------------------------
--sdxl-refiner-second-prompts PROMPT [PROMPT ...]
One or more prompts to try with the SDXL refiner model's secondary text encoder, by
default the refiner model gets the primary prompt passed to its second text encoder,
this argument overrides that with a prompt of your choosing. The negative prompt
component can be specified with the same syntax as --prompts
------------------------------------------------------------
--sdxl-refiner-aesthetic-scores FLOAT [FLOAT ...]
See: --sdxl-aesthetic-scores, applied to SDXL refiner pass.
-----------------------------------------------------------
--sdxl-refiner-crops-coords-top-left COORD [COORD ...]
See: --sdxl-crops-coords-top-left, applied to SDXL refiner pass.
----------------------------------------------------------------
--sdxl-refiner-original-sizes SIZE [SIZE ...]
See: --sdxl-original-sizes, applied to SDXL refiner pass.
---------------------------------------------------------
--sdxl-refiner-target-sizes SIZE [SIZE ...]
See: --sdxl-target-sizes, applied to SDXL refiner pass.
-------------------------------------------------------
--sdxl-refiner-negative-aesthetic-scores FLOAT [FLOAT ...]
See: --sdxl-negative-aesthetic-scores, applied to SDXL refiner pass.
--------------------------------------------------------------------
--sdxl-refiner-negative-original-sizes SIZE [SIZE ...]
See: --sdxl-negative-original-sizes, applied to SDXL refiner pass.
------------------------------------------------------------------
--sdxl-refiner-negative-target-sizes SIZE [SIZE ...]
See: --sdxl-negative-target-sizes, applied to SDXL refiner pass.
----------------------------------------------------------------
--sdxl-refiner-negative-crops-coords-top-left COORD [COORD ...]
See: --sdxl-negative-crops-coords-top-left, applied to SDXL refiner pass.
-------------------------------------------------------------------------
-hnf FLOAT [FLOAT ...], --sdxl-high-noise-fractions FLOAT [FLOAT ...]
One or more high-noise-fraction values for Stable Diffusion XL (torch-sdxl), this
fraction of inference steps will be processed by the base model, while the rest will
be processed by the refiner model. Multiple values to this argument will result in
additional generation steps for each value. In certain situations when the mixture
of denoisers algorithm is not supported, such as when using --control-nets and
inpainting with SDXL, the inverse proportion of this value IE: (1.0 - high-noise-
fraction) becomes the --image-seed-strengths input to the SDXL refiner. (default:
[0.8])
------
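The arithmetic behind the high-noise-fraction can be sketched in Python. The function
split_sdxl_steps is hypothetical, and the rounding used to split steps between the
base and refiner models is an assumption; only the (1.0 - high-noise-fraction)
relationship for the refiner's image seed strength comes from the description above.

```python
def split_sdxl_steps(inference_steps: int, high_noise_fraction: float = 0.8):
    """Approximate how inference steps divide between the SDXL base and refiner."""
    # The base model handles the first (high noise) fraction of the steps.
    base_steps = round(inference_steps * high_noise_fraction)
    refiner_steps = inference_steps - base_steps
    # When the mixture of denoisers algorithm is unavailable, the refiner instead
    # runs in img2img mode with this image seed strength.
    refiner_strength = 1.0 - high_noise_fraction
    return base_steps, refiner_steps, refiner_strength


base, refiner, strength = split_sdxl_steps(30, 0.8)
print(base, refiner, round(strength, 2))  # 24 6 0.2
```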
-ri INT [INT ...], --sdxl-refiner-inference-steps INT [INT ...]
One or more inference steps values for the SDXL refiner when in use. Override the
number of inference steps used by the SDXL refiner, which defaults to the value
taken from --inference-steps.
-----------------------------
-rg FLOAT [FLOAT ...], --sdxl-refiner-guidance-scales FLOAT [FLOAT ...]
One or more guidance scale values for the SDXL refiner when in use. Override the
guidance scale value used by the SDXL refiner, which defaults to the value taken
from --guidance-scales.
-----------------------
-rgr FLOAT [FLOAT ...], --sdxl-refiner-guidance-rescales FLOAT [FLOAT ...]
One or more guidance rescale values for the SDXL refiner when in use. Override the
guidance rescale value used by the SDXL refiner, which defaults to the value taken
from --guidance-rescales.
-------------------------
-sc, --safety-checker
Enable safety checker loading, this is off by default. When turned on images with
NSFW content detected may result in solid black output. Some pretrained models have
no safety checker model present, in that case this option has no effect.
------------------------------------------------------------------------
-d DEVICE, --device DEVICE
cuda / cpu, or other device supported by torch, for example mps on macOS. (default:
cuda, mps on macOS). Use: cuda:0, cuda:1, cuda:2, etc. to specify a specific cuda
supporting GPU.
---------------
-t DTYPE, --dtype DTYPE
Model precision: auto, bfloat16, float16, or float32. (default: auto)
---------------------------------------------------------------------
-s SIZE, --output-size SIZE
Image output size, for txt2img generation this is the exact output size. The
dimensions specified for this value must be aligned by 8 or you will receive an
error message. If an --image-seeds URI is used its Seed, Mask, and/or Control
component image sources will be resized to this dimension with aspect ratio
maintained before being used for generation by default, except in the case of Stable
Cascade where the images are used as a style prompt (not a noised seed), and can be
of varying dimensions.
If --no-aspect is not specified, width will be fixed and a new height (aligned by 8)
will be calculated for the input images. In most cases resizing the image inputs
will result in an image output of an equal size to the inputs, except for upscalers
and Deep Floyd --model-type values (torch-if*).
If only one integer value is provided, that is the value for both dimensions. X/Y
dimension values should be separated by "x".
This value defaults to 512x512 for Stable Diffusion when no --image-seeds are
specified (IE txt2img mode), 1024x1024 for Stable Cascade and Stable Diffusion 3/XL
or Flux model types, and 64x64 for --model-type torch-if (Deep Floyd stage 1).
Deep Floyd stage 1 images passed to superscaler models (--model-type torch-ifs*)
that are specified with the 'floyd' keyword argument in an --image-seeds definition
are never resized or processed in any way.
------------------------------------------
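The sizing rules described for --output-size can be sketched in Python. This is an illustrative sketch only, not dgenerate's actual implementation: the function names ``parse_output_size`` and ``aspect_resize`` are hypothetical, and rounding the computed height down to a multiple of 8 is an assumption (the help text only states that the new height is aligned by 8).

```python
def parse_output_size(value: str, align: int = 8) -> tuple[int, int]:
    """Parse "WIDTHxHEIGHT" or a single integer into (width, height).

    Hypothetical helper: a single integer is used for both dimensions,
    and both dimensions must be aligned by `align`.
    """
    parts = value.lower().split("x")
    if len(parts) == 1:
        width = height = int(parts[0])
    elif len(parts) == 2:
        width, height = int(parts[0]), int(parts[1])
    else:
        raise ValueError(f"invalid size: {value!r}")
    if width % align or height % align:
        raise ValueError(f"dimensions must be aligned by {align}: {value!r}")
    return width, height


def aspect_resize(src: tuple[int, int], target_width: int, align: int = 8) -> tuple[int, int]:
    """Keep the width fixed and compute a new aligned height.

    Assumption: the scaled height is rounded down to the nearest
    multiple of `align`; the real tool may round differently.
    """
    src_w, src_h = src
    scaled_h = round(src_h * target_width / src_w)
    aligned_h = max(align, scaled_h - scaled_h % align)
    return target_width, aligned_h
```

Under these assumptions, a 1920x1080 input resized for a 512 wide output yields 512x288, which keeps the 16:9 aspect ratio and is aligned by 8.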
-na, --no-aspect This option disables aspect correct resizing of images provided to --image-seeds
globally. Seed, Mask, and Control guidance images will be resized to the closest
dimension specified by --output-size that is aligned by 8 pixels with no
consideration of the source aspect ratio. This can be overridden at the --image-seeds
level with the image seed keyword argument 'aspect=true/false'.
---------------------------------------------------------------
-o PATH, --output-path PATH
Output path for generated images and files. This directory will be created if it
does not exist. (default: ./output)
-----------------------------------
-op PREFIX, --output-prefix PREFIX
Name prefix for generated images and files. This prefix will be added to the
beginning of every generated file, followed by an underscore.
-------------------------------------------------------------
-ox, --output-overwrite
Enable overwriting of files that already exist in the output directory. The default
behavior is not to overwrite, and instead to append a filename suffix:
"_duplicate_(number)" when it is detected that the generated file name already
exists.
-------
-oc, --output-configs
Write a configuration text file for every output image or animation. The text file
can be used to reproduce that particular output image or animation by piping it to
dgenerate STDIN or by using the --file option, for example "dgenerate < config.dgen"
or "dgenerate --file config.dgen". These files will be written to --output-path and
are affected by --output-prefix and --output-overwrite as well. The files will be
named after their corresponding image or animation file. Configuration files
produced for animation frame images will utilize --frame-start and --frame-end to
specify the frame number.
-------------------------
-om, --output-metadata
Write the information produced by --output-configs to the PNG metadata of each
image. Metadata will not be written to animated files (yet). The data is written to
a PNG metadata property named DgenerateConfig and can be read using ImageMagick like
so: 'magick identify -format "%[Property:DgenerateConfig]" generated_file.png'
------------------------------------------------------------------------------
-pw PROMPT_WEIGHTER_URI, --prompt-weighter PROMPT_WEIGHTER_URI
Specify a prompt weighter implementation by URI, for example: --prompt-weighter
compel, or --prompt-weighter sd-embed. By default, no prompt weighting syntax is
enabled, meaning that you cannot adjust token weights as you may be able to do in
software such as ComfyUI, Automatic1111, CivitAI, etc., and in some cases the length
of your prompt is limited. Prompt weighters support these special token weighting
syntaxes and long prompts. Currently there are two implementations: "compel" and
"sd-embed". See: --prompt-weighter-help for a list of implementation names. You may also
use --prompt-weighter-help "name" to see comprehensive documentation for a specific
prompt weighter implementation.
-------------------------------
--prompt-weighter-help [PROMPT_WEIGHTER_NAMES ...]
Use this option alone (or with --plugin-modules) and no model specification in order
to list available prompt weighter names. Specifying one or more prompt weighter
names after this option will cause usage documentation for the specified prompt
weighters to be printed. When used with --plugin-modules, prompt weighters
implemented by the specified plugins will also be listed.
---------------------------------------------------------
-p PROMPT [PROMPT ...], --prompts PROMPT [PROMPT ...]
One or more prompts to try; an image group is generated for each prompt. Prompt data
is split by ; (semicolon). The first value is the positive text influence, i.e. things
you want to see. The second value is the negative influence, i.e. things you don't want
to see. Example: --prompts "photo of a horse in a field; artwork, painting, rain".
(default: [(empty string)])
---------------------------
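The positive / negative prompt split described for --prompts can be illustrated with a small Python sketch. The helper name ``split_prompt`` is hypothetical, and splitting on the first semicolon, stripping surrounding whitespace, and using an empty negative prompt when no semicolon is present are all assumptions made for illustration, not confirmed behavior of dgenerate.

```python
def split_prompt(prompt: str) -> tuple[str, str]:
    """Split a prompt into (positive, negative) parts.

    Assumption: only the first ';' separates the two parts, and the
    negative part is empty when no ';' is present.
    """
    positive, _, negative = prompt.partition(";")
    return positive.strip(), negative.strip()


# The example prompt from the help text above.
positive, negative = split_prompt("photo of a horse in a field; artwork, painting, rain")
```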
--sd3-max-sequence-length INTEGER
The maximum number of prompt tokens that the T5EncoderModel (third text encoder) of
Stable Diffusion 3 can handle. This should be an integer value between 1 and 512
inclusive. The higher the value the more resources and time are required for
processing. (default: 256)
--------------------------
--sd3-second-prompts PROMPT [PROMPT ...]
One or more secondary prompts to try using the torch-sd3 (Stable Diffusion 3)
secondary text encoder. By default the model is passed the primary prompt for this
value, this option allows you to choose a different prompt. The negative prompt
component can be specified with the same syntax as --prompts
------------------------------------------------------------
--sd3-third-prompts PROMPT [PROMPT ...]
One or more tertiary prompts to try using the torch-sd3 (Stable Diffusion 3)
tertiary (T5) text encoder. By default the model is passed the primary prompt for
this value, this option allows you to choose a different prompt. The negative prompt
component can be specified with the same syntax as --prompts
------------------------------------------------------------
--flux-second-prompts PROMPT [PROMPT ...]
One or more secondary prompts to try using the torch-flux (Flux) secondary (T5) text
encoder. By default the model is passed the primary prompt for this value, this
option allows you to choose a different prompt.
-----------------------------------------------
--flux-max-sequence-length INTEGER
The maximum number of prompt tokens that the T5EncoderModel (second text encoder) of
Flux can handle. This should be an integer value between 1 and 512 inclusive. The
higher the value the more resources and time are required for processing. (default:
512)
----
-cs INTEGER [INTEGER ...], --clip-skips INTEGER [INTEGER ...]
One or more clip skip values to try. Clip skip is the number of layers to be skipped
from CLIP while computing the prompt embeddings; it must be a value greater than or
equal to zero. A value of 1 means that the output of the pre-final layer will be
used for computing the prompt embeddings. This is only supported for --model-type
values "torch", "torch-sdxl", and "torch-sd3".
----------------------------------------------
-se SEED [SEED ...], --seeds SEED [SEED ...]
One or more seeds to try. Define fixed seeds to achieve deterministic output. This
argument may not be used when --gse/--gen-seeds is used. (default: [randint(0,
99999999999999)])
-----------------
-sei, --seeds-to-images
When this option is enabled, each provided --seeds value or value generated by
--gen-seeds is used for the corresponding image input given by --image-seeds. If the
number of --seeds given differs from the number of --image-seeds given, the seed is
determined as: seed = seeds[image_seed_index % len(seeds)], i.e.
it wraps around.
----------------
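The wrap-around rule quoted for --seeds-to-images can be shown with a short Python sketch. The expression ``seeds[image_seed_index % len(seeds)]`` comes from the help text; the function name ``seed_for_image``, the seed values, and the file names are hypothetical and exist only for illustration.

```python
def seed_for_image(seeds: list[int], image_seed_index: int) -> int:
    """Pick the seed for an image input, wrapping around the seed list."""
    return seeds[image_seed_index % len(seeds)]


# Two seeds and three image inputs: the third input reuses the first seed.
seeds = [11, 22]
image_seeds = ["a.png", "b.png", "c.png"]
pairs = [(img, seed_for_image(seeds, i)) for i, img in enumerate(image_seeds)]
```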
-gse COUNT, --gen-seeds COUNT
Auto generate N random seeds to try. This argument may not be used when -se/--seeds
is used.
--------
-af FORMAT, --animation-format FORMAT
Output format when generating an animation from an input video / gif / webp etc.
Value must be one of: mp4, png, apng, gif, or webp. You may also specify "frames" to
indicate that only frames should be output and no coalesced animation file should be
rendered. (default: mp4)
------------------------
-if FORMAT, --image-format FORMAT
Output format when writing static images. Any selection other than "png" is not
compatible with --output-metadata. Value must be one of: png, apng, blp, bmp, dib,
bufr, pcx, dds, ps, eps, gif, grib, h5, hdf, jp2, j2k, jpc, jpf, jpx, j2c, icns,
ico, im, jfif, jpe, jpg, jpeg, tif, tiff, mpo, msp, palm, pdf, pbm, pgm, ppm, pnm,
pfm, bw, rgb, rgba, sgi, tga, icb, vda, vst, webp, wmf, emf, or xbm. (default: png)
-----------------------------------------------------------------------------------
-nf, --no-frames Do not write frame images individually when rendering an animation, only write the
animation file. This option is incompatible with --animation-format frames.
---------------------------------------------------------------------------
-fs FRAME_NUMBER, --frame-start FRAME_NUMBER
Starting frame slice point for animated files (zero-indexed), the specified frame
will be included. (default: 0)
------------------------------
-fe FRAME_NUMBER, --frame-end FRAME_NUMBER
Ending frame slice point for animated files (zero-indexed), the specified frame will
be included.
------------
-is SEED [SEED ...], --image-seeds SEED [SEED ...]
One or more image seed URIs to process, these may consist of URLs or file paths.
Videos / GIFs / WEBP files will result in frames being rendered as well as an
animated output file being generated if more than one frame is available in the
input file. Inpainting for static images can be achieved by specifying a black and
white mask image in each image seed string using a semicolon as the separating
character, like so: "my-seed-image.png;my-image-mask.png". White areas of the mask
indicate where generated content is to be placed in your seed image.
Output dimensions specific to the image seed can be specified by placing the
dimension at the end of the string following a semicolon like so: "my-seed-
image.png;512x512" or "my-seed-image.png;my-image-mask.png;512x512". When using
--control-nets, a singular image specification is interpreted as the control
guidance image, and you can specify multiple control image sources by separating
them with commas in the case where multiple ControlNets are specified, IE: (--image-
seeds "control-image1.png, control-image2.png") OR (--image-seeds
"seed.png;control=control-image1.png, control-image2.png").
Using --control-nets with img2img or inpainting can be accomplished with the syntax:
"my-seed-image.png;mask=my-image-mask.png;control=my-control-
image.png;resize=512x512". The "mask" and "resize" arguments are optional when using
--control-nets. Videos, GIFs, and WEBP are also supported as inputs when using
--control-nets, even for the "control" argument.
--image-seeds is capable of reading from multiple animated files at once or any
combination of animated files and images; the animated file with the fewest
frames dictates how many frames are generated and static images are duplicated over
th