https://github.com/ryanontheinside/comfyui_david
Wrapper for Microsoft's DAViD, providing depth and normal estimation, as well as foreground segmentation.
https://github.com/ryanontheinside/comfyui_david
Last synced: 11 months ago
JSON representation
Wrapper for Microsoft's DAViD, providing depth and normal estimation, as well as foreground segmentation.
- Host: GitHub
- URL: https://github.com/ryanontheinside/comfyui_david
- Owner: ryanontheinside
- Created: 2025-07-27T14:40:45.000Z (11 months ago)
- Default Branch: main
- Last Pushed: 2025-07-27T14:52:01.000Z (11 months ago)
- Last Synced: 2025-07-27T16:44:29.172Z (11 months ago)
- Language: Python
- Size: 17.6 KB
- Stars: 1
- Watchers: 0
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
README
# ComfyUI DAViD Custom Node
ComfyUI wrapper for [DAViD (Data-efficient and Accurate Vision Models from Synthetic Data)](https://microsoft.github.io/DAViD) - human-centric computer vision models for depth estimation, foreground segmentation, and surface normal estimation.
## Features
- **Automatic Model Downloads**: Models are automatically downloaded to `models/DAViD/` when first used
- **Multiple Model Options**: Support for both Base (ViT-B/16) and Large (ViT-L/16) variants
- **Efficient Multi-Task Processing**: Single model can perform all three tasks simultaneously
- **ComfyUI Native Outputs**: Proper IMAGE and MASK tensor formats for seamless workflow integration
## Available Nodes
### DAViD Model Loader
Loads and manages DAViD ONNX models with automatic downloading.
**Inputs:**
- `model_type`: Dropdown selection from available models
- `multitask_large` (recommended) - Performs all tasks simultaneously
- `depth_base` / `depth_large` - Depth estimation only
- `foreground_base` / `foreground_large` - Foreground segmentation only
- `normal_base` / `normal_large` - Surface normal estimation only
**Outputs:**
- `model`: DAVID_MODEL type for use with inference nodes
### DAViD Multi-Task Estimator
Performs all three tasks simultaneously using the multi-task model (most efficient).
**Inputs:**
- `image`: IMAGE - Input image to process
- `model`: DAVID_MODEL - Multi-task model from loader
**Outputs:**
- `depth_map`: IMAGE - Relative depth map visualization
- `foreground_mask`: MASK - Human silhouette segmentation
- `normal_map`: IMAGE - Surface normal map visualization
### Individual Task Nodes
#### DAViD Depth Estimator
**Outputs:** `depth_map` (IMAGE)
#### DAViD Foreground Segmenter
**Outputs:** `foreground_mask` (MASK)
#### DAViD Normal Estimator
**Outputs:** `normal_map` (IMAGE)
## Usage Example
1. Add **DAViD Model Loader** node, select `multitask_large`
2. Add **DAViD Multi-Task Estimator** node
3. Connect your input IMAGE to the estimator
4. Connect the model output from loader to estimator
5. Use the three outputs (depth, mask, normals) in your workflow
## Technical Details
- **Input Resolution**: Models expect 384x384 input (automatically handled)
- **Output Format**:
- IMAGE tensors: (1, H, W, 3) in range [0, 1]
- MASK tensors: (1, H, W) in range [0, 1]
- **Model Storage**: Downloaded to `{ComfyUI}/models/DAViD/`
- **Performance**: Multi-task model is most efficient for getting all outputs
## Dependencies
Required packages (automatically installed with DAViD):
- numpy
- onnx
- onnxruntime-gpu
- opencv-python
- torch (from ComfyUI)
## Model Information
All models are trained on synthetic human data and optimized for human-centric scenes. Models are provided under MIT License by Microsoft Research.
For more details, see the [DAViD paper](https://arxiv.org/abs/2507.15365) and [official repository](https://github.com/microsoft/DAViD).