https://github.com/if-ai/ComfyUI-IF_DatasetMkr
Create video datasets to train your video models, using a YouTube link or a video file path.
- Host: GitHub
- URL: https://github.com/if-ai/ComfyUI-IF_DatasetMkr
- Owner: if-ai
- Created: 2025-03-08T17:12:06.000Z (about 1 month ago)
- Default Branch: main
- Last Pushed: 2025-03-08T18:04:30.000Z (about 1 month ago)
- Last Synced: 2025-03-08T18:25:10.477Z (about 1 month ago)
- Language: Python
- Size: 42 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
- awesome-comfyui - **IF_DatasetMkr**
README
# ComfyUI-IF_VideoDatasetMaker
An advanced ComfyUI node that converts video files or YouTube links into well-structured training datasets for AI video generation models.
This tool is based on the work done by [zsxkib](https://github.com/zsxkib) in their repository [cog-create-video-dataset](https://github.com/zsxkib/cog-create-video-dataset).

## Overview
The **IF_VideoDatasetMaker** node allows you to:
- Download YouTube videos or use local video files
- Automatically segment videos into high-quality clips
- Generate intelligent captions using Qwen-VL models that describe the visual content
- Create a training-ready dataset with the proper file structure
- Customize trigger words and captions to fit your specific needs

Perfect for creating training datasets for HyperNetworks, LoRAs, DreamBooth, or other fine-tuning approaches.
## Features
- **Multi-source Input**: Process YouTube links or local video files
- **Intelligent Scene Detection**: Automatically extract the best-quality clips based on content changes (see the sketch after this list)
- **AI-powered Captioning**: Generate detailed, useful captions with multimodal AI
- **Customizable Output**: Control caption prefixes, suffixes, and triggers
- **Structured Organization**: Creates proper dataset structure for immediate training use
- **Automatic Compression**: Generates a ready-to-share ZIP file of your dataset
- **Debugging Options**: Additional debug information to help troubleshoot issues
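The scene-detection step can be pictured with PySceneDetect (`scenedetect`), one of the node's listed dependencies. This is a minimal sketch, not the node's exact pipeline; the input path, threshold, and output template are illustrative assumptions:

```python
# Minimal PySceneDetect sketch (scenedetect>=0.6 API); values are illustrative.
from scenedetect import detect, ContentDetector, split_video_ffmpeg

# Find cut points where frame content changes sharply.
scenes = detect("input.mp4", ContentDetector(threshold=27.0))

# Write each detected scene out as its own clip using FFmpeg.
split_video_ffmpeg("input.mp4", scenes, output_file_template="clip-$SCENE_NUMBER.mp4")
```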
## Installation

### Prerequisites
- ComfyUI installed and working
- Python 3.8+ environment
- FFmpeg installed on your system (required for video processing)

### Installing the Node
1. Clone this repository to your ComfyUI custom_nodes directory:
```bash
cd ./custom_nodes
git clone https://github.com/if-ai/ComfyUI-IF_DatasetMkr.git
```

2. Install the required dependencies:
```bash
cd ComfyUI-IF_DatasetMkr
pip install -r requirements.txt
```

If you want to use AWQ to save VRAM and get up to 3x faster inference, install Triton and AutoAWQ:

```bash
pip install triton
pip install --no-deps --no-build-isolation autoawq
```

3. Restart ComfyUI
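Before restarting, you can optionally confirm that the dependencies resolved in the same Python environment ComfyUI runs in. This is a generic sanity check, not part of the node:

```python
# Run from the same Python environment as ComfyUI.
for pkg in ("torch", "scenedetect", "yt_dlp", "moviepy", "transformers", "qwen_vl_utils"):
    try:
        __import__(pkg)
        print(f"{pkg}: OK")
    except ImportError as exc:
        print(f"{pkg}: missing ({exc})")
```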
## Dependencies
This node requires several packages for video processing, AI captioning, and file management:
- yt-dlp - For YouTube video downloading
- scenedetect - For intelligent clip extraction
- python-slugify - For filename normalization
- moviepy - For video editing and frame extraction
- opencv-python - For image processing
- qwen-vl-utils - For multimodal AI caption generation
- ffmpeg-python - For advanced video processing
- Hugging Face Transformers - For running the Qwen-VL models (see the captioning sketch below)
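As a rough picture of what happens under the hood, captioning a single frame with Transformers and qwen-vl-utils looks roughly like the sketch below. The model ID, frame path, and prompt are examples; the node's actual invocation, offloading, and AWQ handling may differ:

```python
from transformers import Qwen2VLForConditionalGeneration, AutoProcessor
from qwen_vl_utils import process_vision_info

model_id = "Qwen/Qwen2-VL-2B-Instruct"  # example variant, not necessarily the node's default
model = Qwen2VLForConditionalGeneration.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)
processor = AutoProcessor.from_pretrained(model_id)

messages = [{"role": "user", "content": [
    {"type": "image", "image": "file:///path/to/frame.jpg"},  # placeholder frame path
    {"type": "text", "text": "Describe this frame as a training caption."},
]}]
text = processor.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
images, videos = process_vision_info(messages)
inputs = processor(text=[text], images=images, videos=videos,
                   padding=True, return_tensors="pt").to(model.device)

out = model.generate(**inputs, max_new_tokens=128)
# Strip the prompt tokens, keeping only the generated caption.
trimmed = [o[len(i):] for i, o in zip(inputs.input_ids, out)]
print(processor.batch_decode(trimmed, skip_special_tokens=True)[0])
```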
## Usage

1. Add the **IF_VideoDatasetMkr** node to your ComfyUI workflow
2. Configure the node with your desired settings:
- Provide a YouTube URL or local video file path (a download sketch follows these steps)
- Choose your captioning model
- Set custom trigger words if needed
- Select output directory and options
3. Run the workflow to generate your dataset
4. Use the resulting dataset ZIP for training your models
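For YouTube input, the download step relies on yt-dlp. A standalone sketch of that step follows; the URL and options are placeholders, not the node's exact settings:

```python
import yt_dlp

opts = {
    "format": "bestvideo[ext=mp4]+bestaudio[ext=m4a]/mp4",  # prefer an MP4 download
    "outtmpl": "%(title)s.%(ext)s",                         # name the file after the video title
}
with yt_dlp.YoutubeDL(opts) as ydl:
    ydl.download(["https://www.youtube.com/watch?v=VIDEO_ID"])  # replace VIDEO_ID
```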
## Node Parameters

| Parameter | Description |
|-----------|-------------|
| video_url | YouTube/video URL to process |
| video_file | Local video file path to process |
| trigger_word | Custom trigger word (optional) |
| autocaption | Enable/disable AI captioning |
| custom_caption | Static caption to use for all frames |
| autocaption_prefix | Text to add before all generated captions (see the composition sketch below) |
| autocaption_suffix | Text to add after all generated captions |
| output_dir | Custom output directory (defaults to ComfyUI output folder) |
| model_variant | Qwen-VL model to use for captioning |
| model_offload | Toggle CPU offloading to save VRAM |
| hf_token | Hugging Face token for downloading models |
| profile | Captioning profile/persona to use |
| image_size | Resolution for processing frames |
| debug_mode | Enable additional debugging information |
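How the trigger word, prefix, and suffix combine into a final caption is sketched below. The ordering is an assumption for illustration; check the generated caption files for the node's actual behavior:

```python
def compose_caption(generated: str, trigger_word: str = "",
                    prefix: str = "", suffix: str = "") -> str:
    # Hypothetical ordering: trigger word, prefix, model output, suffix.
    parts = [trigger_word, prefix, generated, suffix]
    return " ".join(p.strip() for p in parts if p.strip())

print(compose_caption("a drone shot over a coastline at sunset",
                      trigger_word="ifvid style", prefix="cinematic,", suffix="in 4k"))
# -> ifvid style cinematic, a drone shot over a coastline at sunset in 4k
```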
## Output

The node generates a structured dataset (illustrated in the sketch after this list) with:
- A `videos` folder containing all extracted clips
- A `captions` folder with text files matching each clip's name
- Caption files following the naming convention: `{trigger_word}_{clip_number}.txt`
- A compressed ZIP archive for easy sharing
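For reference, this sketch builds the same structure by hand (paths and captions are placeholders) and zips it the way the node's output is packaged:

```python
import shutil
from pathlib import Path

dataset = Path("output/my_dataset")  # placeholder output directory
(dataset / "videos").mkdir(parents=True, exist_ok=True)
(dataset / "captions").mkdir(parents=True, exist_ok=True)

trigger_word = "ifvid"
for i, caption in enumerate(["first clip caption", "second clip caption"], start=1):
    # Caption files follow the {trigger_word}_{clip_number}.txt convention.
    (dataset / "captions" / f"{trigger_word}_{i}.txt").write_text(caption)

# Compress the dataset folder into a shareable ZIP.
shutil.make_archive(str(dataset), "zip", root_dir=dataset)
```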
## Customization

### Caption Profiles
The node comes with built-in caption profiles that control the style and content of generated descriptions. You can edit these profiles or create your own by modifying the `profiles.json` file.
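As a purely hypothetical illustration of editing `profiles.json` (the real schema is defined by the node; inspect the shipped file before changing it):

```python
import json
from pathlib import Path

profiles_path = Path("custom_nodes/ComfyUI-IF_DatasetMkr/profiles.json")
profiles = json.loads(profiles_path.read_text())

# Key names below are assumptions for illustration only.
profiles["cinematic"] = {
    "system_prompt": "Describe the shot like a cinematographer: framing, light, motion."
}
profiles_path.write_text(json.dumps(profiles, indent=2))
```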
### Trigger Words
For training specialized models, you can set custom trigger words that will be included in the dataset. These can be used later to activate your trained model.
## Troubleshooting
- **Video Download Issues**: Ensure yt-dlp is up to date and the URL is valid (the diagnostic sketch below checks this)
- **FFmpeg Errors**: Make sure FFmpeg is installed on your system and in your PATH
- **Caption Generation Errors**: Check that you have enough VRAM or try a smaller model
- **Missing Clips**: Try enabling debug mode to see detailed processing information
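A quick diagnostic for the most common culprits above; this is a generic check, not a tool shipped with the node:

```python
import shutil
from importlib.metadata import version

import torch

print("ffmpeg on PATH:", shutil.which("ffmpeg") or "NOT FOUND")
print("yt-dlp version:", version("yt-dlp"))  # update with: pip install -U yt-dlp

if torch.cuda.is_available():
    props = torch.cuda.get_device_properties(0)
    print(f"GPU: {props.name}, VRAM: {props.total_memory / 1024**3:.1f} GiB")
else:
    print("No CUDA GPU detected; captioning may be very slow or fail.")
```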
## Requirements

This is a comprehensive list of requirements for the node:
```
# Core dependencies
torch>=2.0.0
pillow>=10.0.0
numpy>=1.24.0
huggingface_hub>=0.26.0

# AutoAWQ with specific version - MUST be installed before transformers
autoawq==0.2.8
flash-attn>=2.0.0;platform_system!="Darwin" # Optional for performance, excluded on macOS

# Transformers - MUST be installed after AutoAWQ
transformers>=4.49.0
accelerate>=0.21.0

# Qwen model dependencies
tokenizers>=0.15.0
safetensors>=0.3.1
qwen-vl-utils[decord]>=0.0.8

# Video processing
opencv-python>=4.8.0
decord>=0.6.0
ffmpeg-python>=0.2.0
imageio_ffmpeg>=0.6.0
moviepy>=2.1.2
scenedetect>=0.6.2

# Downloading
yt-dlp>=2023.3.4

# Utilities
tqdm>=4.66.1
requests>=2.31.0
python-slugify>=8.0.1
psutil>=5.9.0
packaging>=23.1
aiohttp>=3.8.5
dotenv-python>=0.0.1
```

## Support
If you find this tool useful, please consider supporting my work by:
* Starring the repository on GitHub: [ComfyUI-IF_DatasetMkr](https://github.com/if-ai/ComfyUI-IF_DatasetMkr)
* Subscribing to my YouTube channel: [Impact Frames](https://youtube.com/c/impactframes)
* Following me on X: [Impact Frames](https://x.com/impactframes)

Thank You!
## License
This project is licensed under the MIT License - see the LICENSE file for details.