https://github.com/PaddlePaddle/PaddleMIX
Paddle Multimodal Integration and eXploration, supporting mainstream multi-modal tasks, including end-to-end large-scale multi-modal pretrained models and a diffusion model toolbox, with high performance and flexibility.
- Host: GitHub
- URL: https://github.com/PaddlePaddle/PaddleMIX
- Owner: PaddlePaddle
- License: apache-2.0
- Created: 2023-07-05T03:30:12.000Z (almost 3 years ago)
- Default Branch: develop
- Last Pushed: 2025-10-28T13:10:22.000Z (8 months ago)
- Last Synced: 2025-10-28T15:12:23.291Z (8 months ago)
- Topics: aigc, clip, controlnet, deepseek-vl, dit, eva-clip, got-ocr20, image-to-text, internvl2, llava, minicpm-v, multimodal, ppdiffusers, qwen2-vl, sd-xl, sora, stable-diffusion, stablevideodiffusion, text-to-image, text-to-video
- Language: Python
- Homepage:
- Size: 188 MB
- Stars: 703
- Watchers: 22
- Forks: 221
- Open Issues: 150
Metadata Files:
- Readme: README.md
- License: LICENSE
- Citation: CITATION.cff
Awesome Lists containing this project
- Awesome-LLM-VLM-Foundation-Models - PP‑DocBee