Projects in Awesome Lists tagged with clip
A curated list of projects in awesome lists tagged with clip.
https://github.com/mikel-brostrom/boxmot
BoxMOT: pluggable SOTA tracking modules for segmentation, object detection and pose estimation models
boosttrack botsort bytetrack clip deep-learning deepocsort improvedassociation machine-learning mot mots multi-object-tracking multi-object-tracking-segmentation ocsort oriented-bounding-box-tracking osnet segmentation strongsort tensorrt tracking-by-detection yolo
Last synced: 24 Dec 2025
https://github.com/cvhub520/x-anylabeling
Effortless data labeling with AI support from Segment Anything and other awesome models.
annotation-tool classification clip deep-learning deeplearning depth-estimation grounding-dino image-segmentation labeling-tool llm matting object-detection onnx paddle pose-estimation pytorch resnet sam vlm yolo
Last synced: 13 May 2025
https://github.com/ofa-sys/chinese-clip
Chinese version of CLIP which achieves Chinese cross-modal retrieval and representation generation.
chinese clip computer-vision contrastive-loss coreml-models deep-learning image-text-retrieval multi-modal multi-modal-learning nlp pretrained-models pytorch transformers vision-and-language-pre-training vision-language
Last synced: 29 Apr 2025
https://github.com/OFA-Sys/Chinese-CLIP
Chinese version of CLIP which achieves Chinese cross-modal retrieval and representation generation.
chinese clip computer-vision contrastive-loss coreml-models deep-learning image-text-retrieval multi-modal multi-modal-learning nlp pretrained-models pytorch transformers vision-and-language-pre-training vision-language
Last synced: 02 Apr 2025
https://github.com/marqo-ai/marqo
Unified embedding generation and search engine. Also available on cloud - cloud.marqo.ai
chatgpt clip deep-learning gpt hacktoberfest hnsw information-retrieval knn large-language-models machine-learning machinelearning multi-modal natural-language-processing search-engine semantic-search tensor-search transformers vector-search vision-language visual-search
Last synced: 07 Jan 2026
https://github.com/easychen/pushdeer
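Open-source push notification service that needs no app: scan a code on iOS 14+ and it works. Also supports Quick Apps, iOS and Mac clients, Android clients, and self-made devices.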
app clip notification-service push
Last synced: 12 Apr 2025
https://github.com/CVHub520/X-AnyLabeling
Effortless data labeling with AI support from Segment Anything and other awesome models.
clip deep-learning deeplearning labeling-tool llm onnx paddle pytorch resnet sam yolo
Last synced: 20 Mar 2025
https://github.com/open-mmlab/mmpretrain
OpenMMLab Pre-training Toolbox and Benchmark
beit clip constrastive-learning convnext deep-learning image-classification mae masked-image-modeling mobilenet moco multimodal pretrained-models pytorch resnet self-supervised-learning swin-transformer vision-transformer
Last synced: 24 Dec 2025
https://github.com/yuanzhoulvpi2017/zero_nlp
Chinese NLP solutions (large models, data, models, training, inference)
bert chatglm-6b clip gpt gpt2 huggingface-transformers llama llama2 llava nlp pytorch text-generation transformers
Last synced: 14 May 2025
https://github.com/jingyi0000/vlm_survey
Collection of AWESOME vision-language models for vision tasks
clip computer-vision deep-learning knowledge-distillation multi-modal-model survey transfer-learning vision-language-model
Last synced: 14 Oct 2025
https://github.com/pharmapsychotic/clip-interrogator
Image to prompt with BLIP and CLIP
Last synced: 14 May 2025
https://github.com/rom1504/clip-retrieval
Easily compute clip embeddings and build a clip retrieval system with them
ai clip deep-learning knn multimodal semantic-search
Last synced: 14 May 2025
https://rom1504.github.io/clip-retrieval/?back=https%3A%2F%2Fknn5.laion.ai&index=laion5B&useMclip=false
Easily compute clip embeddings and build a clip retrieval system with them
ai clip deep-learning knn multimodal semantic-search
Last synced: 08 May 2025
https://github.com/open-compass/vlmevalkit
Open-source evaluation toolkit for large multi-modality models (LMMs), supporting 220+ LMMs and 80+ benchmarks
chatgpt claude clip computer-vision evaluation gemini gpt gpt-4v gpt4 large-language-models llava llm multi-modal openai openai-api pytorch qwen vit vqa
Last synced: 13 May 2025
https://github.com/cambrian-mllm/cambrian
Cambrian-1 is a family of multimodal LLMs with a vision-centric design.
chatbot clip computer-vision dino instruction-tuning large-language-models llms mllm multimodal-large-language-models representation-learning
Last synced: 14 May 2025
https://github.com/qin2dim/hcaptcha-challenger
🥂 Gracefully face hCaptcha challenge with multimodal large language model.
agent ai-agents captcha captcha-solver captcha-solving chatgpt clip gemini hcaptcha hcaptcha-solver llm openai playwright yolo
Last synced: 13 May 2025
https://github.com/QIN2DIM/hcaptcha-challenger
🥂 Gracefully face hCaptcha challenge with MoE(ONNX) embedded solution.
clip computer-vision hcaptcha hcaptcha-solver image-segmentation multi-modal multi-modal-learning object-detection onnx onnx-models onnxruntime opencv-python playwright solver yolo yolov5 zero-shot-classification
Last synced: 28 Mar 2025
https://github.com/mbzuai-oryx/video-chatgpt
[ACL 2024 🔥] Video-ChatGPT is a video conversation model capable of generating meaningful conversation about videos. It combines the capabilities of LLMs with a pretrained visual encoder adapted for spatiotemporal video representation. We also introduce a rigorous 'Quantitative Evaluation Benchmarking' for video-based conversational models.
chatbot clip gpt-4 llama llava mulit-modal vicuna video-chatboat video-conversation vision-language vision-language-pretraining
Last synced: 08 Oct 2025
https://github.com/mbzuai-oryx/Video-ChatGPT
[ACL 2024 🔥] Video-ChatGPT is a video conversation model capable of generating meaningful conversation about videos. It combines the capabilities of LLMs with a pretrained visual encoder adapted for spatiotemporal video representation. We also introduce a rigorous 'Quantitative Evaluation Benchmarking' for video-based conversational models.
chatbot clip gpt-4 llama llava mulit-modal vicuna video-chatboat video-conversation vision-language vision-language-pretraining
Last synced: 12 Mar 2025
https://github.com/open-compass/VLMEvalKit
Open-source evaluation toolkit for large vision-language models (LVLMs), supporting ~100 VLMs and 40+ benchmarks
chatgpt claude clip computer-vision evaluation gemini gpt gpt-4v gpt4 large-language-models llava llm multi-modal openai openai-api pytorch qwen vit vqa
Last synced: 20 Jul 2025
https://github.com/unum-cloud/uform
Pocket-Sized Multimodal AI for content understanding and generation across multilingual texts, images, and 🔜 video, up to 5x faster than OpenAI CLIP and LLaVA 🖼️ & 🖋️
bert clip clustering contrastive-learning cross-attention huggingface-transformers image-search language-vision llava multi-lingual multimodal neural-network openai openclip pretrained-models pytorch representation-learning semantic-search transformer vector-search
Last synced: 14 May 2025
https://github.com/skalskip/vlms-zero-to-hero
This series will take you on a journey from the fundamentals of NLP and Computer Vision to the cutting edge of Vision-Language Models.
bert-model clip computer-vision embeddings gpt gpt-2 lora natural-language-processing seq2seq vision-language-model word2vec
Last synced: 06 Oct 2025
https://github.com/EdVince/Stable-Diffusion-NCNN
Stable Diffusion in NCNN with C++, supporting txt2img and img2img
android clip cpp diffusion executable img2img mnn ncnn onnx stable-diffusion tensorrt tnn txt2img
Last synced: 13 Apr 2025
https://github.com/haltakov/natural-language-image-search
Search photos on Unsplash using natural language
clip computer-vision image-search machine-learning photos unsplash
Last synced: 01 Apr 2025
https://github.com/haltakov/natural-language-youtube-search
Search inside YouTube videos using natural language
clip computer-vision machine-learning search youtube
Last synced: 15 Mar 2025
https://github.com/omerbt/text2live
Official PyTorch Implementation for "Text2LIVE: Text-Driven Layered Image and Video Editing" (ECCV 2022 Oral)
clip eccv2022 generative-model image-editing image-manipulation single-image single-video text-driven-editing text2live video-editing
Last synced: 13 Apr 2025
https://github.com/omerbt/Text2LIVE
Official PyTorch Implementation for "Text2LIVE: Text-Driven Layered Image and Video Editing" (ECCV 2022 Oral)
clip eccv2022 generative-model image-editing image-manipulation single-image single-video text-driven-editing text2live video-editing
Last synced: 28 Mar 2025
https://github.com/hila-chefer/transformer-mm-explainability
[ICCV 2021 Oral] Official PyTorch implementation for Generic Attention-model Explainability for Interpreting Bi-Modal and Encoder-Decoder Transformers, a novel method to visualize any Transformer-based network. Including examples for DETR, VQA.
clip detr explainability explainable-ai interpretability lxmert transformer transformers visualbert visualization vqa
Last synced: 12 Apr 2025
https://github.com/hila-chefer/Transformer-MM-Explainability
[ICCV 2021 Oral] Official PyTorch implementation for Generic Attention-model Explainability for Interpreting Bi-Modal and Encoder-Decoder Transformers, a novel method to visualize any Transformer-based network. Including examples for DETR, VQA.
clip detr explainability explainable-ai interpretability lxmert transformer transformers visualbert visualization vqa
Last synced: 03 Apr 2025
https://github.com/ArrowLuo/CLIP4Clip
An official implementation for "CLIP4Clip: An Empirical Study of CLIP for End to End Video Clip Retrieval"
activitynet clip didemo lsmdc msrvtt msvd multimodal multimodal-learning multimodality ranking retrieval retrieval-model search video-clip-retrieval video-text-retrieval
Last synced: 03 Apr 2025
https://github.com/eps696/aphantasia
CLIP + FFT/DWT/RGB = text to image/video
clip text-to-image text-to-video
Last synced: 07 Apr 2025
https://github.com/pengsongyou/openscene
[CVPR'23] OpenScene: 3D Scene Understanding with Open Vocabularies
3d-scene-understanding clip cvpr2023 llm matterport3d nuscenes point-cloud-segmentation point-clouds scannet semantic-segmentation
Last synced: 20 Mar 2025
https://github.com/Sense-GVT/DeCLIP
Supervision Exists Everywhere: A Data Efficient Contrastive Language-Image Pre-training Paradigm
big-model clip image-text multi-model self-supervised vision-language-pretraining zero-shot
Last synced: 03 Apr 2025
https://github.com/pablosichert/react-truncate
React component for truncating multi-line spans and adding an ellipsis.
Last synced: 16 May 2025
https://github.com/leondgarse/keras_cv_attention_models
Keras beit,caformer,CMT,CoAtNet,convnext,davit,dino,efficientdet,edgenext,efficientformer,efficientnet,eva,fasternet,fastervit,fastvit,flexivit,gcvit,ghostnet,gpvit,hornet,hiera,iformer,inceptionnext,lcnet,levit,maxvit,mobilevit,moganet,nat,nfnets,pvt,swin,tinynet,tinyvit,uniformer,volo,vanillanet,yolor,yolov7,yolov8,yolox,gpt2,llama2, alias kecam
attention clip coco ddpm detection imagenet keras model recognition segment-anything stable-diffusion tensorflow tf tf2 visualizing
Last synced: 08 Apr 2025
https://github.com/microsoft/llm2clip
LLM2CLIP makes SOTA pretrained CLIP models more SOTA than ever.
clip fundation-models multimodality
Last synced: 11 Apr 2025
https://github.com/v-iashin/video_features
Extract video features from raw videos using multiple GPUs. We support RAFT flow frames as well as S3D, I3D, R(2+1)D, VGGish, CLIP, and TIMM models.
audio-features clip feature-extraction i3d ig65m laion multi-gpu optical-flow parallel pytorch r2plus1d raft resnet s3d swin timm vggish video-features visual-features vit
Last synced: 02 Apr 2025
https://github.com/devhotteok/TwitchLink
Twitch Stream & Video & Clip Downloader/Recorder. This GUI downloader helps you download and record Twitch videos, including broadcasts and VODs.
broadcast clip downloader gui live m3u8 m3u8-downloader recorder stream twitch twitch-downloader video vod
Last synced: 16 May 2025
https://github.com/greyovo/picquery
🔍 Search local images with natural language on Android, powered by OpenAI's CLIP model.
android clip image-text-retrieval image-text-search jetpack-compose material-design-3 openai
Last synced: 16 May 2025
https://github.com/xmed-lab/CLIP_Surgery
CLIP Surgery for Better Explainability with Enhancement in Open-Vocabulary Tasks
clip explainability interpretability multilabel multimodal open-vocabulary sam segment-anything segmentation vision-transformer
Last synced: 16 Mar 2025
https://github.com/iceclear/clip-iqa
[AAAI 2023] Exploring CLIP for Assessing the Look and Feel of Images
Last synced: 06 Apr 2025
https://github.com/microsoft/LLM2CLIP
LLM2CLIP makes SOTA pretrained CLIP models more SOTA than ever.
clip fundation-models multimodality
Last synced: 10 Aug 2025
https://github.com/OpenGVLab/Instruct2Act
Instruct2Act: Mapping Multi-modality Instructions to Robotic Actions with Large Language Model
chatgpt clip llm robotics segment-anything
Last synced: 06 May 2025
https://github.com/Chrisvin/EasyReveal
Android Easy Reveal Library
android android-library clip easy easyreveal library reveal reveal-animations
Last synced: 12 Apr 2025
https://github.com/opengvlab/instruct2act
Instruct2Act: Mapping Multi-modality Instructions to Robotic Actions with Large Language Model
chatgpt clip llm robotics segment-anything
Last synced: 20 Apr 2025
https://github.com/wisconsinaivision/vip-llava
[CVPR2024] ViP-LLaVA: Making Large Multimodal Models Understand Arbitrary Visual Prompts
chatbot clip cvpr2024 foundation-models gpt-4 gpt-4-vision llama llama2 llava multi-modal vision-language visual-prompting
Last synced: 06 Apr 2025
https://github.com/baaivision/eve
EVE Series: Encoder-Free Vision-Language Models from BAAI
clip encoder-free-vlm instruction-following large-language-models llm mllm multimodal-large-language-models vision-language-models vlm
Last synced: 12 Apr 2025
https://github.com/poloclub/diffusion-explainer
Diffusion Explainer: Visual Explanation for Text-to-image Stable Diffusion
clip deep-learning generative-model interactive-visualization machine-learning stable-diffusion unet visual-learning visualization
Last synced: 13 May 2025
https://liruiw.github.io/gensim/
Generating Robotic Simulation Tasks via Large Language Models
clip gpt-4 llm pybullet simulation
Last synced: 08 Apr 2025
https://github.com/paddlepaddle/passl
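PASSL includes self-supervised image algorithms such as SimCLR, MoCo v1/v2, BYOL, CLIP, PixPro, simsiam, SwAV, BEiT, and MAE, as well as foundational vision algorithms such as Vision Transformer, DEiT, Swin Transformer, CvT, T2T-ViT, MLP-Mixer, XCiT, ConvNeXt, and PVTv2.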
beit clip convnext cvt deep-learning deit mae moco moco-v2 paddle pixpro pvt self-supervised-learning simclr swav swin-transformer vision-transformer vit xcit
Last synced: 04 Apr 2025
https://github.com/mertyg/vision-language-models-are-bows
Experiments and data for the paper "When and why vision-language models behave like bags-of-words, and what to do about it?" Oral @ ICLR 2023
blip clip compositionality multimodal pytorch vision-language
Last synced: 25 Sep 2025
https://github.com/baaivision/diva
[ICLR 2025] Diffusion Feedback Helps CLIP See Better
clip diffusion visual-perception
Last synced: 08 Oct 2025
https://github.com/mbzuai-oryx/videogpt-plus
Official Repository of paper VideoGPT+: Integrating Image and Video Encoders for Enhanced Video Understanding
chatbot clip dual-encoder gpt4 gpt4o image-encoder llama3 llava multimodal phi-3-mini vicuna video-chatbot video-conversation video-encoder vision-language vision-language-pretraining
Last synced: 07 Apr 2025
https://github.com/yxuansu/MAGIC
Language Models Can See: Plugging Visual Controls in Text Generation
clip gpt-2 image-captioning multimodal plug-and-play-language-models story-generation text-generation unsupervised-learning zero-shot
Last synced: 27 Apr 2025
https://github.com/j-min/clip-caption-reward
PyTorch code for "Fine-grained Image Captioning with CLIP Reward" (Findings of NAACL 2022)
clip image-captioning reinforcement-learning vision-and-language
Last synced: 10 Apr 2025
https://github.com/taited/clip-score
Quick scripts to calculate CLIP text-image similarity
Last synced: 16 May 2025
https://github.com/hila-chefer/targetclip
[ECCV 2022] Official PyTorch implementation of the paper Image-Based CLIP-Guided Essence Transfer.
clip computer-graphics eccv2022 image-editing image-generation image-manipulation stylegan2
Last synced: 08 May 2025
https://github.com/kyegomez/navit
My implementation of "Patch n’ Pack: NaViT, a Vision Transformer for any Aspect Ratio and Resolution"
attention-mechanism clip gpt4 multimodal multimodal-deep-learning multimodal-learning multimodality vit
Last synced: 16 May 2025
https://github.com/mbzuai-oryx/VideoGPT-plus
Official Repository of paper VideoGPT+: Integrating Image and Video Encoders for Enhanced Video Understanding
chatbot clip dual-encoder gpt4 gpt4o image-encoder llama3 llava multimodal phi-3-mini vicuna video-chatbot video-conversation video-encoder vision-language vision-language-pretraining
Last synced: 10 Aug 2025
https://github.com/chao1224/MoleculeSTM
Multi-modal Molecule Structure-text Model for Text-based Editing and Retrieval, Nat Mach Intell 2023 (https://www.nature.com/articles/s42256-023-00759-6)
clip computation-chemistry drug-discovery editing foundation-model molecule-editing moleculeclip moleculestm pretraining retrieval
Last synced: 09 May 2025
https://github.com/chao1224/moleculestm
Multi-modal Molecule Structure-text Model for Text-based Editing and Retrieval, Nat Mach Intell 2023 (https://www.nature.com/articles/s42256-023-00759-6)
clip computation-chemistry drug-discovery editing foundation-model molecule-editing moleculeclip moleculestm pretraining retrieval
Last synced: 13 Apr 2025
https://github.com/zer0int/clip-fine-tune
Fine-tuning code for CLIP models
clip comfyui fine-tune fine-tuning finetune openai sdxl textencoder
Last synced: 28 Apr 2025
https://github.com/haofanwang/natural-language-joint-query-search
Search photos on Unsplash based on OpenAI's CLIP model, support search with joint image+text queries and attention visualization.
attention clip computer-vision image-retrieval image-search multi-modal-search unsplash visualizations
Last synced: 20 Aug 2025
https://github.com/paddlepaddle/paddlemix
Paddle Multimodal Integration and eXploration, supporting mainstream multi-modal tasks, including end-to-end large-scale multi-modal pretrain models and diffusion model toolbox. Equipped with high performance and flexibility.
aigc blip2 clip controlnet dit eva-clip image-to-text llava minigpt4 multimodal ppdiffusers qwen-vl sd-xl sora stable-diffusion stablevideodiffusion text-to-image text-to-video
Last synced: 04 Apr 2025
https://github.com/seeed-projects/tutorial-of-ai-kit-with-raspberry-pi-from-zero-to-hero
This repository provides a comprehensive step-by-step guide to building AI projects using the Raspberry Pi AI Kit.
clip computer-vision hailo8 instance-segmentation object-detection ollama pose-estimation raspberry-pi
Last synced: 04 Apr 2025
https://github.com/Imageomics/bioclip
This is the repository for the BioCLIP model and the TreeOfLife-10M dataset [CVPR'24 Oral, Best Student Paper].
clip computer-vision imageomics knowledge-guided-machine-learning taxonomy
Last synced: 05 Apr 2025
https://github.com/pengtaojiang/segment-anything-clip
Connecting segment-anything's output masks with the CLIP model; Awesome-Segment-Anything-Works
classification clip segment-anything semantic-segmentation
Last synced: 04 Apr 2025
https://github.com/josephrocca/clip-image-sorter
Sort a folder of images according to their similarity with provided text in your browser (uses a browser-ported version of OpenAI's CLIP model and the web's new File System Access API)
clip file-system-access-api openai openai-clip
Last synced: 03 Apr 2025
https://github.com/miccunifi/SEARLE
[ICCV 2023] - Zero-shot Composed Image Retrieval with Textual Inversion
circo cirr clip composed-image-retrieval fashion-iq knowledge-distillation multimodal-learning pytorch textual-inversion
Last synced: 03 Apr 2025
https://github.com/laion-ai/scaling-laws-openclip
Reproducible scaling laws for contrastive language-image learning (https://arxiv.org/abs/2212.07143)
clip deep-learning few-shot-learning fine-tuning laion openclip pre-training pytorch scaling-laws transfer-learning zero-shot-classification zero-shot-retrieval
Last synced: 07 May 2025
https://github.com/fcjian/PromptDet
PromptDet: Towards Open-vocabulary Detection using Uncurated Images, ECCV2022
clip computer-vision eccv2022 novel-categories object-detection prompt-learning pseudo-labeling regional-prompt self-training vocabulary web-image zero-shot-learning
Last synced: 15 Jun 2025
https://github.com/minimaxir/imgbeddings
Python package to generate image embeddings with CLIP without PyTorch/TensorFlow
ai clip embeddings image-processing images onnx transformers
Last synced: 09 Apr 2025
https://github.com/ai-forever/ru-clip
CLIP implementation for Russian language
Last synced: 20 Jun 2025
https://github.com/eddieoz/youtube-clips-automator
MARCELO: an AI-powered bot to automate the editing and thumbnail creation for your YouTube clips channel
ai audio-processing automation bot clip computer-vision editing thumbnail video video-processing youtube
Last synced: 20 Oct 2025
https://github.com/Shishkebaboo/VodRecovery
The purpose of this script is to obtain videos or clips that are either marked as "sub-only" or have been deleted on Twitch.
broadcast clip clips commad-line commandline console development ffmpeg live m3u8 m3u8-playlist m3u8-videos mp4 python recover twitch twitchclips twitchtv vodrecovery
Last synced: 18 Jul 2025
https://github.com/jamjamjon/usls
A Rust library integrated with ONNXRuntime, providing a collection of Computer Vision and Vision-Language models.
clip cuda florence2 grounding-dino imshow moondream ocr onnx onnxruntime rust-yolo sam sapiens smolvlm tensorrt yolo yolo-rs yolo-rust yolov10 yolov11 yolov8
Last synced: 16 May 2025
https://github.com/ylqi/Count-Anything
This method uses Segment Anything and CLIP to ground and count any object that matches a custom text prompt, without requiring any point or box annotation.
clip count-anything segment-anything
Last synced: 23 Aug 2025
https://github.com/hv0905/nekoimagegallery
An AI-powered natural language & reverse Image Search Engine powered by CLIP & qdrant.
clip computer-vision image-search image-search-engine search-engine transformers
Last synced: 06 Apr 2025
https://github.com/HFAiLab/clip-gen
CLIP-GEN: Language-Free Training of a Text-to-Image Generator with CLIP
clip pytorch text-to-image text2image
Last synced: 03 Apr 2025
https://github.com/ajatt-tools/videoclip
🍗 Easily create videoclips with mpv.
addon ajatt audioclip clip mpv mpv-script videoclip
Last synced: 02 Nov 2025
https://github.com/skalskip/transformers
Everything you need to know about Transformers! 🤖
attention-mechanism clip detr gpt transformers visual-transformer
Last synced: 11 Jul 2025
https://github.com/soulteary/simple-image-search-engine
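A simple image search engine. Build your own image search engine in three steps, and learn vector databases, image-to-image search, and text-to-image search.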
clip docker image-search-engine image-similarity picture-search redis redis-vector-search search-engine vector-database vits
Last synced: 06 Jul 2025
https://github.com/DRSY/MoTIS
[NAACL 2022] Mobile Text-to-Image search powered by multimodal semantic representation models (e.g., OpenAI's CLIP)
ai clip cross-modal image-search ios-swift k-means k-means-clustering knn knowledge-distillation lsh naacl random-projection retrieval semantic-search vector-search
Last synced: 08 May 2025
https://github.com/wangrongding/WebCut
🎬 A web-based audio and video editor.
audio audio-editor audio-processing clip cut video video-editor video-processing wasm webcodecs
Last synced: 24 Mar 2025
https://github.com/wangrongding/webcut
🎬 A web-based audio and video editor.
audio audio-editor audio-processing clip cut video video-editor video-processing wasm webcodecs
Last synced: 29 Oct 2025
https://github.com/Ajatt-Tools/videoclip
🍗 Easily create videoclips with mpv.
addon ajatt audioclip clip mpv mpv-script videoclip
Last synced: 10 Jul 2025
https://github.com/salesforce/MUST
PyTorch code for MUST
clip masked-image-modeling self-training unsupervised-learning zero-shot-classification zero-shot-learning
Last synced: 08 May 2025
https://github.com/salesforce/must
PyTorch code for MUST
clip masked-image-modeling self-training unsupervised-learning zero-shot-classification zero-shot-learning
Last synced: 15 Apr 2025
https://github.com/tnwei/vqgan-clip-app
Local image generation using VQGAN-CLIP or CLIP guided diffusion
clip deep-learning generative-art guided-diffusion image-generation streamlit text2image vqgan-clip
Last synced: 13 Apr 2025
https://github.com/foolwood/drl
[arXiv22] Disentangled Representation Learning for Text-Video Retrieval
clip interaction-nets text-video-search-engine transformer video-retrieval
Last synced: 30 Aug 2025
https://github.com/nvidia-ai-iot/clip-distillation
Zero-label image classification via OpenCLIP knowledge distillation
clip distillation inference jetson knowledge nvidia qat sparsity tensorrt
Last synced: 13 Oct 2025
https://github.com/aerobounce/trim.lua
Trim mode for mpv — Turn mpv into Lossless Audio / Video Editor
clip concat ffmpeg lossless lua lua-script mpv mpv-script trim video video-editor video-processing
Last synced: 10 Jul 2025
https://github.com/marqo-ai/marqo-fashionclip
State-of-the-art CLIP/SigLIP embedding models finetuned for the fashion domain. +57% increase in evaluation metrics vs FashionCLIP 2.0.
clip embeddings fashion-classifier fashionclip informationretrieval multimodal recomendations search transformers vectorsearch vision-transformer
Last synced: 30 Jul 2025
https://github.com/sajjjadayobi/CLIPfa
CLIPfa: Connecting Farsi Text and Images
clip farsi farsi-datasets image-search openai-clip persian-nlp zero-shot-learning
Last synced: 08 Jul 2025
https://github.com/pansyjs/video-editing-timeline
Timeline for video editing
clip cut editing timeline video video-clip video-cut video-editing
Last synced: 10 Apr 2025