Projects in Awesome Lists tagged with image-text
A curated list of projects in awesome lists tagged with image-text.
https://github.com/salesforce/albef
Code for ALBEF: a new vision-language pre-training method
contrastive-learning image-text representation-learning vision-and-language weakly-supervised-learning
Last synced: 08 Apr 2025
https://github.com/Sense-GVT/DeCLIP
Supervision Exists Everywhere: A Data Efficient Contrastive Language-Image Pre-training Paradigm
big-model clip image-text multi-model self-supervised vision-language-pretraining zero-shot
Last synced: 03 Apr 2025
https://github.com/google/imageinwords
Data release for the ImageInWords (IIW) paper.
dataset dataset-generation detailed-annotations detailed-descriptions evaluation human-annotation i2t image-captioning image-descriptions image-text image-to-text t2i
Last synced: 26 Nov 2025
https://github.com/x-plug/mplug
mPLUG: Effective and Efficient Vision-Language Learning by Cross-modal Skip-connections. (EMNLP 2022)
image-captioning image-text image-text-retrieval multimodal pretraining pytorch transformer visual-language vqa
Last synced: 26 Jun 2025
https://github.com/TheoCoombes/crawlingathome
A client library for LAION's effort to filter CommonCrawl with CLIP, building a large scale image-text dataset.
clip dall-e dataset dataset-generation image-text machine-learning
Last synced: 03 Apr 2025
https://github.com/antonlukin/poster-editor
Wrapper for PHP's GD Library for easy image manipulation. Support for scaling multi-line text, shapes, filters and smart resize.
composer image-processing image-text intervention php php-class php-gd php-image php-library poster-editor
Last synced: 07 Mar 2026
https://github.com/huangrunhua/livetextwithimage
WWDC22: Enabling Live Text interactions with images in SwiftUI
image-processing image-text live-text swift swiftui swiftui-demo swiftui-example wwdc wwdc22
Last synced: 12 Oct 2025
https://github.com/zabir-nabil/imagebert-keras
Keras implementation of ImageBERT from Microsoft
Last synced: 26 Jul 2025
https://github.com/reshalfahsi/image-captioning-mobilenet-llama3
Image Captioning With MobileNet-LLaMA 3
cnn flickr8k-dataset grouped-query-attention image-captioning image-text kv-cache llama3 mobilenetv3 nlp pytorch pytorch-lightning rms-norm rotary-position-embedding transformer
Last synced: 12 Apr 2025
https://github.com/leeyunjai/image2text
Caption generator using lavis and argostranslate
blip2 caption caption-generation caption-generator captioning-images captions image-analysis image-text img2txt
Last synced: 03 Jul 2025
https://github.com/dngo-io/cover-creator
Write text on images with PHP
image-manipulation image-processing image-text php textview
Last synced: 14 Jan 2026
https://github.com/dinhanhx/visualroberta
The first public Vietnamese visual linguistic foundation model(s)
image-captioning image-text python python-3 python3 vietnamese-nlp visual-linguistic visual-question-answering
Last synced: 13 May 2025
https://github.com/jianzhnie/multimodaltransformers
lmmtoolkit is a toolkit for Multi-Modal Learning
image-text multi-modal-learning text-image text-to-video
Last synced: 15 Sep 2025
https://github.com/dvlab-research/tagclip
clip image-text segmentation zero-shot
Last synced: 03 Jul 2025
https://github.com/dinhanhx/vl-datasets
Some Python scripts to load Vietnamese visual linguistic data
image-captioning image-text python python-3 python3 vietnamese vietnamese-nlp visual-linguistic visual-question-answering
Last synced: 23 Mar 2025
https://github.com/formulae-org/package-graphic-raster-js
Raster graphics package for Fōrmulæ, in JavaScript
formulae graphic-primitives graphics graphics-programming image-colors image-coordinates image-text image-transformations javascript raster-graphics rotating stroke-imaging turtle-graphics xor-mode
Last synced: 02 Apr 2025