Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
Projects in Awesome Lists tagged with llava
A curated list of projects in awesome lists tagged with llava.
https://github.com/haotian-liu/llava
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
chatbot chatgpt foundation-models gpt-4 instruction-tuning llama llama-2 llama2 llava multi-modality multimodal vision-language-model visual-language-learning
Last synced: 16 Dec 2024
https://github.com/haotian-liu/LLaVA
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
chatbot chatgpt foundation-models gpt-4 instruction-tuning llama llama-2 llama2 llava multi-modality multimodal vision-language-model visual-language-learning
Last synced: 25 Oct 2024
https://github.com/sgl-project/sglang
SGLang is a fast serving framework for large language models and vision language models.
cuda inference llama llama2 llama3 llama3-1 llava llm llm-serving moe pytorch transformer vlm
Last synced: 16 Dec 2024
https://github.com/fanghua-yu/supir
SUPIR aims at developing Practical Algorithms for Photo-Realistic Image Restoration In the Wild. Our new online demo is also released at suppixel.ai.
deep-learning diffusion-models llava pytorch pytorch-lightning restoration sdxl stable-diffusion super-resolution
Last synced: 17 Dec 2024
https://github.com/Fanghua-Yu/SUPIR
SUPIR aims at developing Practical Algorithms for Photo-Realistic Image Restoration In the Wild. Our new online demo is also released at suppixel.ai.
deep-learning diffusion-models llava pytorch pytorch-lightning restoration sdxl stable-diffusion super-resolution
Last synced: 30 Oct 2024
https://github.com/modelscope/ms-swift
Use PEFT or Full-parameter to finetune 350+ LLMs or 100+ MLLMs. (LLM: Qwen2.5, Llama3.2, GLM4, Internlm2.5, Yi1.5, Mistral, Baichuan2, DeepSeek, Gemma2, ...; MLLM: Qwen2-VL, Qwen2-Audio, Llama3.2-Vision, Llava, InternVL2, MiniCPM-V-2.6, GLM4v, Xcomposer2.5, Yi-VL, DeepSeek-VL, Phi3.5-Vision, ...)
agent deploy dpo internvl liger llama llama3 llava llm lora megatron minicpm-v modelscope multimodal peft pre-training qwen2 qwen2-vl reflection sft
Last synced: 17 Dec 2024
https://github.com/internlm/xtuner
An efficient, flexible and full-featured toolkit for fine-tuning LLM (InternLM2, Llama3, Phi3, Qwen, Mistral, ...)
agent baichuan chatbot chatglm2 chatglm3 conversational-ai internlm large-language-models llama2 llama3 llava llm llm-training mixtral msagent peft phi3 qwen supervised-finetuning
Last synced: 16 Dec 2024
https://github.com/InternLM/xtuner
An efficient, flexible and full-featured toolkit for fine-tuning LLM (InternLM2, Llama3, Phi3, Qwen, Mistral, ...)
agent baichuan chatbot chatglm2 chatglm3 conversational-ai internlm large-language-models llama2 llama3 llava llm llm-training mixtral msagent peft phi3 qwen supervised-finetuning
Last synced: 28 Oct 2024
https://github.com/modelscope/data-juicer
A one-stop data processing system to make data higher-quality, juicier, and more digestible for (multimodal) LLMs! 🍎 🍋 🌽 ➡️ ➡️🍸 🍹 🍷
chinese data-analysis data-science data-visualization dataset gpt gpt-4 instruction-tuning large-language-models llama llava llm llms multi-modal nlp opendata pre-training pytorch sora streamlit
Last synced: 18 Dec 2024
https://github.com/scisharp/llamasharp
A C#/.NET library to run LLM (🦙LLaMA/LLaVA) on your local device efficiently.
chatbot gpt llama llama-cpp llama2 llama3 llamacpp llava llm multi-modal semantic-kernel
Last synced: 17 Dec 2024
https://github.com/SciSharp/LLamaSharp
A C#/.NET library to run LLM (🦙LLaMA/LLaVA) on your local device efficiently.
chatbot gpt llama llama-cpp llama2 llama3 llamacpp llava llm multi-modal semantic-kernel
Last synced: 28 Oct 2024
https://github.com/open-compass/vlmevalkit
Open-source evaluation toolkit of large vision-language models (LVLMs), supporting ~100 VLMs and 40+ benchmarks
chatgpt claude clip computer-vision evaluation gemini gpt gpt-4v gpt4 large-language-models llava llm multi-modal openai openai-api pytorch qwen vit vqa
Last synced: 21 Dec 2024
https://github.com/mbzuai-oryx/video-chatgpt
[ACL 2024 🔥] Video-ChatGPT is a video conversation model capable of generating meaningful conversation about videos. It combines the capabilities of LLMs with a pretrained visual encoder adapted for spatiotemporal video representation. We also introduce a rigorous 'Quantitative Evaluation Benchmarking' for video-based conversational models.
chatbot clip gpt-4 llama llava mulit-modal vicuna video-chatboat video-conversation vision-language vision-language-pretraining
Last synced: 19 Dec 2024
https://github.com/open-compass/VLMEvalKit
Open-source evaluation toolkit of large vision-language models (LVLMs), supporting ~100 VLMs and 40+ benchmarks
chatgpt claude clip computer-vision evaluation gemini gpt gpt-4v gpt4 large-language-models llava llm multi-modal openai openai-api pytorch qwen vit vqa
Last synced: 28 Nov 2024
https://github.com/mbzuai-oryx/Video-ChatGPT
[ACL 2024 🔥] Video-ChatGPT is a video conversation model capable of generating meaningful conversation about videos. It combines the capabilities of LLMs with a pretrained visual encoder adapted for spatiotemporal video representation. We also introduce a rigorous 'Quantitative Evaluation Benchmarking' for video-based conversational models.
chatbot clip gpt-4 llama llava mulit-modal vicuna video-chatboat video-conversation vision-language vision-language-pretraining
Last synced: 24 Oct 2024
https://github.com/unum-cloud/uform
Pocket-Sized Multimodal AI for content understanding and generation across multilingual texts, images, and 🔜 video, up to 5x faster than OpenAI CLIP and LLaVA 🖼️ & 🖋️
bert clip clustering contrastive-learning cross-attention huggingface-transformers image-search language-vision llava multi-lingual multimodal neural-network openai openclip pretrained-models pytorch representation-learning semantic-search transformer vector-search
Last synced: 18 Dec 2024
https://github.com/mbzuai-oryx/llava-pp
🔥🔥 LLaVA++: Extending LLaVA with Phi-3 and LLaMA-3 (LLaVA LLaMA-3, LLaVA Phi-3)
conversation llama-3-llava llama-3-vision llama3 llama3-llava llama3-vision llava llava-llama3 llava-phi3 llm lmms phi-3-llava phi-3-vision phi3 phi3-llava phi3-vision vision-language
Last synced: 20 Dec 2024
https://github.com/mbzuai-oryx/LLaVA-pp
🔥🔥 LLaVA++: Extending LLaVA with Phi-3 and LLaMA-3 (LLaVA LLaMA-3, LLaVA Phi-3)
conversation llama-3-llava llama-3-vision llama3 llama3-llava llama3-vision llava llava-llama3 llava-phi3 llm lmms phi-3-llava phi-3-vision phi3 phi3-llava phi3-vision vision-language
Last synced: 08 Nov 2024
https://github.com/jhc13/taggui
Tag manager and captioner for image datasets
cogvlm florence-2 image-captioning image-tagging llava pyside6 stable-diffusion tag-manager
Last synced: 12 Dec 2024
https://github.com/psychip/machina
OpenCV+YOLO+LLAVA powered video surveillance system
camera llava ollama-api opencv python rtsp yolo
Last synced: 21 Dec 2024
https://github.com/TinyLLaVA/TinyLLaVA_Factory
A Framework of Small-scale Large Multimodal Models
large-multimodal-models llama llava nlp tinyllama transformers vision-language
Last synced: 13 Nov 2024
https://github.com/blaizzy/mlx-vlm
MLX-VLM is a package for inference and fine-tuning of Vision Language Models (VLMs) on your Mac using MLX.
apple-silicon florence2 idefics llava llm local-ai mlx molmo paligemma pixtral vision-framework vision-language-model vision-transformer
Last synced: 19 Dec 2024
https://github.com/nvlabs/eagle
EAGLE: Exploring The Design Space for Multimodal LLMs with Mixture of Encoders
demo eagle gpt4 huggingface large-language-models llama llama3 llava llm lmm lvlm mllm nvdia
Last synced: 21 Dec 2024
https://github.com/Blaizzy/mlx-vlm
MLX-VLM is a package for running Vision LLMs locally on your Mac using MLX.
apple-silicon florence2 idefics llava llm local-ai mlx molmo paligemma pixtral vision-framework vision-language-model vision-transformer
Last synced: 25 Nov 2024
https://github.com/NVlabs/EAGLE
EAGLE: Exploring The Design Space for Multimodal LLMs with Mixture of Encoders
demo eagle gpt4 huggingface large-language-models llama llama3 llava llm lmm lvlm mllm nvdia
Last synced: 26 Sep 2024
https://github.com/nrl-ai/llama-assistant
AI-powered assistant to help you with your daily tasks, powered by Llama 3.2. It can recognize your voice, process natural language, and perform various actions based on your commands: summarizing text, rephrasing sentences, answering questions, writing emails, and more.
llama llama-3-2 llama3 llava moondream owen personal-assistant private-gpt
Last synced: 15 Dec 2024
https://github.com/jakobdylanc/llmcord
A Discord LLM chat bot that supports any OpenAI compatible API (Ollama, LM Studio, vLLM, OpenRouter, xAI, Mistral, Groq and more)
bot chatbot chatgpt discord gpt gpt-4 gpt-4o grok groq llama llama3 llava llm lmstudio mistral ollama oobabooga openai vllm xai
Last synced: 15 Dec 2024
https://github.com/apocas/restai
RestAI is an AIaaS (AI as a Service) open-source platform. Built on top of LlamaIndex, Ollama and HF Pipelines. Supports any public LLM supported by LlamaIndex and any local LLM supported by Ollama. Precise embeddings usage and tuning.
embeddings fastapi langchain llama llamaindex llava llm ollama openai openaiapi python rag stable-diffusion transformers
Last synced: 14 Dec 2024
https://github.com/wisconsinaivision/vip-llava
[CVPR2024] ViP-LLaVA: Making Large Multimodal Models Understand Arbitrary Visual Prompts
chatbot clip cvpr2024 foundation-models gpt-4 gpt-4-vision llama llama2 llava multi-modal vision-language visual-prompting
Last synced: 15 Dec 2024
https://github.com/internlm/internevo
InternEvo is an open-source lightweight training framework that aims to support model pre-training without the need for extensive dependencies.
910b deepspeed-ulysses flash-attention gemma internlm internlm2 llama3 llava llm-framework llm-training multi-modal pipeline-parallelism pytorch ring-attention sequence-parallelism tensor-parallelism transformers-models zero3
Last synced: 14 Dec 2024
https://github.com/InternLM/InternEvo
InternEvo is an open-source lightweight training framework that aims to support model pre-training without the need for extensive dependencies.
910b deepspeed-ulysses flash-attention gemma internlm internlm2 llama3 llava llm-framework llm-training multi-modal pipeline-parallelism pytorch ring-attention sequence-parallelism tensor-parallelism transformers-models zero3
Last synced: 30 Oct 2024
https://github.com/vietanhdev/llama-assistant
AI-powered assistant to help you with your daily tasks, powered by Llama 3.2. It can recognize your voice, process natural language, and perform various actions based on your commands: summarizing text, rephrasing sentences, answering questions, writing emails, and more.
llama llama-3-2 llama3 llava moondream owen personal-assistant private-gpt
Last synced: 14 Oct 2024
https://github.com/developersdigest/ai-devices
AI Device Template Featuring Whisper, TTS, Groq, Llama3, OpenAI and more
function-calling gpt-4-vision groq langchain langsmith llama3 llava llm openai serper tts whisper
Last synced: 16 Dec 2024
https://github.com/jakobdylanc/llmcord.py
A Discord LLM chat bot that supports any OpenAI compatible API. Run a local model with ollama, oobabooga, Jan and more
ai bot chatbot chatgpt discord gpt gpt-4 gpt-4o groq litellm llama llama3 llava llm llmcord lmstudio mistral ollama oobabooga openai
Last synced: 10 Oct 2024
https://github.com/FuxiaoLiu/LRV-Instruction?tab=readme-ov-file
[ICLR'24] Mitigating Hallucination in Large Multi-Modal Models via Robust Instruction Tuning
chatgpt evaluation evaluation-metrics foundation-models gpt gpt-4 hallucination iclr iclr2024 llama llava multimodal object-detection prompt-engineering vicuna vision vision-and-language vqa
Last synced: 01 Nov 2024
https://github.com/gokayfem/comfyui_vlm_nodes
Custom ComfyUI nodes for Vision Language Models, Large Language Models, Image to Music, Text to Music, Consistent and Random Creative Prompt Generation
comfyui custom-nodes image-captioning img2sfx img2text joytag llava llm mllm nodes phi15 siglip vlm
Last synced: 17 Dec 2024
https://github.com/gokayfem/ComfyUI_VLM_nodes
Custom ComfyUI nodes for Vision Language Models, Large Language Models, Image to Music, Text to Music, Consistent and Random Creative Prompt Generation
comfyui custom-nodes image-captioning img2sfx img2text joytag llava llm mllm nodes phi15 siglip vlm
Last synced: 22 Nov 2024
https://github.com/mbzuai-oryx/videogpt-plus
Official Repository of paper VideoGPT+: Integrating Image and Video Encoders for Enhanced Video Understanding
chatbot clip dual-encoder gpt4 gpt4o image-encoder llama3 llava multimodal phi-3-mini vicuna video-chatbot video-conversation video-encoder vision-language vision-language-pretraining
Last synced: 20 Dec 2024
https://github.com/mbzuai-oryx/VideoGPT-plus
Official Repository of paper VideoGPT+: Integrating Image and Video Encoders for Enhanced Video Understanding
chatbot clip dual-encoder gpt4 gpt4o image-encoder llama3 llava multimodal phi-3-mini vicuna video-chatbot video-conversation video-encoder vision-language vision-language-pretraining
Last synced: 12 Dec 2024
https://github.com/paddlepaddle/paddlemix
Paddle Multimodal Integration and eXploration, supporting mainstream multi-modal tasks, including end-to-end large-scale multi-modal pretrain models and diffusion model toolbox. Equipped with high performance and flexibility.
aigc blip2 clip controlnet dit eva-clip image-to-text llava minigpt4 multimodal ppdiffusers qwen-vl sd-xl sora stable-diffusion stablevideodiffusion text-to-image text-to-video
Last synced: 20 Dec 2024
https://github.com/rlhf-v/rlaif-v
RLAIF-V: Aligning MLLMs through Open-Source AI Feedback for Super GPT-4V Trustworthiness
chatbot gpt-4v llava llava-next minicpm-v multimodal rlaif-v vision-language-learning
Last synced: 17 Sep 2024
https://github.com/RLHF-V/RLAIF-V
RLAIF-V: Aligning MLLMs through Open-Source AI Feedback for Super GPT-4V Trustworthiness
chatbot gpt-4v llava llava-next minicpm-v multimodal rlaif-v vision-language-learning
Last synced: 03 Nov 2024
https://github.com/gbaptista/ollama-ai
A Ruby gem for interacting with Ollama's API that allows you to run open source AI LLMs (Large Language Models) locally.
ai alpaca bakllava dolphin llama llama2 llava llm mistral mistral-ai mixtral nano-bots ollama ollama-api openorca vicuna
Last synced: 14 Dec 2024
https://github.com/sshh12/multi_token
Embed arbitrary modalities (images, audio, documents, etc) into large language models.
large-context large-language-models large-multimodal-models llava llm multi-modality multimodal vision-language-model
Last synced: 17 Nov 2024
https://github.com/trzy/llava-cpp-server
LLaVA server (llama.cpp).
llama llama2 llava llm multimodal vision-transformer
Last synced: 17 Dec 2024
https://github.com/zjysteven/lmms-finetune
A minimal codebase for finetuning large multimodal models, supporting llava-1.5/1.6, llava-interleave, llava-next-video, llava-onevision, qwen-vl, qwen2-vl, phi3-v etc.
finetuning foundation-models instruction-tuning large-language-model large-multimodal-models llava llava-next multimodal multimodal-large-language-models qwen-vl vision-language visual-instruction-tuning
Last synced: 21 Dec 2024
https://github.com/tianyi-lab/hallusionbench
[CVPR'24] HallusionBench: You See What You Think? Or You Think What You See? An Image-Context Reasoning Benchmark Challenging for GPT-4V(ision), LLaVA-1.5, and Other Multi-modality Models
benchmark benchmarks gpt-4 gpt-4v hallucination large-language-models large-vision-language-models llava llm lmm vlms
Last synced: 09 Oct 2024
https://github.com/aimagelab/llava-more
LLaVA-MORE: Enhancing Visual Instruction Tuning with LLaMA 3.1
llama3 llama3-1 llama3-vision llava llava-llama3 llms multimodal-llms vision-and-language
Last synced: 16 Dec 2024
https://github.com/shikiw/modality-integration-rate
The official code of the paper "Deciphering Cross-Modal Alignment in Large Vision-Language Models with Modality Integration Rate".
chatbot gpt-4o large-multimodal-models llama llava multimodal vision-language-learning vision-language-model
Last synced: 17 Dec 2024
https://github.com/thomas-yanxin/karmavlm
🧘🏻‍♂️ KarmaVLM (相生): A family of highly efficient and powerful visual language models.
llama2 llava multimodel qwen2 vision-language-model visual-language-learning vlm
Last synced: 06 Dec 2024
https://github.com/niutrans/vision-llm-alignment
This repository contains the code for SFT, RLHF, and DPO, designed for vision-based LLMs, including the LLaVA models and the LLaMA-3.2-vision models.
alignment dpo llama3-vision llava llm mllm multi-model ppo reward rlhf sft vision
Last synced: 18 Nov 2024
https://github.com/fly-apps/ollama-open-webui
Self-host a ChatGPT-style web interface for Ollama 🦙
ai gemma gpu llama3 llava mistral mixtral ollama ollama-webui
Last synced: 17 Dec 2024
https://github.com/ashleykleynhans/llava-docker
Docker image for LLaVA: Large Language and Vision Assistant
ai chatbot chatgpt docker docker-image foundation-models gpt-4 instruction-tuning llama llama-2 llama2 llava llm multimodal runpod vision-language-model visual-language-learning
Last synced: 25 Nov 2024
https://github.com/buaadreamer/chinese-llava-med
Chinese medical multimodal large model: Large Chinese Language-and-Vision Assistant for BioMedicine
ai chinese gpt4v huggingface-datasets llama-factory llava medical minigpt4 mllm multimodal qwen1-5 transformers
Last synced: 06 Dec 2024
https://github.com/kwaivgi/uniaa
Unified Multi-modal IAA Baseline and Benchmark
benchmark dataset image-aesthetic-assessment llava mllm
Last synced: 09 Nov 2024
https://github.com/herrera-luis/vision-core-ai
Demo python script app to interact with llama.cpp server using whisper API, microphone and webcam devices.
bakllava llamacpp llava whisper-ai
Last synced: 02 Dec 2024
https://github.com/wisconsinaivision/yollava
🌋👵🏻 Yo'LLaVA: Your Personalized Language and Vision Assistant
llava llm llms lmm lmms multi-modal-models personalization personalized
Last synced: 10 Nov 2024
https://github.com/mapluisch/llava-cli-with-multiple-images
LLaVA inference with multiple images at once for cross-image analysis.
image-concatenation image-processing inference llama2 llama2-13b llava lmm lmms pillow python python3 pytorch visual-question-answering vqa
Last synced: 13 Nov 2024
https://github.com/paradoxzw/llava-uhd-better
A bug-free and improved implementation of LLaVA-UHD, based on the code from the official repo
large-language-models large-multimodal-models llava multimodal
Last synced: 19 Dec 2024
https://github.com/uminosachi/open-llm-webui
This repository contains a web application designed to execute relatively compact, locally-operated Large Language Models (LLMs).
chatbot ggml gradio huggingface language-model llama llama2 llama3 llava llava-llama3 llm nlp transformers
Last synced: 10 Oct 2024
https://github.com/buaadreamer/mllm-finetuning-demo
Demo of Finetuning Multimodal LLM with LLaMA-Factory
finetune-llm huggingface-datasets llama-factory llava lora mllm paligemma pretraining supervised-finetuning transformers yi-vl
Last synced: 06 Dec 2024
https://github.com/blib-la/captain
Your all-in-one platform to build and use AI apps effortlessly on your own computer.
ai artificial-intelligence blip booru-tags captioning-images clip dataset-generation datasets generative-ai human-in-the-loop llava llm lora model-training sdk sdxl stable-diffusion toolkit
Last synced: 06 Dec 2024
https://github.com/buaadreamer/spn4cir
[ACM MM 2024] Improving Composed Image Retrieval via Contrastive Learning with Scaling Positives and Negatives
acmmm2024 blip blip2 clip composed-image-retrieval cross-modal-retrieval data-generation image-retrieval llama llava memory-bank multi-modal-retrieval multimodal-learning transformer
Last synced: 06 Dec 2024
https://github.com/fmxexpress/ai-vision-chat
Chat with large language models about the contents of an image via this native desktop client for Windows, macOS, and Linux.
ai delphi delphi-sample desktop-app linux llava llm macos replicate-api vicuna vision-language-model windows
Last synced: 15 Oct 2024
https://github.com/ashleykleynhans/supir-docker
Docker image for SUPIR (Scaling Up to Excellence: Practicing Model Scaling for Photo-Realistic Image Restoration In the Wild)
deep-learning diffusion-models docker docker-image llava pytorch pytorch-lightning restoration runpod sdxl stable-diffusion super-resolution upscaling
Last synced: 05 Nov 2024
https://github.com/zjysteven/vlm-visualizer
Visualizing the attention of vision-language models
attention attention-mechanism llava multi-modal vision-language vision-language-model
Last synced: 02 Nov 2024
https://github.com/zhudotexe/kani-vision
Kani extension for supporting vision-language models (VLMs). Comes with model-agnostic support for GPT-Vision and LLaVA.
extension gpt-vision kani large-language-models llava multimodal-llm vision-language-model
Last synced: 27 Oct 2024
https://github.com/xyproto/describeimage
Describe images by using LLMs
command-line-utility describe-image large-language-model llava llm llm-manager ollama ollama-client
Last synced: 09 Nov 2024
https://github.com/iamaziz/chat_with_images
Streamlit app to chat with images using Multi-modal LLMs.
llava llms multimodal-llm streamlit
Last synced: 09 Nov 2024
https://github.com/leo5imon/magi
Meme search engine for the real shitposters, powered by AI & Llava 13b.
ai image-classification llava memes replicate search-engine
Last synced: 18 Dec 2024
https://github.com/piotrbania/ai_image_search
AI assisted image search, checks your images on hard drive and tries to find whether they match the thing you are looking for (this is OFFLINE processing, no data leaves your computer)
ai image-processing imagerecognition llava model nexa
Last synced: 20 Nov 2024
https://github.com/saharmor/monsterbooth
Turn yourself into a Halloween-styled character and get an original roast with the power of AI.
generative-ai gpt4 llava multimodal
Last synced: 13 Nov 2024
https://github.com/cloudmedicio/describe-media-library
Describe images in the WordPress Media Library using a local large language model. Generates titles, captions, descriptions, and alt tags.
accessibility alt-text alt-text-generator apple-silicon computer-vision gpt4 intel-mac linux llama2 llamacpp llava localwp media-library ollama openai plugin search-engine-optimization seo wordpress wp-cli
Last synced: 05 Nov 2024
https://github.com/autodistill/autodistill-llava
LLaVA base model for use with Autodistill.
autodistill computer-vision llava multimodal-llm
Last synced: 08 Nov 2024
https://github.com/ergonomech/ollama-model-interaction
A simple Gradio-based app for interacting with Ollama models, supporting image analysis, text completion, and model pulling
gradio llava ollama ollama-api vision-api
Last synced: 03 Dec 2024
https://github.com/agarzon/ollama-image-caption
caption flux llava ollama stable-diffusion
Last synced: 10 Oct 2024
https://github.com/muhfaridansutariya/llava-1.5-liveness-7b
Resigned Yann-LeCun
Last synced: 07 Dec 2024
https://github.com/ugurkantech/archnetai
ArchNetAI is a Python library that leverages the Ollama API for generating AI-powered content.
ai json-schema llama llava ollama ollama-api phi3 python
Last synced: 14 Oct 2024
https://github.com/maheshj01/image-captioning-using-llava-and-llama3
Image Caption Generator using llava and llama3 through the ollama library
Last synced: 28 Nov 2024
https://github.com/gurpreetkaurjethra/multimodal-ai-app-using-llava-7b
Multimodal AI App using Llava 7B and Gradio
ai generative-ai gradio large-language-models llava llavacpp llm multimodal voice-assistant whisper
Last synced: 22 Nov 2024
https://github.com/ibnaleem/mikael
The open-source repository for Mikael, a Discord chatbot trained on Mistral and LLaVA language models
artificial-intelligence chatbot discord-bot discord-py gpt-4 large-language-models llava mistral mistral-7b mistral-ai multimodal multimodal-deep-learning
Last synced: 15 Dec 2024
https://github.com/robert-mcdermott/LLM-Image-Classification
Image Classification Testing with LLMs
image-classification llava llm multimodal
Last synced: 20 Oct 2024
https://github.com/ravi-teja-konda/tunedllavadelights
Explore the rich flavors of Indian desserts with TunedLlavaDelights. Utilizing Llava fine-tuning, our project unveils detailed nutritional profiles, taste notes, and optimal consumption times for beloved sweets. Dive into a fusion of AI innovation and culinary tradition.
chatgpt dalle2 dessert finetuning gpt4 gpt4v llama2 llava multi-modality multimodal nutrition nutrition-information stable-diffusion tranformers vision-language-learning vision-language-model
Last synced: 15 Nov 2024
https://github.com/rakeshkanneeswaran/luminaide_an_ai_integrated_ide
A personalized IDE developed from scratch, featuring AI integration with models like LLaMA, Mistral, and LLava. Includes a real-time terminal, intuitive file explorer, and Ace Editor to enhance productivity and streamline coding workflows.
express ide llama llama2 llava mistral react websocket
Last synced: 03 Dec 2024
https://github.com/fiqryq/caption-llava
A simple yet effective CLI application built on Node.js that uses Ollama Vision LLava to auto-generate captions based on your image.
Last synced: 14 Dec 2024
https://github.com/notyusheng/multimodal-large-language-model
Localized Multimodal Large Language Model (MLLM) integrated with Streamlit and Ollama for text and image processing tasks.
docker large-language-models llava llm multimodal multimodal-large-language-models ollama pretrained sphinx-doc
Last synced: 09 Nov 2024
https://github.com/robert-mcdermott/llm-image-classification
Image Classification Testing with LLMs
image-classification llava llm multimodal
Last synced: 30 Nov 2024
https://github.com/dsacms/mural-ollama
Multimodal LLM Mural Assistant with Ollama
ai llava llm llms multimodal mural ollama open-source pyqt6 python
Last synced: 10 Oct 2024
https://github.com/yuanze-lin/olympus
The official code for "Olympus: A Universal Task Router for Computer Vision Tasks"
chatbot chatgpt deeplearning foundation-models instruction-tuning llava llms mllms multi-modality multimodal pytorch vision-language-model
Last synced: 14 Dec 2024
https://github.com/hasnain3142/ai-adsense
AI AdSense is a cutting-edge application designed to detect human presence, analyze their features, and generate personalized advertisements using advanced AI technologies.
Last synced: 17 Nov 2024
https://github.com/codecaine-zz/ollama_llava_image_analysis
ollama llava image detection
ajax html image-classification image-recognition javascript llava llm ollama ollama-client
Last synced: 25 Nov 2024
https://github.com/nicholasgriffintn/llm-assistant
This is a side project to build an assistant using LLMs.
cloudflare cloudflare-ai fastapi langchain llava llm mistral mistral-nano ollama
Last synced: 02 Dec 2024
https://github.com/mapluisch/llava-websocket-server
Python-based WebSocket for CLI LLaVA inference.
inference llama llama2 llava llm llm-inference python websocket websockets
Last synced: 13 Nov 2024
https://github.com/runtime-error786/multimodaldocprocessor
huggingface huggingface-transformers langcahin llava multimodal
Last synced: 05 Dec 2024