An open API service indexing awesome lists of open source software.

Projects in Awesome Lists tagged with embeddings

A curated list of projects in awesome lists tagged with embeddings.

https://github.com/supabase/supabase

The open source Firebase alternative. Supabase gives you a dedicated Postgres database to build your web, mobile, and AI applications.

ai alternative auth database deno embeddings example firebase nextjs oauth2 pgvector postgis postgres postgresql postgrest realtime supabase vectors websockets

Last synced: 05 Feb 2026

https://github.com/thedotmack/claude-mem

A Claude Code plugin that automatically captures everything Claude does during your coding sessions, compresses it with AI (using Claude's agent-sdk), and injects relevant context back into future sessions.

ai ai-agents ai-memory anthropic artificial-intelligence chromadb claude claude-agent-sdk claude-agents claude-code claude-code-plugin claude-skills embeddings long-term-memory mem0 memory-engine openmemory rag sqlite supermemory

Last synced: 08 Jun 2026

https://github.com/chroma-core/chroma

The AI-native open-source embedding database

document-retrieval embeddings llms

Last synced: 05 May 2026

https://github.com/tencent/weknora

LLM-powered framework for deep document understanding, semantic retrieval, and context-aware answers using RAG paradigm.

agent agentic ai chatbot chatbots embeddings evaluation generative-ai golang knowledge-base llm multi-tenant multimodel ollama openai question-answering rag reranking semantic-search vector-search

Last synced: 15 Apr 2026

https://github.com/embedding/chinese-word-vectors

100+ Chinese Word Vectors (over a hundred sets of pre-trained Chinese word vectors)

chinese chinese-word-segmentation embedding embeddings vectors-trained word-embeddings

Last synced: 10 Apr 2025

https://github.com/Embedding/Chinese-Word-Vectors

100+ Chinese Word Vectors (over a hundred sets of pre-trained Chinese word vectors)

chinese chinese-word-segmentation embedding embeddings vectors-trained word-embeddings

Last synced: 26 Mar 2025

https://github.com/langchain4j/langchain4j

LangChain4j is an idiomatic, open-source Java library for building LLM-powered applications on the JVM. It offers a unified API over popular LLM providers and vector stores, and makes implementing tool calling (including MCP support), agents and RAG easy. It integrates seamlessly with enterprise Java frameworks like Quarkus and Spring Boot.

anthropic chatgpt chroma embeddings gemini gpt huggingface java langchain llama llm llms milvus ollama onnx openai openai-api pgvector pinecone vector-database

Last synced: 30 Apr 2026

https://github.com/h2oai/h2ogpt

Private chat with a local GPT over documents, images, video, etc. 100% private, Apache 2.0. Supports oLLaMa, Mixtral, llama.cpp, and more. Demo: https://gpt.h2o.ai/ https://gpt-docs.h2o.ai/

ai chatgpt embeddings generative gpt gpt4all llama2 llm mixtral pdf private privategpt vectorstore

Last synced: 13 May 2025

https://github.com/kevinmusgrave/pytorch-metric-learning

The easiest way to use deep metric learning in your application. Modular, flexible, and extensible. Written in PyTorch.

computer-vision contrastive-learning deep-learning deep-metric-learning embeddings image-retrieval machine-learning metric-learning pytorch self-supervised-learning

Last synced: 12 May 2025

https://github.com/KevinMusgrave/pytorch-metric-learning

The easiest way to use deep metric learning in your application. Modular, flexible, and extensible. Written in PyTorch.

computer-vision contrastive-learning deep-learning deep-metric-learning embeddings image-retrieval machine-learning metric-learning pytorch self-supervised-learning

Last synced: 09 Apr 2025

https://github.com/shibing624/text2vec

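text2vec, text to vector. A text vector representation tool that converts text into vector matrices. It implements text representation and text similarity models such as Word2Vec, RankBM25, Sentence-BERT, and CoSENT, and works out of the box.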

embeddings nlp sentence-embeddings similarity text-similarity text2vec word2vec

Last synced: 12 May 2025

https://github.com/marker-inc-korea/autorag

AutoRAG: An Open-Source Framework for Retrieval-Augmented Generation (RAG) Evaluation & Optimization with AutoML-Style Automation

analysis automl benchmarking document-parser embeddings evaluation llm llm-evaluation llm-ops open-source ops optimization pipeline python qa rag rag-evaluation retrieval-augmented-generation

Last synced: 03 Apr 2026

https://github.com/brianpetro/obsidian-smart-connections

Chat with your notes & see links to related content with AI embeddings. Use local models or 100+ via APIs like Claude, Gemini, ChatGPT & Llama 3

chatgpt claude embeddings gemini llama3 obsidian obsidian-plugin

Last synced: 14 May 2025

https://github.com/tensorflow/hub

A library for transfer learning by reusing parts of TensorFlow models.

embeddings image-classification machine-learning ml python tensorflow transfer-learning

Last synced: 12 May 2025

https://github.com/RyanCodrai/turbovec

A vector index built on TurboQuant, written in Rust with Python bindings

ann avx512 embedding embeddings faiss nearest-neighbor neon python quant quantization rag rust simd turboquant vector-search

Last synced: 02 Jun 2026

https://github.com/huggingface/text-embeddings-inference

A blazing fast inference solution for text embeddings models

ai embeddings huggingface llm ml

Last synced: 25 Feb 2026

https://github.com/hegelai/prompttools

Open-source tools for prompt testing and experimentation, with support for both LLMs (e.g. OpenAI, LLaMA) and vector databases (e.g. Chroma, Weaviate, LanceDB).

deep-learning developer-tools embeddings large-language-models llms machine-learning prompt-engineering python vector-search

Last synced: 14 May 2025

https://github.com/eugeneyan/ml-surveys

📋 Survey papers summarizing advances in deep learning, NLP, CV, graphs, reinforcement learning, recommendations, etc.

computer-vision deep-learning embeddings machine-learning nlp recommender-system reinforcement-learning survey

Last synced: 23 Mar 2025

https://github.com/samuraigpt/embedai

An app to interact privately with your documents using the power of GPT, 100% privately, no data leaks

chatbot chatgpt embedai embeddings generative gpt gpt4 gpt4all langchain models openai privategpt vectorstore whisper

Last synced: 05 Oct 2025

https://github.com/SamurAIGPT/EmbedAI

An app to interact privately with your documents using the power of GPT, 100% privately, no data leaks

chatbot chatgpt embedai embeddings generative gpt gpt4 gpt4all langchain models openai privategpt vectorstore whisper

Last synced: 14 Mar 2025

https://github.com/iterative/datachain

ETL, Analytics, Versioning for Unstructured Data

ai cv data-analytics data-wrangling embeddings llm llm-eval machine-learning mlops multimodal

Last synced: 18 Jun 2025

https://github.com/crmne/ruby_llm

Stop juggling AI SDKs! RubyLLM offers one delightful Ruby interface for OpenAI, Anthropic, Gemini, Bedrock, OpenRouter, DeepSeek, Ollama & compatible APIs. Chat, Vision, Audio, PDF, Images, Embeddings, Tools, Streaming & Rails integration.

ai anthropic chatgpt claude dall-e deepseek embeddings gemini image-generation llm openai rails ruby

Last synced: 02 Apr 2026

https://github.com/google/generative-ai-docs

Documentation for Google's Gen AI site - including the Gemini API and Gemma

ai chatbot documentation embeddings gemini gemini-api gemma llm machine-learning

Last synced: 14 May 2025

https://github.com/agentset-ai/agentset

The open-source RAG platform: built-in citations, deep research, 22+ file formats, partitions, MCP server, and more.

agentic-rag ai ai-agents ai-sdk chatbots embeddings genai llms memory memory-management rag vercel-ai-sdk

Last synced: 10 Mar 2026

https://github.com/featureform/featureform

The Virtual Feature Store. Turn your existing data infrastructure into a feature store.

data-quality data-science embeddings embeddings-similarity feature-engineering feature-store hacktoberfest machine-learning ml mlops python vector-database

Last synced: 14 Dec 2025

https://github.com/lilianweng/stock-rnn

Predict stock market prices using RNN model with multilayer LSTM cells + optional multi-stock embeddings.

embeddings lstm rnn-tensorflow stock-price-prediction

Last synced: 15 May 2025

https://github.com/kav-k/gptdiscord

A robust, all-in-one GPT interface for Discord. ChatGPT-style conversations, image generation, AI-moderation, custom indexes/knowledgebase, YouTube summarizer, and more!

artificial-intelligence asyncio chatbot code-interpreter collaborate dalle2 digitalocean discord embeddings extractive-question-answering github gpt3 hacktoberfest help-wanted moderator-bot multi-modal openai openai-api pinecone python

Last synced: 07 Feb 2026

https://github.com/Kav-K/GPTDiscord

A robust, all-in-one GPT interface for Discord. ChatGPT-style conversations, image generation, AI-moderation, custom indexes/knowledgebase, YouTube summarizer, and more!

artificial-intelligence asyncio chatbot code-interpreter collaborate dalle2 digitalocean discord embeddings extractive-question-answering github gpt3 hacktoberfest help-wanted moderator-bot multi-modal openai openai-api pinecone python

Last synced: 24 Mar 2025

https://github.com/yongzhuo/keras-textclassification

Chinese text classification with Keras: long-text, short-sentence, and multi-label classification, plus two-sentence similarity. Includes base classes for building character, word, and sentence embedding layers (embeddings) and network layers (graph), with FastText, TextCNN, CharCNN, TextRNN, RCNN, DCNN, DPCNN, VDCNN, CRNN, Bert, Xlnet, Albert, Attention, DeepMoji, HAN, CapsuleNet (capsule network), Transformer-encode, Seq2seq, SWEM, LEAM, TextGCN

albert bert capsule charcnn crnn dcnn dpcnn embeddings fasttext han keras keras-textclassification leam nlp rcnn text-classification textcnn transformer vdcnn xlnet

Last synced: 15 May 2025

https://github.com/mintplex-labs/vector-admin

The universal tool suite for vector database management. Manage Pinecone, Chroma, Qdrant, Weaviate and more vector databases with ease.

ai ai-agents aitools chroma database-management document-retrieval embeddings flowise langchain langchain-js llms pinecone qdrant vector-data-management vector-database vector-database-embedding vector-search vectordatabase vectorspace weaviate

Last synced: 31 Oct 2025

https://github.com/Mintplex-Labs/vector-admin

The universal tool suite for vector database management. Manage Pinecone, Chroma, Qdrant, Weaviate and more vector databases with ease.

ai ai-agents aitools chroma database-management document-retrieval embeddings flowise langchain langchain-js llms pinecone qdrant vector-data-management vector-database vector-database-embedding vector-search vectordatabase vectorspace weaviate

Last synced: 24 Mar 2025

https://github.com/nomic-ai/nomic

Interact, analyze and structure massive text, image, embedding, audio and video datasets

clustering duplicate-detection embeddings python text topic-modeling unstructured-data

Last synced: 13 May 2025

https://github.com/qdrant/fastembed

Fast, accurate, lightweight Python library for generating state-of-the-art embeddings

embeddings openai rag retrieval retrieval-augmented-generation vector-search

Last synced: 26 Mar 2025

https://github.com/superlinked/superlinked

Superlinked is a Python framework for AI Engineers building high-performance search & recommendation applications that combine structured and unstructured data.

data-pipeline deep-learning embeddings etl information-retrieval llm ml mlops natural-language-processing nlp python retrieval retrieval-augmented-generation semantic-search vector-database vector-search vectorization

Last synced: 16 Jan 2026

https://github.com/postgresml/korvus

Korvus is a search SDK that unifies the entire RAG pipeline in a single database query. Built on top of Postgres with bindings for Python, JavaScript, Rust and C.

ai embeddings javascript llm ml python rag search sql

Last synced: 14 May 2025

https://github.com/giancarloerra/socraticode

Enterprise-grade (40m+ LOC) codebase intelligence, zero-setup, private & local Plugin/Skill or MCP: hybrid semantic search, polyglot dependency graphs, symbol-level impact analysis & call-flow, interactive HTML viewer, cross-project & branch-aware search, DB/API/infra knowledge. 61% fewer tokens, 84% fewer calls, 37x faster. Cloud in private beta.

ai ai-assistant ast claude claude-code code-graph codebase-intelligence context-engine docker embeddings gemini gemini-cli-extension mcp openai qdrant semantic semantic-search vector-database vector-embeddings vector-search

Last synced: 04 May 2026

https://github.com/eliorc/node2vec

Implementation of the node2vec algorithm.

deep-learning embeddings machine-learning-algorithms

Last synced: 14 May 2025

https://github.com/natasha/natasha

Solves basic Russian NLP tasks, API for lower level Natasha projects

embeddings morphology ner nlp python russian sentence-segmentation syntax tokenizer visualization

Last synced: 13 May 2025

https://github.com/jiran214/gpt-vup

GPT-vup Bilibili | Douyin | AI | virtual streamer

bilibili chatgpt douyin embeddings vtuber

Last synced: 08 Apr 2025

https://github.com/jiran214/GPT-vup

GPT-vup Bilibili | Douyin | AI | virtual streamer

bilibili chatgpt douyin embeddings vtuber

Last synced: 04 Apr 2025

https://github.com/bheinzerling/bpemb

Pre-trained subword embeddings in 275 languages, based on Byte-Pair Encoding (BPE)

embeddings multilingual natural-language-processing nlp subword-embeddings

Last synced: 13 Apr 2025

https://github.com/MilaNLProc/contextualized-topic-models

A python package to run contextualized topic modeling. CTMs combine contextualized embeddings (e.g., BERT) with topic models to get coherent topics. Published at EACL and ACL 2021.

bert embeddings multilingual-models multilingual-topic-models neural-topic-models nlp nlp-library nlp-machine-learning text-as-data topic-coherence topic-modeling transformer

Last synced: 03 Apr 2025

https://github.com/llphant/llphant

LLPhant - A comprehensive PHP Generative AI Framework using OpenAI GPT 4. Inspired by Langchain

agent autophp embeddings genai generative-ai gpt4 langchain laravel llamaindex openai php symfony vector-database

Last synced: 07 Mar 2026

https://github.com/skalskip/vlms-zero-to-hero

This series will take you on a journey from the fundamentals of NLP and Computer Vision to the cutting edge of Vision-Language Models.

bert-model clip computer-vision embeddings gpt gpt-2 lora natural-language-processing seq2seq vision-language-model word2vec

Last synced: 06 Oct 2025

https://github.com/aws-samples/amazon-bedrock-samples

This repository contains examples to help customers get started with the Amazon Bedrock service, covering all available foundation models.

amazon-bedrock amazon-titan bedrock embeddings generative-ai knowledge-base langchain rag

Last synced: 14 May 2025

https://github.com/veekaybee/what_are_embeddings

A deep dive into embeddings starting from fundamentals

embeddings machine-learning machine-learning-algorithms nlp-machine-learning

Last synced: 14 May 2025

https://github.com/Dicklesworthstone/swiss_army_llama

A FastAPI service for semantic text search using precomputed embeddings and advanced similarity measures, with built-in support for various file types through textract.

embedding-similarity embedding-vectors embeddings llama2 llamacpp semantic-search

Last synced: 09 Apr 2025

https://github.com/dicklesworthstone/swiss_army_llama

A FastAPI service for semantic text search using precomputed embeddings and advanced similarity measures, with built-in support for various file types through textract.

embedding-similarity embedding-vectors embeddings llama2 llamacpp semantic-search

Last synced: 15 May 2025

https://github.com/LLPhant/LLPhant

LLPhant - A comprehensive PHP Generative AI Framework using OpenAI GPT 4. Inspired by Langchain

agent autophp embeddings genai generative-ai gpt4 langchain laravel llamaindex openai php symfony vector-database

Last synced: 20 Sep 2025

https://github.com/wikipedia2vec/wikipedia2vec

A tool for learning vector representations of words and entities from Wikipedia

embeddings natural-language-processing nlp python text-classification wikipedia

Last synced: 07 Apr 2025

https://github.com/cosmosgl/graph

GPU-accelerated force graph layout and rendering

embeddings force graph network simulation visualization webgl

Last synced: 11 May 2025

https://github.com/epsilla-cloud/vectordb

Epsilla is a high performance Vector Database Management System. Try out hosted Epsilla at https://cloud.epsilla.com/

ai chatgpt data data-science database embeddings embeddings-similarity infrastructure llms machine-learning neural-network neural-search rag retrieval search-engine vector-database vector-search

Last synced: 15 May 2025

https://github.com/your-papa/obsidian-Smart2Brain

An Obsidian plugin to interact with your privacy-focused AI assistant, making your second brain even smarter!

ai chatgpt embeddings obsidian-md obsidian-plugin ollama rag

Last synced: 18 Jul 2025

https://github.com/Atome-FE/llama-node

Believe in AI democratization. Llama for Node.js, backed by llama-rs, llama.cpp and rwkv.cpp, that works locally on your laptop CPU. Supports llama/alpaca/gpt4all/vicuna/rwkv models.

ai embeddings gpt langchain large-language-models llama llama-node llama-rs llamacpp llm napi napi-rs nodejs rwkv

Last synced: 14 Apr 2025

https://github.com/atome-fe/llama-node

Believe in AI democratization. Llama for Node.js, backed by llama-rs, llama.cpp and rwkv.cpp, that works locally on your laptop CPU. Supports llama/alpaca/gpt4all/vicuna/rwkv models.

ai embeddings gpt langchain large-language-models llama llama-node llama-rs llamacpp llm napi napi-rs nodejs rwkv

Last synced: 30 Mar 2025

https://github.com/neumtry/neumai

Neum AI is a best-in-class framework to manage the creation and synchronization of vector embeddings at large scale.

ai chatgpt data data-engineering database embeddings etl llm llmops mlops ops pipeline python rag retrieval vector-database vectors

Last synced: 29 Oct 2025

https://github.com/your-papa/obsidian-smart2brain

An Obsidian plugin to interact with your privacy-focused AI assistant, making your second brain even smarter!

ai chatgpt embeddings obsidian-md obsidian-plugin ollama rag

Last synced: 15 May 2025

https://github.com/azure/azure-search-vector-samples

A repository of code samples for Vector search capabilities in Azure AI Search.

azure azurecognitivesearch embeddings vector vector-search

Last synced: 14 May 2025

https://github.com/NeumTry/NeumAI

Neum AI is a best-in-class framework to manage the creation and synchronization of vector embeddings at large scale.

ai chatgpt data data-engineering database embeddings etl llm llmops mlops ops pipeline python rag retrieval vector-database vectors

Last synced: 11 Apr 2025

https://github.com/curiosity-ai/catalyst

🚀 Catalyst is a C# Natural Language Processing library built for speed. Inspired by spaCy's design, it brings pre-trained models, out-of-the-box support for training word and document embeddings, and flexible entity recognition models.

ai artificial-intelligence csharp embeddings machine-learning natural-language-processing natural-language-understanding nlp

Last synced: 17 Jan 2026

https://github.com/Azure/azure-search-vector-samples

A repository of code samples for Vector search capabilities in Azure AI Search.

azure azurecognitivesearch embeddings vector vector-search

Last synced: 25 Mar 2025

https://github.com/hayabhay/frogbase

Transform audio-visual content into navigable knowledge.

embeddings package python search semantic-search speech-to-text streamlit ui

Last synced: 30 Sep 2025

https://github.com/lancedb/vectordb-recipes

High quality resources & applications for LLMs, multi-modal models and VectorDBs

agents ai deep-learning embeddings fine-tuning gpt gpt-4-vision langchain llama-index llms machine-learning multimodal openai rag vector-database

Last synced: 17 Oct 2025

https://github.com/maartengr/polyfuzz

Fuzzy string matching, grouping, and evaluation.

bert edit-distance embeddings levenshtein-distance string-matching tf-idf

Last synced: 04 Oct 2025

https://github.com/henomis/lingoose

🪿 LinGoose is a Go framework for building awesome AI/LLM applications.

ai chatgpt embeddings go golang index llm openai pinecone pipeline prompt vector

Last synced: 15 May 2025

https://github.com/anush008/fastembed-rs

Rust library for vector embeddings and reranking. Inspired by qdrant/fastembed.

embeddings fastembed rag reranker reranking retrieval retrieval-augmented-generation vector-search

Last synced: 19 Feb 2026

https://github.com/dgarnitz/vectorflow

VectorFlow is a high volume vector embedding pipeline that ingests raw data, transforms it into vectors and writes it to a vector DB of your choice.

ai data-engineering embeddings machine-learning nlp vectors

Last synced: 14 Dec 2025

https://aws-samples.github.io/amazon-bedrock-samples/

This repository contains examples to help customers get started with the Amazon Bedrock service, covering all available foundation models.

amazon-bedrock amazon-titan bedrock embeddings generative-ai knowledge-base langchain rag

Last synced: 03 Sep 2025

https://github.com/jiegzhan/multi-class-text-classification-cnn-rnn

Classify Kaggle San Francisco Crime Description into 39 classes. Build the model with CNN, RNN (GRU and LSTM) and Word Embeddings on Tensorflow.

cnn embeddings kaggle lstm rnn tensorflow text-classification

Last synced: 16 May 2026