An open API service indexing awesome lists of open source software.

Projects in Awesome Lists tagged with embeddings

A curated list of projects in awesome lists tagged with embeddings .

https://github.com/supabase/supabase

The open source Firebase alternative. Supabase gives you a dedicated Postgres database to build your web, mobile, and AI applications.

ai alternative auth database deno embeddings example firebase nextjs oauth2 pgvector postgis postgres postgresql postgrest realtime supabase vectors websockets

Last synced: 12 May 2025

https://github.com/mem0ai/mem0

Memory for AI Agents; SOTA in AI Agent Memory, beating OpenAI Memory in accuracy by 26% - https://mem0.ai/research

agent ai aiagent application chatbots chatgpt embeddings llm long-term-memory memory memory-management python rag state-management vector-database

Last synced: 12 May 2025

https://github.com/chroma-core/chroma

the AI-native open-source embedding database

document-retrieval embeddings llms

Last synced: 12 May 2025

https://github.com/embedding/chinese-word-vectors

100+ Chinese Word Vectors 上百种预训练中文词向量

chinese chinese-word-segmentation embedding embeddings vectors-trained word-embeddings

Last synced: 10 Apr 2025

https://github.com/Embedding/Chinese-Word-Vectors

100+ Chinese Word Vectors 上百种预训练中文词向量

chinese chinese-word-segmentation embedding embeddings vectors-trained word-embeddings

Last synced: 26 Mar 2025

https://github.com/h2oai/h2ogpt

Private chat with local GPT with document, images, video, etc. 100% private, Apache 2.0. Supports oLLaMa, Mixtral, llama.cpp, and more. Demo: https://gpt.h2o.ai/ https://gpt-docs.h2o.ai/

ai chatgpt embeddings generative gpt gpt4all llama2 llm mixtral pdf private privategpt vectorstore

Last synced: 13 May 2025

https://github.com/langchain4j/langchain4j

LangChain4j is an open-source Java library that simplifies the integration of LLMs into Java applications through a unified API, providing access to popular LLMs and vector databases. It makes implementing RAG, tool calling (including support for MCP), and agents easy. LangChain4j integrates seamlessly with various enterprise Java frameworks.

anthropic chatgpt chroma embeddings gemini gpt huggingface java langchain llama llm llms milvus ollama onnx openai openai-api pgvector pinecone vector-database

Last synced: 08 Oct 2025

https://github.com/kevinmusgrave/pytorch-metric-learning

The easiest way to use deep metric learning in your application. Modular, flexible, and extensible. Written in PyTorch.

computer-vision contrastive-learning deep-learning deep-metric-learning embeddings image-retrieval machine-learning metric-learning pytorch self-supervised-learning

Last synced: 12 May 2025

https://github.com/KevinMusgrave/pytorch-metric-learning

The easiest way to use deep metric learning in your application. Modular, flexible, and extensible. Written in PyTorch.

computer-vision contrastive-learning deep-learning deep-metric-learning embeddings image-retrieval machine-learning metric-learning pytorch self-supervised-learning

Last synced: 09 Apr 2025

https://github.com/shibing624/text2vec

text2vec, text to vector. 文本向量表征工具,把文本转化为向量矩阵,实现了Word2Vec、RankBM25、Sentence-BERT、CoSENT等文本表征、文本相似度计算模型,开箱即用。

embeddings nlp sentence-embeddings similarity text-similarity text2vec word2vec

Last synced: 12 May 2025

https://github.com/lancedb/lance

Modern columnar data format for ML and LLMs implemented in Rust. Convert from parquet in 2 lines of code for 100x faster random access, vector index, and data versioning. Compatible with Pandas, DuckDB, Polars, Pyarrow, and PyTorch with more integrations coming..

apache-arrow computer-vision data-analysis data-analytics data-centric data-format data-science dataops deep-learning duckdb embeddings llms machine-learning mlops python rust

Last synced: 05 May 2025

https://github.com/marker-inc-korea/autorag

AutoRAG: An Open-Source Framework for Retrieval-Augmented Generation (RAG) Evaluation & Optimization with AutoML-Style Automation

analysis automl benchmarking document-parser embeddings evaluation llm llm-evaluation llm-ops open-source ops optimization pipeline python qa rag rag-evaluation retrieval-augmented-generation

Last synced: 12 May 2025

https://github.com/brianpetro/obsidian-smart-connections

Chat with your notes & see links to related content with AI embeddings. Use local models or 100+ via APIs like Claude, Gemini, ChatGPT & Llama 3

chatgpt claude embeddings gemini llama3 obsidian obsidian-plugin

Last synced: 14 May 2025

https://github.com/tensorflow/hub

A library for transfer learning by reusing parts of TensorFlow models.

embeddings image-classification machine-learning ml python tensorflow transfer-learning

Last synced: 12 May 2025

https://github.com/huggingface/text-embeddings-inference

A blazing fast inference solution for text embeddings models

ai embeddings huggingface llm ml

Last synced: 13 May 2025

https://github.com/hegelai/prompttools

Open-source tools for prompt testing and experimentation, with support for both LLMs (e.g. OpenAI, LLaMA) and vector databases (e.g. Chroma, Weaviate, LanceDB).

deep-learning developer-tools embeddings large-language-models llms machine-learning prompt-engineering python vector-search

Last synced: 14 May 2025

https://github.com/eugeneyan/ml-surveys

📋 Survey papers summarizing advances in deep learning, NLP, CV, graphs, reinforcement learning, recommendations, graphs, etc.

computer-vision deep-learning embeddings machine-learning nlp recommender-system reinforcement-learning survey

Last synced: 23 Mar 2025

https://github.com/samuraigpt/embedai

An app to interact privately with your documents using the power of GPT, 100% privately, no data leaks

chatbot chatgpt embedai embeddings generative gpt gpt4 gpt4all langchain models openai privategpt vectorstore whisper

Last synced: 05 Oct 2025

https://github.com/SamurAIGPT/EmbedAI

An app to interact privately with your documents using the power of GPT, 100% privately, no data leaks

chatbot chatgpt embedai embeddings generative gpt gpt4 gpt4all langchain models openai privategpt vectorstore whisper

Last synced: 14 Mar 2025

https://github.com/iterative/datachain

ETL, Analytics, Versioning for Unstructured Data

ai cv data-analytics data-wrangling embeddings llm llm-eval machine-learning mlops multimodal

Last synced: 18 Jun 2025

https://github.com/crmne/ruby_llm

Stop juggling AI SDKs! RubyLLM offers one delightful Ruby interface for OpenAI, Anthropic, Gemini, Bedrock, OpenRouter, DeepSeek, Ollama & compatible APIs. Chat, Vision, Audio, PDF, Images, Embeddings, Tools, Streaming & Rails integration.

ai anthropic chatgpt claude dall-e deepseek embeddings gemini image-generation llm openai rails ruby

Last synced: 13 May 2025

https://github.com/google/generative-ai-docs

Documentation for Google's Gen AI site - including the Gemini API and Gemma

ai chatbot documentation embeddings gemini gemini-api gemma llm machine-learning

Last synced: 14 May 2025

https://github.com/featureform/featureform

The Virtual Feature Store. Turn your existing data infrastructure into a feature store.

data-quality data-science embeddings embeddings-similarity feature-engineering feature-store hacktoberfest machine-learning ml mlops python vector-database

Last synced: 14 Dec 2025

https://github.com/lilianweng/stock-rnn

Predict stock market prices using RNN model with multilayer LSTM cells + optional multi-stock embeddings.

embeddings lstm rnn-tensorflow stock-price-prediction

Last synced: 15 May 2025

https://github.com/Kav-K/GPTDiscord

A robust, all-in-one GPT interface for Discord. ChatGPT-style conversations, image generation, AI-moderation, custom indexes/knowledgebase, youtube summarizer, and more!

artificial-intelligence asyncio chatbot code-interpreter collaborate dalle2 digitalocean discord embeddings extractive-question-answering github gpt3 hacktoberfest help-wanted moderator-bot multi-modal openai openai-api pinecone python

Last synced: 24 Mar 2025

https://github.com/kav-k/gptdiscord

A robust, all-in-one GPT interface for Discord. ChatGPT-style conversations, image generation, AI-moderation, custom indexes/knowledgebase, youtube summarizer, and more!

artificial-intelligence asyncio chatbot code-interpreter collaborate dalle2 digitalocean discord embeddings extractive-question-answering github gpt3 hacktoberfest help-wanted moderator-bot multi-modal openai openai-api pinecone python

Last synced: 15 May 2025

https://github.com/yongzhuo/keras-textclassification

中文长文本分类、短句子分类、多标签分类、两句子相似度(Chinese Text Classification of Keras NLP, multi-label classify, or sentence classify, long or short),字词句向量嵌入层(embeddings)和网络层(graph)构建基类,FastText,TextCNN,CharCNN,TextRNN, RCNN, DCNN, DPCNN, VDCNN, CRNN, Bert, Xlnet, Albert, Attention, DeepMoji, HAN, 胶囊网络-CapsuleNet, Transformer-encode, Seq2seq, SWEM, LEAM, TextGCN

albert bert capsule charcnn crnn dcnn dpcnn embeddings fasttext han keras keras-textclassification leam nlp rcnn text-classification textcnn transformer vdcnn xlnet

Last synced: 15 May 2025

https://github.com/mintplex-labs/vector-admin

The universal tool suite for vector database management. Manage Pinecone, Chroma, Qdrant, Weaviate and more vector databases with ease.

ai ai-agents aitools chroma database-management document-retrieval embeddings flowise langchain langchain-js llms pinecone qdrant vector-data-management vector-database vector-database-embedding vector-search vectordatabase vectorspace weaviate

Last synced: 31 Oct 2025

https://github.com/Mintplex-Labs/vector-admin

The universal tool suite for vector database management. Manage Pinecone, Chroma, Qdrant, Weaviate and more vector databases with ease.

ai ai-agents aitools chroma database-management document-retrieval embeddings flowise langchain langchain-js llms pinecone qdrant vector-data-management vector-database vector-database-embedding vector-search vectordatabase vectorspace weaviate

Last synced: 24 Mar 2025

https://github.com/nomic-ai/nomic

Interact, analyze and structure massive text, image, embedding, audio and video datasets

clustering duplicate-detection embeddings python text topic-modeling unstructured-data

Last synced: 13 May 2025

https://github.com/qdrant/fastembed

Fast, Accurate, Lightweight Python library to make State of the Art Embedding

embeddings openai rag retrieval retrieval-augmented-generation vector-search

Last synced: 26 Mar 2025

https://github.com/postgresml/korvus

Korvus is a search SDK that unifies the entire RAG pipeline in a single database query. Built on top of Postgres with bindings for Python, JavaScript, Rust and C.

ai embeddings javascript llm ml python rag search sql

Last synced: 14 May 2025

https://github.com/eliorc/node2vec

Implementation of the node2vec algorithm.

deep-learning embeddings machine-learning-algorithms

Last synced: 14 May 2025

https://github.com/natasha/natasha

Solves basic Russian NLP tasks, API for lower level Natasha projects

embeddings morphology ner nlp python russian sentence-segmentation syntax tokenizer visualization

Last synced: 13 May 2025

https://github.com/jiran214/GPT-vup

GPT-vup BIliBili | 抖音 | AI | 虚拟主播

bilibili chatgpt douyin embeddings vtuber

Last synced: 04 Apr 2025

https://github.com/jiran214/gpt-vup

GPT-vup BIliBili | 抖音 | AI | 虚拟主播

bilibili chatgpt douyin embeddings vtuber

Last synced: 08 Apr 2025

https://github.com/bheinzerling/bpemb

Pre-trained subword embeddings in 275 languages, based on Byte-Pair Encoding (BPE)

embeddings multilingual natural-language-processing nlp subword-embeddings

Last synced: 13 Apr 2025

https://github.com/MilaNLProc/contextualized-topic-models

A python package to run contextualized topic modeling. CTMs combine contextualized embeddings (e.g., BERT) with topic models to get coherent topics. Published at EACL and ACL 2021.

bert embeddings multilingual-models multilingual-topic-models neural-topic-models nlp nlp-library nlp-machine-learning text-as-data topic-coherence topic-modeling transformer

Last synced: 03 Apr 2025

https://github.com/llphant/llphant

LLPhant - A comprehensive PHP Generative AI Framework using OpenAI GPT 4. Inspired by Langchain

agent autophp embeddings genai generative-ai gpt4 langchain laravel llamaindex openai php symfony vector-database

Last synced: 08 Oct 2025

https://github.com/skalskip/vlms-zero-to-hero

This series will take you on a journey from the fundamentals of NLP and Computer Vision to the cutting edge of Vision-Language Models.

bert-model clip computer-vision embeddings gpt gpt-2 lora natural-language-processing seq2seq vision-language-model word2vec

Last synced: 06 Oct 2025

https://github.com/aws-samples/amazon-bedrock-samples

This repository contains examples for customers to get started using the Amazon Bedrock Service. This contains examples for all available foundational models

amazon-bedrock amazon-titan bedrock embeddings generative-ai knowledge-base langchain rag

Last synced: 14 May 2025

https://github.com/veekaybee/what_are_embeddings

A deep dive into embeddings starting from fundamentals

embeddings machine-learning machine-learning-algorithms nlp-machine-learning

Last synced: 14 May 2025

https://github.com/Dicklesworthstone/swiss_army_llama

A FastAPI service for semantic text search using precomputed embeddings and advanced similarity measures, with built-in support for various file types through textract.

embedding-similarity embedding-vectors embeddings llama2 llamacpp semantic-search

Last synced: 09 Apr 2025

https://github.com/dicklesworthstone/swiss_army_llama

A FastAPI service for semantic text search using precomputed embeddings and advanced similarity measures, with built-in support for various file types through textract.

embedding-similarity embedding-vectors embeddings llama2 llamacpp semantic-search

Last synced: 15 May 2025

https://github.com/LLPhant/LLPhant

LLPhant - A comprehensive PHP Generative AI Framework using OpenAI GPT 4. Inspired by Langchain

agent autophp embeddings genai generative-ai gpt4 langchain laravel llamaindex openai php symfony vector-database

Last synced: 20 Sep 2025

https://github.com/wikipedia2vec/wikipedia2vec

A tool for learning vector representations of words and entities from Wikipedia

embeddings natural-language-processing nlp python text-classification wikipedia

Last synced: 07 Apr 2025

https://github.com/cosmosgl/graph

GPU-accelerated force graph layout and rendering

embeddings force graph network simulation visualization webgl

Last synced: 11 May 2025

https://github.com/epsilla-cloud/vectordb

Epsilla is a high performance Vector Database Management System. Try out hosted Epsilla at https://cloud.epsilla.com/

ai chatgpt data data-science database embeddings embeddings-similarity infrastructure llms machine-learning neural-network neural-search rag retrieval search-engine vector-database vector-search

Last synced: 15 May 2025

https://github.com/your-papa/obsidian-Smart2Brain

An Obsidian plugin to interact with your privacy focused AI-Assistant making your second brain even smarter!

ai chatgpt embeddings obsidian-md obsidian-plugin ollama rag

Last synced: 18 Jul 2025

https://github.com/Atome-FE/llama-node

Believe in AI democratization. llama for nodejs backed by llama-rs, llama.cpp and rwkv.cpp, work locally on your laptop CPU. support llama/alpaca/gpt4all/vicuna/rwkv model.

ai embeddings gpt langchain large-language-models llama llama-node llama-rs llamacpp llm napi napi-rs nodejs rwkv

Last synced: 14 Apr 2025

https://github.com/atome-fe/llama-node

Believe in AI democratization. llama for nodejs backed by llama-rs, llama.cpp and rwkv.cpp, work locally on your laptop CPU. support llama/alpaca/gpt4all/vicuna/rwkv model.

ai embeddings gpt langchain large-language-models llama llama-node llama-rs llamacpp llm napi napi-rs nodejs rwkv

Last synced: 30 Mar 2025

https://github.com/neumtry/neumai

Neum AI is a best-in-class framework to manage the creation and synchronization of vector embeddings at large scale.

ai chatgpt data data-engineering database embeddings etl llm llmops mlops ops pipeline python rag retrieval vector-database vectors

Last synced: 29 Oct 2025

https://github.com/your-papa/obsidian-smart2brain

An Obsidian plugin to interact with your privacy focused AI-Assistant making your second brain even smarter!

ai chatgpt embeddings obsidian-md obsidian-plugin ollama rag

Last synced: 15 May 2025

https://github.com/azure/azure-search-vector-samples

A repository of code samples for Vector search capabilities in Azure AI Search.

azure azurecognitivesearch embeddings vector vector-search

Last synced: 14 May 2025

https://github.com/NeumTry/NeumAI

Neum AI is a best-in-class framework to manage the creation and synchronization of vector embeddings at large scale.

ai chatgpt data data-engineering database embeddings etl llm llmops mlops ops pipeline python rag retrieval vector-database vectors

Last synced: 11 Apr 2025

https://github.com/Azure/azure-search-vector-samples

A repository of code samples for Vector search capabilities in Azure AI Search.

azure azurecognitivesearch embeddings vector vector-search

Last synced: 25 Mar 2025

https://github.com/superlinked/superlinked

A compute framework for building Search, RAG, Recommendations and Analytics over complex structured & unstructured data.

data-pipeline deep-learning embeddings etl information-retrieval llm ml mlops natural-language-processing nlp python retrieval retrieval-augmented-generation semantic-search vector-database vector-search vectorization

Last synced: 13 Mar 2025

https://github.com/hayabhay/frogbase

Transform audio-visual content into navigable knowledge.

embeddings package python search semantic-search speech-to-text streamlit ui

Last synced: 30 Sep 2025

https://github.com/lancedb/vectordb-recipes

High quality resources & applications for LLMs, multi-modal models and VectorDBs

agents ai deep-learning embeddings fine-tuning gpt gpt-4-vision langchain llama-index llms machine-learning multimodal openai rag vector-database

Last synced: 17 Oct 2025

https://github.com/maartengr/polyfuzz

Fuzzy string matching, grouping, and evaluation.

bert edit-distance embeddings levenshtein-distance string-matching tf-idf

Last synced: 04 Oct 2025

https://github.com/henomis/lingoose

🪿 LinGoose is a Go framework for building awesome AI/LLM applications.

ai chatgpt embeddings go golang index llm openai pinecone pipeline prompt vector

Last synced: 15 May 2025

https://github.com/curiosity-ai/catalyst

🚀 Catalyst is a C# Natural Language Processing library built for speed. Inspired by spaCy's design, it brings pre-trained models, out-of-the box support for training word and document embeddings, and flexible entity recognition models.

ai artificial-intelligence csharp embeddings machine-learning natural-language-processing natural-language-understanding nlp

Last synced: 20 Mar 2025

https://github.com/dgarnitz/vectorflow

VectorFlow is a high volume vector embedding pipeline that ingests raw data, transforms it into vectors and writes it to a vector DB of your choice.

ai data-engineering embeddings machine-learning nlp vectors

Last synced: 14 Dec 2025

https://aws-samples.github.io/amazon-bedrock-samples/

This repository contains examples for customers to get started using the Amazon Bedrock Service. This contains examples for all available foundational models

amazon-bedrock amazon-titan bedrock embeddings generative-ai knowledge-base langchain rag

Last synced: 03 Sep 2025

https://github.com/RelevanceAI/vectorhub

Vector Hub - Library for easy discovery, and consumption of State-of-the-art models to turn data into vectors. (text2vec, image2vec, video2vec, graph2vec, bert, inception, etc)

artificial-intelligence audio-processing deep-learning deeplearning embeddings encodings image2vec machine-learning neural-network python pytorch tensorflow tfhub transformers vector vector-similarity video-processing word2vec

Last synced: 27 Apr 2025

https://github.com/relevanceai/vectorhub

Vector Hub - Library for easy discovery, and consumption of State-of-the-art models to turn data into vectors. (text2vec, image2vec, video2vec, graph2vec, bert, inception, etc)

artificial-intelligence audio-processing deep-learning deeplearning embeddings encodings image2vec machine-learning neural-network python pytorch tensorflow tfhub transformers vector vector-similarity video-processing word2vec

Last synced: 04 Apr 2025

https://github.com/adobe/NLP-Cube

Natural Language Processing Pipeline - Sentence Splitting, Tokenization, Lemmatization, Part-of-speech Tagging and Dependency Parsing

dependency-parser dependency-parsing embeddings information-extraction language-pipeline lemmatization machine-translation nlp-cube parse part-of-speech-tagger sentence-splitting tokenization universal-dependencies

Last synced: 27 Mar 2025