Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Projects in Awesome Lists tagged with embeddings

A curated list of projects in awesome lists tagged with embeddings.

https://github.com/supabase/supabase

The open source Firebase alternative. Supabase gives you a dedicated Postgres database to build your web, mobile, and AI applications.

ai alternative auth database deno embeddings example firebase nextjs oauth2 pgvector postgis postgres postgresql postgrest realtime supabase vectors websockets

Last synced: 16 Dec 2024

https://github.com/chroma-core/chroma

the AI-native open-source embedding database

document-retrieval embeddings llms

Last synced: 26 Oct 2024

https://github.com/embedding/chinese-word-vectors

100+ Chinese Word Vectors: over a hundred kinds of pre-trained Chinese word vectors

chinese chinese-word-segmentation embedding embeddings vectors-trained word-embeddings

Last synced: 17 Dec 2024

https://github.com/Embedding/Chinese-Word-Vectors

100+ Chinese Word Vectors: over a hundred kinds of pre-trained Chinese word vectors

chinese chinese-word-segmentation embedding embeddings vectors-trained word-embeddings

Last synced: 29 Oct 2024

https://github.com/h2oai/h2ogpt

Private chat with local GPT with document, images, video, etc. 100% private, Apache 2.0. Supports oLLaMa, Mixtral, llama.cpp, and more. Demo: https://gpt.h2o.ai/ https://gpt-docs.h2o.ai/

ai chatgpt embeddings generative gpt gpt4all llama2 llm mixtral pdf private privategpt vectorstore

Last synced: 16 Dec 2024

https://github.com/kevinmusgrave/pytorch-metric-learning

The easiest way to use deep metric learning in your application. Modular, flexible, and extensible. Written in PyTorch.

computer-vision contrastive-learning deep-learning deep-metric-learning embeddings image-retrieval machine-learning metric-learning pytorch self-supervised-learning

Last synced: 16 Dec 2024

https://github.com/KevinMusgrave/pytorch-metric-learning

The easiest way to use deep metric learning in your application. Modular, flexible, and extensible. Written in PyTorch.

computer-vision contrastive-learning deep-learning deep-metric-learning embeddings image-retrieval machine-learning metric-learning pytorch self-supervised-learning

Last synced: 06 Nov 2024

https://github.com/shibing624/text2vec

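text2vec, text to vector. A text vector representation tool that converts text into vector matrices; it implements text representation and text similarity models such as Word2Vec, RankBM25, Sentence-BERT, and CoSENT, and works out of the box.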

embeddings nlp sentence-embeddings similarity text-similarity text2vec word2vec

Last synced: 17 Dec 2024

https://github.com/lancedb/lance

Modern columnar data format for ML and LLMs implemented in Rust. Convert from parquet in 2 lines of code for 100x faster random access, vector index, and data versioning. Compatible with Pandas, DuckDB, Polars, Pyarrow, and PyTorch with more integrations coming.

apache-arrow computer-vision data-analysis data-analytics data-centric data-format data-science dataops deep-learning duckdb embeddings llms machine-learning mlops python rust

Last synced: 16 Dec 2024

https://github.com/tensorflow/hub

A library for transfer learning by reusing parts of TensorFlow models.

embeddings image-classification machine-learning ml python tensorflow transfer-learning

Last synced: 16 Dec 2024

https://github.com/brianpetro/obsidian-smart-connections

Chat with your notes & see links to related content with AI embeddings. Use local models or 100+ via APIs like Claude, Gemini, ChatGPT & Llama 3

chatgpt claude embeddings gemini llama3 obsidian obsidian-plugin

Last synced: 18 Dec 2024

https://github.com/marker-inc-korea/autorag

AutoRAG: An Open-Source Framework for Retrieval-Augmented Generation (RAG) Evaluation & Optimization with AutoML-Style Automation

analysis automl benchmarking document-parser embeddings evaluation llm llm-evaluation llm-ops open-source ops optimization pipeline python qa rag rag-evaluation retrieval-augmented-generation

Last synced: 17 Dec 2024

https://github.com/eugeneyan/ml-surveys

📋 Survey papers summarizing advances in deep learning, NLP, CV, graphs, reinforcement learning, recommendations, etc.

computer-vision deep-learning embeddings machine-learning nlp recommender-system reinforcement-learning survey

Last synced: 30 Nov 2024

https://github.com/samuraigpt/embedai

An app to interact privately with your documents using the power of GPT, 100% privately, no data leaks

chatbot chatgpt embedai embeddings generative gpt gpt4 gpt4all langchain models openai privategpt vectorstore whisper

Last synced: 20 Dec 2024

https://github.com/SamurAIGPT/EmbedAI

An app to interact privately with your documents using the power of GPT, 100% privately, no data leaks

chatbot chatgpt embedai embeddings generative gpt gpt4 gpt4all langchain models openai privategpt vectorstore whisper

Last synced: 25 Oct 2024

https://github.com/huggingface/text-embeddings-inference

A blazing fast inference solution for text embeddings models

ai embeddings huggingface llm ml

Last synced: 17 Dec 2024

https://github.com/hegelai/prompttools

Open-source tools for prompt testing and experimentation, with support for both LLMs (e.g. OpenAI, LLaMA) and vector databases (e.g. Chroma, Weaviate, LanceDB).

deep-learning developer-tools embeddings large-language-models llms machine-learning prompt-engineering python vector-search

Last synced: 17 Dec 2024

https://github.com/kav-k/gptdiscord

A robust, all-in-one GPT interface for Discord. ChatGPT-style conversations, image generation, AI-moderation, custom indexes/knowledgebase, youtube summarizer, and more!

artificial-intelligence asyncio chatbot code-interpreter collaborate dalle2 digitalocean discord embeddings extractive-question-answering github gpt3 hacktoberfest help-wanted moderator-bot multi-modal openai openai-api pinecone python

Last synced: 20 Dec 2024

https://github.com/lilianweng/stock-rnn

Predict stock market prices using RNN model with multilayer LSTM cells + optional multi-stock embeddings.

embeddings lstm rnn-tensorflow stock-price-prediction

Last synced: 21 Dec 2024

https://github.com/Kav-K/GPTDiscord

A robust, all-in-one GPT interface for Discord. ChatGPT-style conversations, image generation, AI-moderation, custom indexes/knowledgebase, youtube summarizer, and more!

artificial-intelligence asyncio chatbot code-interpreter collaborate dalle2 digitalocean discord embeddings extractive-question-answering github gpt3 hacktoberfest help-wanted moderator-bot multi-modal openai openai-api pinecone python

Last synced: 28 Oct 2024

https://github.com/featureform/featureform

The Virtual Feature Store. Turn your existing data infrastructure into a feature store.

data-quality data-science embeddings embeddings-similarity feature-engineering feature-store hacktoberfest machine-learning ml mlops python vector-database

Last synced: 18 Dec 2024

https://github.com/yongzhuo/keras-textclassification

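Chinese long-text classification, short-sentence classification, multi-label classification, and two-sentence similarity (Chinese Text Classification of Keras NLP, multi-label classify, or sentence classify, long or short); base classes for building character, word, and sentence embedding layers (embeddings) and network layers (graph); FastText, TextCNN, CharCNN, TextRNN, RCNN, DCNN, DPCNN, VDCNN, CRNN, Bert, Xlnet, Albert, Attention, DeepMoji, HAN, CapsuleNet (capsule network), Transformer-encode, Seq2seq, SWEM, LEAM, TextGCN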

albert bert capsule charcnn crnn dcnn dpcnn embeddings fasttext han keras keras-textclassification leam nlp rcnn text-classification textcnn transformer vdcnn xlnet

Last synced: 18 Dec 2024

https://github.com/google/generative-ai-docs

Documentation for Google's Gen AI site - including the Gemini API and Gemma

ai chatbot documentation embeddings gemini gemini-api gemma llm machine-learning

Last synced: 19 Dec 2024

https://github.com/mintplex-labs/vector-admin

The universal tool suite for vector database management. Manage Pinecone, Chroma, Qdrant, Weaviate and more vector databases with ease.

ai ai-agents aitools chroma database-management document-retrieval embeddings flowise langchain langchain-js llms pinecone qdrant vector-data-management vector-database vector-database-embedding vector-search vectordatabase vectorspace weaviate

Last synced: 19 Dec 2024

https://github.com/Mintplex-Labs/vector-admin

The universal tool suite for vector database management. Manage Pinecone, Chroma, Qdrant, Weaviate and more vector databases with ease.

ai ai-agents aitools chroma database-management document-retrieval embeddings flowise langchain langchain-js llms pinecone qdrant vector-data-management vector-database vector-database-embedding vector-search vectordatabase vectorspace weaviate

Last synced: 28 Oct 2024

https://github.com/nomic-ai/nomic

Interact, analyze and structure massive text, image, embedding, audio and video datasets

clustering duplicate-detection embeddings python text topic-modeling unstructured-data

Last synced: 07 Nov 2024

https://github.com/postgresml/korvus

Korvus is a search SDK that unifies the entire RAG pipeline in a single database query. Built on top of Postgres with bindings for Python, JavaScript, Rust and C.

ai embeddings javascript llm ml python rag search sql

Last synced: 19 Dec 2024

https://github.com/eliorc/node2vec

Implementation of the node2vec algorithm.

deep-learning embeddings machine-learning-algorithms

Last synced: 17 Dec 2024

https://github.com/natasha/natasha

Solves basic Russian NLP tasks, API for lower level Natasha projects

embeddings morphology ner nlp python russian sentence-segmentation syntax tokenizer visualization

Last synced: 17 Dec 2024

https://github.com/jiran214/gpt-vup

GPT-vup: Bilibili | Douyin | AI | virtual streamer

bilibili chatgpt douyin embeddings vtuber

Last synced: 15 Dec 2024

https://github.com/jiran214/GPT-vup

GPT-vup: Bilibili | Douyin | AI | virtual streamer

bilibili chatgpt douyin embeddings vtuber

Last synced: 05 Nov 2024

https://github.com/bheinzerling/bpemb

Pre-trained subword embeddings in 275 languages, based on Byte-Pair Encoding (BPE)

embeddings multilingual natural-language-processing nlp subword-embeddings

Last synced: 19 Dec 2024

https://github.com/MilaNLProc/contextualized-topic-models

A python package to run contextualized topic modeling. CTMs combine contextualized embeddings (e.g., BERT) with topic models to get coherent topics. Published at EACL and ACL 2021.

bert embeddings multilingual-models multilingual-topic-models neural-topic-models nlp nlp-library nlp-machine-learning text-as-data topic-coherence topic-modeling transformer

Last synced: 04 Nov 2024

https://github.com/dicklesworthstone/swiss_army_llama

A FastAPI service for semantic text search using precomputed embeddings and advanced similarity measures, with built-in support for various file types through textract.

embedding-similarity embedding-vectors embeddings llama2 llamacpp semantic-search

Last synced: 20 Dec 2024

https://github.com/iterative/datachain

AI-data warehouse to enrich, transform and analyze data from cloud storages

ai cv data-analytics data-wrangling embeddings llm llm-eval mlops multimodal

Last synced: 09 Nov 2024

https://github.com/Dicklesworthstone/swiss_army_llama

A FastAPI service for semantic text search using precomputed embeddings and advanced similarity measures, with built-in support for various file types through textract.

embedding-similarity embedding-vectors embeddings llama2 llamacpp semantic-search

Last synced: 06 Nov 2024

https://github.com/wikipedia2vec/wikipedia2vec

A tool for learning vector representations of words and entities from Wikipedia

embeddings natural-language-processing nlp python text-classification wikipedia

Last synced: 06 Nov 2024

https://github.com/theodo-group/llphant

LLPhant - A comprehensive PHP Generative AI Framework using OpenAI GPT 4. Inspired by Langchain

agent autophp embeddings genai generative-ai gpt4 langchain laravel llamaindex openai php symfony vector-database

Last synced: 18 Dec 2024

https://github.com/epsilla-cloud/vectordb

Epsilla is a high performance Vector Database Management System. Try out hosted Epsilla at https://cloud.epsilla.com/

ai chatgpt data data-science database embeddings embeddings-similarity infrastructure llms machine-learning neural-network neural-search rag retrieval search-engine vector-database vector-search

Last synced: 19 Dec 2024

https://github.com/Atome-FE/llama-node

Believe in AI democratization. llama for Node.js, backed by llama-rs, llama.cpp, and rwkv.cpp; works locally on your laptop CPU. Supports llama/alpaca/gpt4all/vicuna/rwkv models.

ai embeddings gpt langchain large-language-models llama llama-node llama-rs llamacpp llm napi napi-rs nodejs rwkv

Last synced: 08 Nov 2024

https://github.com/atome-fe/llama-node

Believe in AI democratization. llama for Node.js, backed by llama-rs, llama.cpp, and rwkv.cpp; works locally on your laptop CPU. Supports llama/alpaca/gpt4all/vicuna/rwkv models.

ai embeddings gpt langchain large-language-models llama llama-node llama-rs llamacpp llm napi napi-rs nodejs rwkv

Last synced: 01 Nov 2024

https://github.com/theodo-group/LLPhant

LLPhant - A comprehensive PHP Generative AI Framework using OpenAI GPT 4. Inspired by Langchain

agent autophp embeddings genai generative-ai gpt4 langchain laravel llamaindex openai php symfony vector-database

Last synced: 01 Nov 2024

https://github.com/neumtry/neumai

Neum AI is a best-in-class framework to manage the creation and synchronization of vector embeddings at large scale.

ai chatgpt data data-engineering database embeddings etl llm llmops mlops ops pipeline python rag retrieval vector-database vectors

Last synced: 09 Nov 2024

https://github.com/NeumTry/NeumAI

Neum AI is a best-in-class framework to manage the creation and synchronization of vector embeddings at large scale.

ai chatgpt data data-engineering database embeddings etl llm llmops mlops ops pipeline python rag retrieval vector-database vectors

Last synced: 07 Nov 2024

https://github.com/veekaybee/what_are_embeddings

A deep dive into embeddings starting from fundamentals

embeddings machine-learning machine-learning-algorithms nlp-machine-learning

Last synced: 30 Oct 2024

https://github.com/qdrant/fastembed

Fast, Accurate, Lightweight Python library to make State of the Art Embedding

embeddings openai rag retrieval retrieval-augmented-generation vector-search

Last synced: 30 Oct 2024

https://github.com/hayabhay/frogbase

Transform audio-visual content into navigable knowledge.

embeddings package python search semantic-search speech-to-text streamlit ui

Last synced: 26 Sep 2024

https://github.com/azure/azure-search-vector-samples

A repository of code samples for Vector search capabilities in Azure AI Search.

azure azurecognitivesearch embeddings vector vector-search

Last synced: 20 Dec 2024

https://github.com/Azure/azure-search-vector-samples

A repository of code samples for Vector search capabilities in Azure AI Search.

azure azurecognitivesearch embeddings vector vector-search

Last synced: 29 Oct 2024

https://github.com/maartengr/polyfuzz

Fuzzy string matching, grouping, and evaluation.

bert edit-distance embeddings levenshtein-distance string-matching tf-idf

Last synced: 20 Dec 2024

https://github.com/curiosity-ai/catalyst

🚀 Catalyst is a C# Natural Language Processing library built for speed. Inspired by spaCy's design, it brings pre-trained models, out-of-the-box support for training word and document embeddings, and flexible entity recognition models.

ai artificial-intelligence csharp embeddings machine-learning natural-language-processing natural-language-understanding nlp

Last synced: 28 Oct 2024

https://github.com/your-papa/obsidian-smart2brain

An Obsidian plugin to interact with your privacy-focused AI assistant, making your second brain even smarter!

ai chatgpt embeddings obsidian-md obsidian-plugin ollama rag

Last synced: 20 Dec 2024

https://github.com/henomis/lingoose

🪿 LinGoose is a Go framework for building awesome AI/LLM applications.

ai chatgpt embeddings go golang index llm openai pinecone pipeline prompt vector

Last synced: 20 Dec 2024

https://github.com/dgarnitz/vectorflow

VectorFlow is a high volume vector embedding pipeline that ingests raw data, transforms it into vectors and writes it to a vector DB of your choice.

ai data-engineering embeddings machine-learning nlp vectors

Last synced: 03 Sep 2024

https://github.com/lancedb/vectordb-recipes

High quality resources & applications for LLMs, multi-modal models and VectorDBs

agents ai deep-learning embeddings fine-tuning gpt gpt-4-vision langchain llama-index llms machine-learning multimodal openai rag vector-database

Last synced: 14 Dec 2024

https://github.com/your-papa/obsidian-Smart2Brain

An Obsidian plugin to interact with your privacy-focused AI assistant, making your second brain even smarter!

ai chatgpt embeddings obsidian-md obsidian-plugin ollama rag

Last synced: 25 Nov 2024

https://github.com/adobe/NLP-Cube

Natural Language Processing Pipeline - Sentence Splitting, Tokenization, Lemmatization, Part-of-speech Tagging and Dependency Parsing

dependency-parser dependency-parsing embeddings information-extraction language-pipeline lemmatization machine-translation nlp-cube parse part-of-speech-tagger sentence-splitting tokenization universal-dependencies

Last synced: 30 Oct 2024

https://github.com/relevanceai/vectorhub

Vector Hub - Library for easy discovery, and consumption of State-of-the-art models to turn data into vectors. (text2vec, image2vec, video2vec, graph2vec, bert, inception, etc)

artificial-intelligence audio-processing deep-learning deeplearning embeddings encodings image2vec machine-learning neural-network python pytorch tensorflow tfhub transformers vector vector-similarity video-processing word2vec

Last synced: 21 Dec 2024

https://github.com/RelevanceAI/vectorhub

Vector Hub - Library for easy discovery, and consumption of State-of-the-art models to turn data into vectors. (text2vec, image2vec, video2vec, graph2vec, bert, inception, etc)

artificial-intelligence audio-processing deep-learning deeplearning embeddings encodings image2vec machine-learning neural-network python pytorch tensorflow tfhub transformers vector vector-similarity video-processing word2vec

Last synced: 11 Nov 2024

https://github.com/Synerise/cleora

Cleora AI is a general-purpose open-source model for efficient, scalable learning of stable and inductive entity embeddings for heterogeneous relational data. Created by Synerise.com team.

ai cleora-embeddings datasets deepwalk embeddings entity graphs hypergraphs inductive-entity-embeddings machine-learning ml pytorch-biggraph synerise

Last synced: 14 Dec 2024

https://github.com/BaseModelAI/cleora

Cleora AI is a general-purpose model for efficient, scalable learning of stable and inductive entity embeddings for heterogeneous relational data.

ai cleora-embeddings datasets deepwalk embeddings entity graphs hypergraphs inductive-entity-embeddings machine-learning ml pytorch-biggraph synerise

Last synced: 13 Nov 2024

https://github.com/koaning/whatlies

Toolkit to help understand "what lies" in word embeddings. Also benchmarking!

embeddings nlp visualisations

Last synced: 29 Oct 2024

https://github.com/towhee-io/examples

Analyze unstructured data with Towhee: reverse image search, reverse video search, audio classification, question and answer systems, molecular search, etc.

audio-classification cross-modal embeddings image-classification machine-learning nlp video-tagging

Last synced: 21 Dec 2024

https://github.com/wpydcr/LLM-Kit

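🚀 WebUI integrated platform for the latest LLMs: an all-in-one WebUI package of end-to-end tooling for major language models. Supports mainstream LLM API interfaces and open-source models, plus end-to-end application tools including knowledge bases, databases, role play, mj text-to-image, LoRA and full-parameter fine-tuning, dataset creation, and live2d.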

chatbot embeddings fine-tuning generative-agents llm player

Last synced: 07 Nov 2024

https://github.com/ThoughtRiver/lmdb-embeddings

Fast word vectors with little memory usage in Python

embeddings fasttext gensim glove lmdb magnitude memory speed text vectors word word2vec

Last synced: 27 Nov 2024