An open API service indexing awesome lists of open source software.

Projects in Awesome Lists tagged with embedding

A curated list of projects in awesome lists tagged with embedding.

https://github.com/chatchat-space/langchain-chatchat

Langchain-Chatchat (formerly Langchain-ChatGLM): a RAG and Agent application over a local knowledge base, built with Langchain and language models such as ChatGLM, Qwen, and Llama

chatbot chatchat chatglm chatgpt embedding faiss fastchat gpt knowledge-base langchain langchain-chatglm llama llm milvus ollama qwen rag retrieval-augmented-generation streamlit xinference

Last synced: 12 May 2025

https://github.com/chatchat-space/Langchain-Chatchat

Langchain-Chatchat (formerly Langchain-ChatGLM): a RAG and Agent application over a local knowledge base, built with Langchain and language models such as ChatGLM, Qwen, and Llama

chatbot chatchat chatglm chatgpt embedding faiss fastchat gpt knowledge-base langchain langchain-chatglm llama llm milvus ollama qwen rag retrieval-augmented-generation streamlit xinference

Last synced: 24 Mar 2025

https://github.com/PaddlePaddle/PaddleNLP

👑 Easy-to-use and powerful NLP and LLM library with 🤗 Awesome model zoo, supporting a wide range of NLP tasks from research to industrial applications, including 🗂Text Classification, 🔍 Neural Search, ❓ Question Answering, ℹ️ Information Extraction, 📄 Document Intelligence, 💌 Sentiment Analysis, etc.

bert compression distributed-training document-intelligence embedding ernie information-extraction llama llm neural-search nlp paddlenlp pretrained-models question-answering search-engine semantic-analysis sentiment-analysis transformers uie

Last synced: 18 Mar 2025

https://github.com/embedding/chinese-word-vectors

100+ Chinese Word Vectors (over a hundred sets of pre-trained Chinese word vectors)

chinese chinese-word-segmentation embedding embeddings vectors-trained word-embeddings

Last synced: 10 Apr 2025

https://github.com/Embedding/Chinese-Word-Vectors

100+ Chinese Word Vectors (over a hundred sets of pre-trained Chinese word vectors)

chinese chinese-word-segmentation embedding embeddings vectors-trained word-embeddings

Last synced: 26 Mar 2025

https://github.com/modelscope/ms-swift

Use PEFT or Full-parameter to CPT/SFT/DPO/GRPO 500+ LLMs (Qwen3, Qwen3-MoE, Llama4, InternLM3, GLM4, Mistral, Yi1.5, DeepSeek-R1, ...) and 200+ MLLMs (Qwen2.5-VL, Qwen2.5-Omni, Qwen2-Audio, Ovis2, InternVL3, Llava, MiniCPM-V-2.6, GLM4v, Xcomposer2.5, DeepSeek-VL2, Phi4, GOT-OCR2, ...).

deepseek-r1 deploy embedding grpo internvl liger llama llama4 llm lora megatron multimodal omni open-r1 peft qwen2-vl qwen3 qwen3-moe rft sft

Last synced: 14 Feb 2026

https://github.com/madawei2699/myGPTReader

A community-driven way to read and chat with AI bots, powered by ChatGPT.

ai chatgpt crawler daily-news embedding gpt-35-turbo hot-news openai prompt reader scraper slack-bot

Last synced: 16 Apr 2025

https://github.com/myreader-io/mygptreader

A community-driven way to read and chat with AI bots, powered by ChatGPT.

ai chatgpt crawler daily-news embedding gpt-35-turbo hot-news openai prompt reader scraper slack-bot

Last synced: 14 May 2025

https://github.com/myreader-io/myGPTReader

A community-driven way to read and chat with AI bots, powered by ChatGPT.

ai chatgpt crawler daily-news embedding gpt-35-turbo hot-news openai prompt reader scraper slack-bot

Last synced: 10 Apr 2025

https://github.com/infiniflow/infinity

The AI-native database built for LLM applications, providing incredibly fast hybrid search of dense vector, sparse vector, tensor (multi-vector), and full-text

ai-native approximate-nearest-neighbor-search bm25 cpp20 cpp20-modules embedding full-text-search hnsw hybrid-search information-retrival nearest-neighbor-search rag search-engine tensor-database vector vector-database vector-search vectordatabase

Last synced: 12 May 2025

https://github.com/run-llama/llamaindexts

Data framework for your LLM applications, focused on server-side solutions

agent chatbot claude-ai create-llama embedding groq-ai javascript llama llama-index llama3 llamaindex llm node nodejs openai react typescript

Last synced: 14 May 2025

https://github.com/groupultra/telegram-search

🔍 A powerful Telegram chat history search tool with vector search and semantic matching capabilities.

embedding mcp telegram telegram-bot

Last synced: 26 Jan 2026

https://github.com/withcatai/node-llama-cpp

Run AI models locally on your machine with node.js bindings for llama.cpp. Enforce a JSON schema on the model output on the generation level

ai bindings catai cmake cmake-js cuda embedding function-calling gguf gpu grammar json-schema llama llama-cpp llm metal nodejs prebuilt-binaries self-hosted vulkan

Last synced: 26 Jan 2026

https://github.com/pavlin-policar/opentsne

Extensible, parallel implementations of t-SNE

dimensionality-reduction embedding machine-learning tsne visualization

Last synced: 11 Dec 2025

https://github.com/pavlin-policar/openTSNE

Extensible, parallel implementations of t-SNE

dimensionality-reduction embedding machine-learning tsne visualization

Last synced: 12 Apr 2025

https://github.com/myscale/myscaledb

A @ClickHouse fork that supports high-performance vector search and full-text search.

ann big-data embedding image-search llm myscaledb rag search-engine similarity-search sql sql-vector unstructured-analytics vector-search vectordb

Last synced: 12 Jan 2026

https://github.com/myscale/MyScaleDB

A @ClickHouse fork that supports high-performance vector search and full-text search.

ann big-data embedding image-search llm myscaledb rag search-engine similarity-search sql sql-vector unstructured-analytics vector-search vectordb

Last synced: 12 Mar 2025

https://github.com/skywalkerdarren/chatweb

ChatWeb can crawl web pages, read PDF, DOCX, TXT, and extract the main content, then answer your questions based on the content, or summarize the key points.

ai chatgpt crawler docx embedding faiss gpt gpt-35-turbo news-extractor newspaper openai pdf pgvector postgresql vector-database

Last synced: 25 Oct 2025

https://github.com/SkywalkerDarren/chatWeb

ChatWeb can crawl web pages, read PDF, DOCX, TXT, and extract the main content, then answer your questions based on the content, or summarize the key points.

ai chatgpt crawler docx embedding faiss gpt gpt-35-turbo news-extractor newspaper openai pdf pgvector postgresql vector-database

Last synced: 30 Mar 2025

https://github.com/zhezhaoa/ngram2vec

Four word embedding models implemented in Python. Supporting arbitrary context features

analogy chinese embedding glove n-gram ngram ngram2vec ppmi svd word word-embedding word2vec

Last synced: 09 Apr 2025

https://github.com/GramSearch/telegram-search

🚀 A powerful Telegram chat history search tool with vector search and semantic matching.

embedding telegram telegram-bot

Last synced: 02 May 2025

https://github.com/OpenBMB/UltraRAG

Build & Optimize your RAG.

embedding finetune llm rag

Last synced: 31 Mar 2025

https://github.com/OysterQAQ/ACG2vec

ACG2vec (Anime Comics Games to vector) is committed to creating a playground that combines ACG and deep learning (semantic text retrieval, search by image, semantic image search, image super-resolution, recommender system).

acg anime deep-learning embedding feature-extraction image-search image-super-resolution keras tensorflow

Last synced: 08 May 2025

https://github.com/marl/openl3

OpenL3: Open-source deep audio and image embeddings

audio deep-learning embedding embedding-models image image-embeddings machine-listening

Last synced: 16 May 2025

https://apple.github.io/embedding-atlas/

Embedding Atlas is a tool that provides interactive visualizations for large embeddings. It allows you to visualize, cross-filter, and search embeddings and metadata.

embedding visualization

Last synced: 11 Aug 2025

https://github.com/guangzhengli/vectorhub

Quickly and easily build an AI website or application by using embeddings!

chatgpt chatpdf embedding embeddings gpt gpt-3 nextjs supabase vector vector-database

Last synced: 05 Apr 2025

https://github.com/askaitools/askaitools-community-edition

A cutting-edge search engine project tailored specifically for the AI product

ai embedding enterprise-search full-text-search hybrid-search search search-engine semantic-search tools

Last synced: 24 Mar 2025

https://github.com/PaddlePaddle/ERNIE-SDK

ERNIE Bot Agent is a Large Language Model (LLM) Agent Framework, powered by the advanced capabilities of ERNIE Bot and the platform resources of Baidu AI Studio.

agent chatcompletion embedding ernie-bot function-calling llm sdk

Last synced: 24 Mar 2025

https://github.com/paddlepaddle/ernie-sdk

ERNIE Bot Agent is a Large Language Model (LLM) Agent Framework, powered by the advanced capabilities of ERNIE Bot and the platform resources of Baidu AI Studio.

agent chatcompletion embedding ernie-bot function-calling llm sdk

Last synced: 12 Apr 2025

https://github.com/yongzhuo/macadam

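Macadam is a natural language processing toolkit built on TensorFlow (Keras) and bert4keras, focused on text classification, sequence labeling, and relation extraction. It supports embeddings such as RANDOM, WORD2VEC, FASTTEXT, BERT, ALBERT, ROBERTA, NEZHA, XLNET, ELECTRA, and GPT-2; text classification algorithms such as FineTune, FastText, TextCNN, CharCNN, BiRNN, RCNN, DCNN, CRNN, DeepMoji, SelfAttention, HAN, and Capsule; and sequence labeling algorithms such as CRF, Bi-LSTM-CRF, CNN-LSTM, DGCNN, Bi-LSTM-LAN, Lattice-LSTM-Batch, and MRC.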

bert embedding keras ner python3 relation-extraction sequence-labeling tensorflow text-classification

Last synced: 05 Apr 2025

https://github.com/marcominerva/chatgptnet

A ChatGPT integration library for .NET, supporting both OpenAI and Azure OpenAI Service

azure-openai azure-openai-api chatgpt csharp dotnet embedding embeddings embeddings-similarity hacktoberfest net openai openai-api

Last synced: 16 May 2025

https://github.com/marcominerva/ChatGptNet

A ChatGPT integration library for .NET, supporting both OpenAI and Azure OpenAI Service

azure-openai azure-openai-api chatgpt csharp dotnet embedding embeddings embeddings-similarity hacktoberfest net openai openai-api

Last synced: 20 Apr 2025

https://github.com/snap-stanford/kgreasoning

Multi-Hop Logical Reasoning in Knowledge Graphs

embedding knowledge-base knowledge-graph reasoning

Last synced: 28 Oct 2025

https://github.com/snap-stanford/KGReasoning

Multi-Hop Logical Reasoning in Knowledge Graphs

embedding knowledge-base knowledge-graph reasoning

Last synced: 12 Apr 2025

https://github.com/gramsearch/telegram-search

🚀 A powerful Telegram chat history search tool with vector search and semantic matching.

embedding telegram telegram-bot

Last synced: 09 Apr 2025

https://github.com/opensolon/solon-ai

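Java AI application development framework (supports LLM, RAG, MCP, and Agent). Compatible with Java 8 through Java 25. It can also be embedded in frameworks such as SpringBoot, jFinal, Vert.x, and Quarkus.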

ai chat deepseek embedding function-call java llm mcp-client mcp-server modelcontextprotocol openai rag reranking tool-call

Last synced: 19 Jan 2026

https://github.com/microsoft/rag-experiment-accelerator

The RAG Experiment Accelerator is a versatile tool designed to expedite and facilitate the process of conducting experiments and evaluations using Azure Cognitive Search and RAG pattern.

acs azure chunking dense embedding evaluation experiment genai indexing information-retrieval llm openai rag sparse vectors

Last synced: 16 May 2025

https://github.com/wzdavid/ThinkRAG

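An LLM RAG (retrieval-augmented generation) system that runs on your laptop. It deploys easily on a laptop and provides intelligent question answering over a local knowledge base.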

baai chatbot chromadb deepseek elasticsearch embedding hybrid-search knowledgebase lancedb langchain laptop llamaindex moonshot ollama openai rag reranking retrieval-augmented-generation streamlit zhipuai

Last synced: 05 Oct 2025

https://github.com/tiger-ai-lab/vlm2vec

This repo contains the code for "VLM2Vec: Training Vision-Language Models for Massive Multimodal Embedding Tasks" [ICLR25]

benchmark contrastive-learning embedding image-retrieval mmeb multimodal rag representation-learning video-retrieval visual-document-retrieval vlm

Last synced: 13 Jun 2025

https://github.com/dilolabs/nosia

Self-hosted AI RAG + MCP Platform

ai all-in-one docker embedding llm mcp rag ruby ruby-on-rails shell

Last synced: 17 Feb 2026

https://github.com/sajari/word2vec

Go library for performing computations in word2vec binary models

embedding go golang word word2vec word2vec-model

Last synced: 08 May 2025

https://github.com/apple/embedding-atlas

Embedding Atlas is a tool that provides interactive visualizations for large embeddings. It allows you to visualize, cross-filter, and search embeddings and metadata.

embedding visualization

Last synced: 28 Jun 2025

https://github.com/cair/pytsetlinmachine

Implements the Tsetlin Machine, Convolutional Tsetlin Machine, Regression Tsetlin Machine, Weighted Tsetlin Machine, and Embedding Tsetlin Machine, with support for continuous features, multigranularity, clause indexing, and literal budget

bandit-learning classification convolution embedding frequent-pattern-mining interpretable machine-learning propositional-logic regression rule-based tsetlin-machine

Last synced: 09 Apr 2025

https://github.com/cair/pyTsetlinMachine

Implements the Tsetlin Machine, Convolutional Tsetlin Machine, Regression Tsetlin Machine, Weighted Tsetlin Machine, and Embedding Tsetlin Machine, with support for continuous features, multigranularity, clause indexing, and literal budget

bandit-learning classification convolution embedding frequent-pattern-mining interpretable machine-learning propositional-logic regression rule-based tsetlin-machine

Last synced: 04 May 2025

https://github.com/datawhalechina/all-in-rag

🔍 Hands-on LLM application development: a full-stack guide to RAG technology. Read online at datawhalechina.github.io/all-in-rag/

ai embedding kimi-k2 langchain llama-index llm milvus multimodal neo4j python rag

Last synced: 11 Sep 2025

https://github.com/awslabs/amazon-denseclus

Clustering for mixed-type data

clustering embedding machinelearning-python python

Last synced: 06 May 2025

https://github.com/microsoft/msvbase

MSVBASE is a system that efficiently supports complex queries of both approximate similarity search and relational operators. It integrates high-dimensional vector indices into PostgreSQL, a relational database to facilitate complex approximate similarity queries.

database embedding hnsw postgresql retrieval-augmented-generation sptag vector vector-database

Last synced: 05 Aug 2025

https://github.com/firstbatchxyz/hollowdb-vector

A decentralized vector database for building vector search applications

ai ann arweave blockchain decentralized embedding embeddings nearest-neighbor-search vector-database

Last synced: 03 May 2025

https://github.com/benedekrozemberczki/BANE

A sparsity aware implementation of "Binarized Attributed Network Embedding" (ICDM 2018).

bane deepwalk diff2vec dimensionality-reduction embedding factorization fscnmf gemsec graph graph2vec icdm lane line musae node node2vec svd tadw tridnr word2vec

Last synced: 17 Apr 2025

https://github.com/benedekrozemberczki/bane

A sparsity aware implementation of "Binarized Attributed Network Embedding" (ICDM 2018).

bane deepwalk diff2vec dimensionality-reduction embedding factorization fscnmf gemsec graph graph2vec icdm lane line musae node node2vec svd tadw tridnr word2vec

Last synced: 01 Aug 2025

https://github.com/julien040/hn-recommendation-api

A recommendation system for Hacker News. Get the most similar posts for a given URL

embedding embeddings faiss hacker-news hnsw nextjs openai recommendation

Last synced: 14 Apr 2025

https://github.com/maja42/ember

Embed arbitrary resources into a go executable at runtime, after the executable has been built.

embed embedding go

Last synced: 16 Jan 2026

https://github.com/thesage21/lorentz-embeddings

Embed arbitrary graphs in Hyperbolic space

embedding hyperbolic-geometry lorentz pytorch

Last synced: 30 Oct 2025

https://github.com/thiswillbeyourgithub/anna_anki_neuronal_appendix

Using machine learning on your anki collection to enhance the scheduling via semantic clustering and semantic similarity

ai anki bert clustering doc2vec embedding flashcards kmeans latent machinelearning neighbourhood nlp pca sbert scheduler sementics sentence-embeddings umap

Last synced: 10 Apr 2025

https://github.com/snakers4/playing_with_vae

Comparing FC VAE / FCN VAE / PCA / UMAP on MNIST / FMNIST

beginner embedding fashion-mnist mnist pca python3 pytorch tutorial umap variational-autoencoder

Last synced: 13 Jul 2025

https://github.com/logicjake/tuling-video-click-top3

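Third place on the online leaderboard of the Tuling Federation (图灵联邦) video click prediction competition [ctr, embedding, leakage ("time-travel") features]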

ctr embedding

Last synced: 30 Apr 2025

https://github.com/LogicJake/tuling-video-click-top3

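Third place on the online leaderboard of the Tuling Federation (图灵联邦) video click prediction competition [ctr, embedding, leakage ("time-travel") features]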

ctr embedding

Last synced: 20 Jul 2025

https://github.com/nianticlabs/image-box-overlap

[ECCV 2020] Training neural networks to predict visual overlap of images, through interpretable non-metric box embeddings

deep-learning deeplearning embedding metric-learning non-metric nonmetric representation-learning surface-overlap visual-overlap

Last synced: 14 Apr 2025

https://github.com/MAGICS-LAB/DNABERT_S

DNABERT_S: Learning Species-Aware DNA Embedding with Genome Foundation Models

dna dna-embedding embedding

Last synced: 15 Aug 2025

https://github.com/milvus-io/milvus-model

A library integrating embedding and reranker models from OpenAI, SentenceTransformers etc for semantic search in vector database.

ai embedding nlp rag reranker

Last synced: 10 Oct 2025

https://github.com/lzjever/lexilux

Unified LLM API client library for Python. Simple API for Chat, Embedding, Rerank, and Tokenizer. OpenAI-compatible with streaming support and unified usage tracking.

api-client chat-api document-ranking embedding function-api llm openai-api openai-compatible python rerank reranker semantic-search streaming tokenizer

Last synced: 26 Jan 2026

https://github.com/dim13/file2go

Dead-simple file embedding tool for Go

embedding file golang

Last synced: 14 Jan 2026

https://github.com/relaterai/relater

Get AI newsletter recommendations tailored to developers and startups using ChatGPT prompt.

embedding gpt-4 langchain newsletter openai pgvector prisma prompt

Last synced: 31 Jul 2025

https://github.com/fredsiika/huxley-pdf

Upload personal docs and Chat with your PDF files with this GPT4-powered app. Built with LangChain, Pinecone Vector Database, deployed on Streamlit

chatgpt chatpdf embedding embedding-vectors langchain langchain-python openai pinecone vector-database

Last synced: 15 Apr 2025

https://github.com/bacpop/mandrake

Mandrake 🌿/👨‍🔬🦆 – Fast visualisation of the population structure of pathogens using Stochastic Cluster Embedding

cuda embedding genomics gpu pathogens

Last synced: 30 Oct 2025