An open API service indexing awesome lists of open source software.

Projects in Awesome Lists tagged with semantic-search

A curated list of projects in awesome lists tagged with semantic-search.

https://github.com/microsoft/generative-ai-for-beginners

21 Lessons, Get Started Building with Generative AI 🔗 https://microsoft.github.io/generative-ai-for-beginners/

ai azure chatgpt dall-e generative-ai generativeai gpt language-model llms microsoft-for-beginners openai prompt-engineering semantic-search transformers

Last synced: 12 May 2025

https://microsoft.github.io/generative-ai-for-beginners/

21 Lessons, Get Started Building with Generative AI 🔗 https://microsoft.github.io/generative-ai-for-beginners/

ai azure chatgpt dall-e generative-ai generativeai gpt language-model llms microsoft-for-beginners openai prompt-engineering semantic-search transformers

Last synced: 29 Mar 2025

https://github.com/microsoft/generative-ai-for-beginners?WT.mc_id=academic-122979-leestott

21 Lessons, Get Started Building with Generative AI 🔗 https://microsoft.github.io/generative-ai-for-beginners/

ai azure chatgpt dall-e generative-ai generativeai gpt language-model llms microsoft-for-beginners openai prompt-engineering semantic-search transformers

Last synced: 24 Mar 2025

https://github.com/khoj-ai/khoj

Your AI second brain. Self-hostable. Get answers from the web or your docs. Build custom agents, schedule automations, do deep research. Turn any online or local LLM into your personal, autonomous AI (gpt, claude, gemini, llama, qwen, mistral). Get started - free.

agent ai assistant chat chatgpt emacs image-generation llama3 llamacpp llm obsidian obsidian-md offline-llm productivity rag research self-hosted semantic-search stt whatsapp-ai

Last synced: 12 May 2025

https://github.com/typesense/typesense

Open Source alternative to Algolia + Pinecone and an Easier-to-Use alternative to ElasticSearch ⚡ 🔍 ✨ Fast, typo tolerant, in-memory fuzzy Search Engine for building delightful search experiences

algolia datastore elasticsearch enterprise-search faceting full-text-search fuzzy-search geosearch in-memory instantsearch merchandising pinecone search search-engine semantic-search similarity-search site-search synonyms typo-tolerance vector-search

Last synced: 13 May 2025

https://github.com/deepset-ai/haystack

AI orchestration framework to build customizable, production-ready LLM applications. Connect components (models, vector DBs, file converters) to pipelines or agents that can interact with your data. With advanced retrieval methods, it's best suited for building RAG, question answering, semantic search or conversational agent chatbots.

agents ai bert chatgpt generative-ai information-retrieval language-model large-language-models llm machine-learning nlp python pytorch question-answering rag retrieval-augmented-generation semantic-search squad summarization transformers

Last synced: 12 May 2025

https://github.com/arc53/docsgpt

DocsGPT is an open-source genAI tool that helps users get reliable answers from a knowledge source while avoiding hallucinations. It enables private and reliable information retrieval, with tooling and agentic system capability built in.

ai chatgpt docsgpt hacktoberfest information-retrieval language-model llm machine-learning natural-language-processing python pytorch rag react semantic-search transformers web-app

Last synced: 12 May 2025

https://github.com/arc53/DocsGPT

DocsGPT is an open-source genAI tool that helps users get reliable answers from a knowledge source while avoiding hallucinations. It enables private and reliable information retrieval, with tooling and agentic system capability built in.

ai chatgpt docsgpt hacktoberfest information-retrieval language-model llm machine-learning natural-language-processing python pytorch rag react semantic-search transformers web-app

Last synced: 14 Mar 2025

https://github.com/weaviate/weaviate

Weaviate is an open-source vector database that stores both objects and vectors, combining vector search and structured filtering with the fault tolerance and scalability of a cloud-native database.

approximate-nearest-neighbor-search generative-search grpc hnsw hybrid-search image-search information-retrieval mlops nearest-neighbor-search neural-search recommender-system search-engine semantic-search semantic-search-engine similarity-search vector-database vector-search vector-search-engine vectors weaviate

Last synced: 12 May 2025

https://github.com/semi-technologies/weaviate

Weaviate is an open-source vector database that stores both objects and vectors, combining vector search and structured filtering with the fault tolerance and scalability of a cloud-native database.

approximate-nearest-neighbor-search generative-search grpc hnsw hybrid-search image-search information-retrieval mlops nearest-neighbor-search neural-search recommender-system search-engine semantic-search semantic-search-engine similarity-search vector-database vector-search vector-search-engine vectors weaviate

Last synced: 10 Dec 2024

https://github.com/lancedb/lancedb

Developer-friendly, embedded retrieval engine for multimodal AI. Search More; Manage Less.

approximate-nearest-neighbor-search image-search nearest-neighbor-search recommender-system search-engine semantic-search similarity-search vector-database

Last synced: 12 May 2025

https://lancedb.github.io/lancedb/

Developer-friendly, embedded retrieval engine for multimodal AI. Search More; Manage Less.

approximate-nearest-neighbor-search image-search nearest-neighbor-search recommender-system search-engine semantic-search similarity-search vector-database

Last synced: 04 May 2025

https://github.com/gmpetrov/databerry

The no-code platform for building custom LLM Agents

ai aichatbot chatbot chatbots chatgpt llm no-code openai qdrant semantic-search typescript

Last synced: 14 May 2025

https://github.com/pinecone-io/examples

Jupyter Notebooks to help you get hands-on with Pinecone vector databases

ai jupyter-notebook llm python semantic-search vector-database

Last synced: 13 May 2025

https://github.com/mazzzystar/queryable

Run OpenAI's CLIP and Apple's MobileCLIP model on iOS to search photos.

clip-model ios macos mobile mobile-clip mobileclip natural-language-image-search openai-clip photos search semantic-search swiftui

Last synced: 14 May 2025

https://github.com/mazzzystar/Queryable

Run OpenAI's CLIP and Apple's MobileCLIP model on iOS to search photos.

clip-model ios macos mobile mobile-clip mobileclip natural-language-image-search openai-clip photos search semantic-search swiftui

Last synced: 26 Mar 2025

https://github.com/unum-cloud/usearch

Fast Open-Source Search & Clustering engine × for Vectors & 🔜 Strings × in C++, C, Python, JavaScript, Rust, Java, Objective-C, Swift, C#, GoLang, and Wolfram 🔍

approximate-nearest-neighbor-search clustering database faiss full-text-search fuzzy-search image-search kann nearest-neighbor-search recommender-system search search-engine semantic-search simd similarity-search text-search vector-search webassembly

Last synced: 29 Mar 2025

https://github.com/freedmand/semantra

Multi-tool for semantic search

cli machine-learning semantic-search

Last synced: 14 May 2025

https://github.com/rom1504/clip-retrieval

Easily compute clip embeddings and build a clip retrieval system with them

ai clip deep-learning knn multimodal semantic-search

Last synced: 14 May 2025

https://rom1504.github.io/clip-retrieval/?back=https%3A%2F%2Fknn5.laion.ai&index=laion5B&useMclip=false

Easily compute clip embeddings and build a clip retrieval system with them

ai clip deep-learning knn multimodal semantic-search

Last synced: 08 May 2025

https://github.com/microsoft/kernel-memory

RAG architecture: index and query any data using LLM and natural language, track sources, show citations, asynchronous memory patterns.

indexing llm memory rag semantic-search

Last synced: 13 May 2025

https://github.com/notjoemartinez/yt-fts

YouTube Full Text Search - Search all of a YouTube channel from the command line

chromadb cli click full-text-search llm rag semantic-search sqlite youtube yt-dlp

Last synced: 14 May 2025

https://github.com/NotJoeMartinez/yt-fts

YouTube Full Text Search - Search all of a YouTube channel from the command line

chromadb cli click full-text-search llm rag semantic-search sqlite youtube yt-dlp

Last synced: 24 Mar 2025

https://github.com/aws-samples/aws-genai-llm-chatbot

A modular and comprehensive solution to deploy a Multi-LLM and Multi-RAG powered chatbot (Amazon Bedrock, Anthropic, HuggingFace, OpenAI, Meta, AI21, Cohere, Mistral) using AWS CDK on AWS

amazon-bedrock aurora aws bedrock cdk chatbot claude genai huggingface idefics kendra langchain llm opensearch opensearch-serverless pgvector sagemaker semantic-search vectordb

Last synced: 14 May 2025

https://github.com/unum-cloud/uform

Pocket-Sized Multimodal AI for content understanding and generation across multilingual texts, images, and 🔜 video, up to 5x faster than OpenAI CLIP and LLaVA 🖼️ & 🖋️

bert clip clustering contrastive-learning cross-attention huggingface-transformers image-search language-vision llava multi-lingual multimodal neural-network openai openclip pretrained-models pytorch representation-learning semantic-search transformer vector-search

Last synced: 14 May 2025

https://github.com/model-zoo/shift-ctrl-f

🔎 Search the information available on a webpage using natural language instead of an exact string match.

bert chrome-extension natural-language question-answering semantic-search tensorflow tensorflowjs unpacked-extension

Last synced: 18 Jan 2025

https://github.com/cocoindex-io/cocoindex

ETL framework to make your data AI-ready, with real-time incremental updates and support for custom logic, like Lego.

ai change-data-capture data data-engineering data-indexing data-infrastructure data-processing dataflow etl help-wanted indexing knowledge-graph llm pipeline python rag real-time rust semantic-search streaming

Last synced: 14 May 2025

https://github.com/Dicklesworthstone/swiss_army_llama

A FastAPI service for semantic text search using precomputed embeddings and advanced similarity measures, with built-in support for various file types through textract.

embedding-similarity embedding-vectors embeddings llama2 llamacpp semantic-search

Last synced: 09 Apr 2025

https://github.com/dicklesworthstone/swiss_army_llama

A FastAPI service for semantic text search using precomputed embeddings and advanced similarity measures, with built-in support for various file types through textract.

embedding-similarity embedding-vectors embeddings llama2 llamacpp semantic-search

Last synced: 15 May 2025

https://github.com/superlinked/superlinked

A compute framework for building Search, RAG, Recommendations and Analytics over complex structured & unstructured data.

data-pipeline deep-learning embeddings etl information-retrieval llm ml mlops natural-language-processing nlp python retrieval retrieval-augmented-generation semantic-search vector-database vector-search vectorization

Last synced: 13 Mar 2025

https://github.com/prithivirajdamodaran/flashrank

Lite & Super-fast re-ranking for your search & retrieval pipelines. Supports SoTA Listwise and Pairwise reranking based on LLMs and cross-encoders and more. Created by Prithivi Da, open for PRs & Collaborations.

cross-encoder full-text-search hybrid-search lexical-search rag ranking reranking retrieval-augmented-generation semantic-search vector-database vector-search

Last synced: 14 May 2025

https://github.com/hayabhay/frogbase

Transform audio-visual content into navigable knowledge.

embeddings package python search semantic-search speech-to-text streamlit ui

Last synced: 20 Jan 2025

https://github.com/intellabs/rag-fit

Framework for enhancing LLMs for RAG tasks using fine-tuning.

evaluation fine-tuning information-retrieval llm nlp question-answering rag semantic-search

Last synced: 15 May 2025

https://github.com/koursaros-ai/nboost

NBoost is a scalable, search-api-boosting platform for deploying transformer models to improve the relevance of search results on different platforms (e.g. Elasticsearch)

cloud deep-learning docker elasticsearch helm kubernetes machine-learning microservices nboost nlp proxy python pytorch search-api search-engine semantic-search tensorflow

Last synced: 30 Mar 2025

https://github.com/PrithivirajDamodaran/FlashRank

Lite & Super-fast re-ranking for your search & retrieval pipelines. Supports SoTA Listwise and Pairwise reranking based on LLMs and cross-encoders and more. Created by Prithivi Da, open for PRs & Collaborations.

cross-encoder full-text-search hybrid-search lexical-search rag ranking reranking retrieval-augmented-generation semantic-search vector-database vector-search

Last synced: 08 May 2025

https://github.com/qdrant/mcp-server-qdrant

An official Qdrant Model Context Protocol (MCP) server implementation

claude cursor llm mcp mcp-server semantic-search windsurf

Last synced: 16 May 2025

https://github.com/jina-ai/examples

Jina examples and demos to help you get started

deep-learning examples jina neural-search nlp onboarding python semantic-search tutorials

Last synced: 30 Mar 2025

https://github.com/kelindar/search

Go library for embedded vector search and semantic embeddings using llama.cpp

ai bert embeddings gguf gpu llamacpp search-engine semantic-search simd vector-search

Last synced: 16 May 2025

https://github.com/alexklibisz/elastiknn

Elasticsearch plugin for nearest neighbor search. Store vectors and run similarity search using exact and approximate algorithms.

elasticsearch elasticsearch-plugin embeddings locality-sensitive-hashing lucene nearest-neighbor-search neural-search semantic-search similarity-search

Last synced: 14 May 2025

https://github.com/JohnGiorgi/DeCLUTR

The corresponding code from our paper "DeCLUTR: Deep Contrastive Learning for Unsupervised Textual Representations". Do not hesitate to open an issue if you run into any trouble!

allennlp contrastive-learning metric-learning natural-language-processing pytorch representation-learning self-supervised-learning semantic-search semantic-text-similarity sentence-embeddings sentence-similarity transformers

Last synced: 30 Mar 2025

https://github.com/aryn-ai/sycamore

🍁 Sycamore is an LLM-powered search and analytics platform for unstructured data.

ai dataprep etl information-retrieval llm ml nlp opensearch search semantic-search

Last synced: 10 Dec 2024

https://github.com/askaitools/askaitools-community-edition

A cutting-edge search engine project tailored specifically for the AI product

ai embedding enterprise-search full-text-search hybrid-search search search-engine semantic-search tools

Last synced: 24 Mar 2025

https://github.com/deepset-ai/haystack-tutorials

Here you can find all the Tutorials for Haystack 📓

generative-qa haystack llm nlp semantic-search text-generation tutorials

Last synced: 08 Apr 2025

https://github.com/alondmnt/joplin-plugin-jarvis

Joplin (note-taking) assistant running a very intelligent system (GPT, Claude, Gemini, Ollama, Hugging Face)

assistant joplin-plugin llm note-taking semantic-search

Last synced: 09 Apr 2025

https://github.com/ZachNagengast/similarity-search-kit

🔎 SimilaritySearchKit is a Swift package providing on-device text embeddings and semantic search functionality for iOS and macOS applications.

apple-neural-engine coreml information-retrieval nlp pretrained-models question-answering semantic-search semantic-similarity swift text-embeddings vector-embeddings

Last synced: 09 Apr 2025

https://github.com/zachnagengast/similarity-search-kit

🔎 SimilaritySearchKit is a Swift package providing on-device text embeddings and semantic search functionality for iOS and macOS applications.

apple-neural-engine coreml information-retrieval nlp pretrained-models question-answering semantic-search semantic-similarity swift text-embeddings vector-embeddings

Last synced: 07 Apr 2025

https://github.com/zilliztech/akcio

Akcio is a demonstration project for Retrieval Augmented Generation (RAG). It leverages the power of LLM to generate responses and uses vector databases to fetch relevant documents to enhance the quality and relevance of the output.

artificial-intelligence chatbot chatgpt dolly embeddings ernie-bot fastapi gradio langchain llm milvus minimax nlp openai retrieval-augmented-generation retrieval-chatbot semantic-search towhee

Last synced: 29 Nov 2024

https://github.com/pinecone-io/pinecone-ts-client

The official TypeScript/Node client for the Pinecone vector database

llm pinecone semantic-search similarity-search vector-database

Last synced: 14 May 2025

https://github.com/nitaiaharoni1/vector-storage

Vector Storage is a vector database that enables semantic similarity searches on text documents in the browser's local storage. It uses OpenAI embeddings to convert documents into vectors and allows searching for similar documents based on cosine similarity.

cosine-similarity embedding-vectors javascript local-storage localstorage lru-cache npm open-source openai semantic-search semantic-similarity typescript vector-database vector-db vector-search vector-similarity vector-similarity-database vector-similarity-search

Last synced: 16 May 2025

https://github.com/Hellisotherpeople/CX_DB8

A contextual, biasable, word-or-sentence-or-paragraph extractive summarizer powered by the latest in text embeddings (BERT, Universal Sentence Encoder, Flair)

contextual-summarization cuda debate-evidence embeddings extractive-summarization flair python semantic-search semantic-summarization summarization summarizer token-level-summarization universal-sentence-encoder

Last synced: 22 Nov 2024

https://github.com/fzliu/radient

Radient turns many data types (not just text) into vectors for similarity search, clustering, regression analysis, and more.

audio embeddings etl fraud-detection graphs image-search images milvus molecular-search molecules recommender-system retrieval-augmented-generation semantic-search similarity-search text unstructured-data-etl vector-database vectors

Last synced: 06 Apr 2025

https://github.com/intelligentnode/IntelliNode

Access the latest AI models like ChatGPT, LLaMA, Diffusion, Gemini, Hugging Face, and beyond through a unified prompt layer and performance evaluation

anthropic chatbot chatgpt claude dall-e embeddings gemini google-ai gpt-4 hugging-face image-generation language-model mistralai nodejs openai prompt-engineering semantic-search speech-synthesis vectors

Last synced: 04 Apr 2025

https://github.com/intelligentnode/intellinode

Access the latest AI models like ChatGPT, LLaMA, Diffusion, Gemini, Hugging Face, and beyond through a unified prompt layer and performance evaluation

anthropic chatbot chatgpt claude dall-e embeddings gemini google-ai gpt-4 hugging-face image-generation language-model mistralai nodejs openai prompt-engineering semantic-search speech-synthesis vectors

Last synced: 05 Apr 2025

https://github.com/Ravn-Tech/HyperTag

HyperTag - Intuitive Knowledge Management WebApp & CLI for Humans using Deep Learning & Tags

file filesystem image-retrieval images knowledge-management organization pdf search search-engine search-images search-text semantic-search semantic-similarity tagging tags

Last synced: 09 Dec 2024

https://github.com/do-me/semanticfinder

SemanticFinder - frontend-only live semantic search with transformers.js

ai codemirror semantic-search semanticsearch transformers

Last synced: 13 Apr 2025

https://github.com/do-me/SemanticFinder

SemanticFinder - frontend-only live semantic search with transformers.js

ai codemirror semantic-search semanticsearch transformers

Last synced: 14 Apr 2025

https://github.com/kuutsav/information-retrieval

Neural information retrieval / Semantic search / Bi-encoders

information-retrieval machine-learning nlp semantic-search

Last synced: 08 May 2025

https://github.com/doobidoo/mcp-memory-service

MCP server providing semantic memory and persistent storage capabilities for Claude using ChromaDB and sentence transformers.

chroma-db knowledge localstorage mcp memory semantic-search server tag-based time-based vector-database

Last synced: 10 Apr 2025

https://github.com/dmotz/emdash

📚🧙‍♂️ Wisdom indexer — use AI to organize text snippets so you can actually remember & learn from what you read

ai books ebook ebooks elm embeddings epub kindle kindle-clippings kindle-highlights literature ml nlp notes reading semantic-search

Last synced: 27 Jan 2025

https://github.com/mihaiii/semantic-autocomplete

A blazing-fast semantic search React component. Match by meaning, not just by letters. Search as you type without waiting (no debounce needed). Rank by cosine similarity.

cosine-similarity material-ui nlp nlp-machine-learning react semantic semantic-search

Last synced: 12 Apr 2025

https://github.com/DRSY/MoTIS

[NAACL 2022] Mobile Text-to-Image search powered by multimodal semantic representation models (e.g., OpenAI's CLIP)

ai clip cross-modal image-search ios-swift k-means k-means-clustering knn knowledge-distillation lsh naacl random-projection retrieval semantic-search vector-search

Last synced: 08 May 2025

https://github.com/deepset-ai/haystack-demos

Fully working applications that demonstrate how to use Haystack to implement various use cases

demo-app haystack haystack-ai llms nlp python question-answering rest-api semantic-search

Last synced: 05 Apr 2025

https://github.com/nomic-ai/semantic-search-app-template

Tutorial and template for a semantic search app powered by the Atlas Embedding Database, Langchain, OpenAI and FastAPI

fastapi openai react semantic-search tutorial

Last synced: 11 Apr 2025

https://github.com/transitive-bullshit/bens-bites-ai-search

AI search for all the best resources in AI – powered by Ben's Bites 💯

ai beehiiv ml newsletters search semantic-search

Last synced: 14 Apr 2025

https://github.com/ashvardanian/swiftsemanticsearch

Real-time on-device text-to-image and image-to-image Semantic Search with video stream camera capture using USearch & UForm AI Swift SDKs for Apple devices 🍏

coreml coreml-models image-search ios mobile-app ondeviceai onnx rag semantic-search swift swiftui vector-search vector-search-engine video-search

Last synced: 05 Apr 2025

https://github.com/foxminchan/LawKnowledge

A legal knowledge search and Q&A application based on Vietnam's Legal Code and legal document database ⚖️

generative-ai microservice natural-language-processing nlp nx searching semantic-search

Last synced: 10 May 2025

https://github.com/sinanuozdemir/oreilly-retrieval-augmented-gen-ai

See how to augment LLMs with real-time data for dynamic, context-aware apps - RAG + Agents + GraphRAG.

graphrag llms open-source rag semantic-search

Last synced: 05 Apr 2025

https://github.com/acheong08/vectordb

A simple vector database: Text encoding, semantic search, document storage

golang semantic-search vector-database

Last synced: 03 Apr 2025

https://github.com/cluebenchmark/qbqtc

QBQTC: a large-scale search matching dataset

chinese-dataset query search semantic-search semantic-similarity

Last synced: 05 May 2025

https://github.com/unmonoqueteclea/voilib

🎧 Podcast Search Engine. Try it now for free or run your own instance.

fastapi podcast search-engine semantic-search svelte

Last synced: 10 Apr 2025
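
Several of the entries above (for example vector-storage and semantic-autocomplete) describe ranking documents by the cosine similarity of their embeddings. The sketch below illustrates that general idea only; it is not taken from any listed project. The three-dimensional vectors are hand-written, hypothetical stand-ins for real embeddings, and the helper names `cosine_similarity` and `rank` are assumptions made for this example.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        # A zero vector has no direction; treat it as unrelated.
        return 0.0
    return dot / (norm_a * norm_b)


def rank(query_vec, docs):
    """Return document names ordered from most to least similar to the query.

    docs maps a document name to its (hypothetical) embedding vector.
    """
    scored = [(cosine_similarity(query_vec, vec), name) for name, vec in docs.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [name for _, name in scored]


# Toy embeddings: hypothetical values, not produced by any real model.
docs = {
    "vector databases": [0.9, 0.1, 0.0],
    "photo search": [0.1, 0.9, 0.2],
    "podcast search": [0.0, 0.3, 0.9],
}
query = [0.8, 0.2, 0.1]

ranking = rank(query, docs)
print(ranking)
```

In practice the vectors would come from an embedding model, and a large collection would typically use an approximate nearest-neighbor index in place of this linear scan.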