Projects in Awesome Lists tagged with document-retrieval
A curated list of projects in awesome lists tagged with document-retrieval.
https://github.com/chroma-core/chroma
the AI-native open-source embedding database
document-retrieval embeddings llms
Last synced: 05 May 2026
https://github.com/vearch/vearch
Distributed vector search for AI-native applications
ai-native ai-native-database cloud-native document-retrieval embeddings hybrid-search rag retrieval-augmented-generation vector-database vector-search vectors
Last synced: 13 May 2025
https://github.com/mintplex-labs/vector-admin
The universal tool suite for vector database management. Manage Pinecone, Chroma, Qdrant, Weaviate and more vector databases with ease.
ai ai-agents aitools chroma database-management document-retrieval embeddings flowise langchain langchain-js llms pinecone qdrant vector-data-management vector-database vector-database-embedding vector-search vectordatabase vectorspace weaviate
Last synced: 31 Oct 2025
https://github.com/openbmb/visrag
Parsing-free RAG supported by VLMs
document-retrieval document-understanding multi-modal multi-modality rag retrieval retrieval-augmented-generation vision-language-model
Last synced: 05 Oct 2025
https://github.com/redis-developer/redis-arxiv-search
Vector search demo with the arXiv paper dataset, RedisVL, HuggingFace, OpenAI, Cohere, FastAPI, React, and Redis.
arxiv arxiv-papers cohere document-retrieval document-search huggingface machine-learning nlp openai react redis vector-database vector-search
Last synced: 17 Sep 2025
https://github.com/grafana/vectorapi
pgvector + embeddings API
document-retrieval embeddings llms pgvector
Last synced: 19 Oct 2025
https://github.com/syed007hassan/document-querying-with-vectordb
Document Querying with LLMs - Google PaLM API: Semantic Search With LLM Embeddings
chroma document-retrieval embeddings palm-api pdf-encoding vectordb
Last synced: 10 Apr 2025
https://github.com/xmpuspus/kb-arena
Benchmark 7 retrieval strategies on your own docs — naive vector, contextual, QnA pairs, knowledge graph, RAPTOR, PageIndex, and hybrid. Find which KB architecture fits your data.
benchmark chromadb cli document-retrieval evaluation graphrag hybrid-search knowledge-graph llm neo4j python rag rag-evaluation retrieval retrieval-augmented-generation vector-search
Last synced: 02 May 2026
https://github.com/aniketwdubey/chatpdf
This project is a Document Retrieval application that utilizes Retrieval-Augmented Generation (RAG) techniques to enable users to interact with uploaded PDF documents. By leveraging a Large Language Model (LLM), users can ask questions about the content of the documents and receive accurate answers based on the information retrieved.
chat-application document-retrieval fastapi huggingface large-language-models llm python rag retrieval-augmented-generation
Last synced: 11 Apr 2025
https://github.com/debanjansarkar/askdoc
The Intelligent "ASKDOC" project combines the power of LangChain, Azure, OpenAI models, and Python to deliver an intelligent question-answering system that scans your PDF documents and answers queries based on their contents. It can be queried in natural language.
artificial-intelligence azure-openai azure-openai-api chatbot document-retrieval faiss langchain langchain-python natural-language-processing natural-language-understanding pdf-document-query python3
Last synced: 10 Oct 2025
https://github.com/boudinfl/redefining-absent-keyphrases
Code and dataset for the paper "Redefining Absent Keyphrases and their Effect on Retrieval Effectiveness"
absent-keyphrases digital-library document-retrieval information-retrieval keyphrase-generation retrieval-effectiveness
Last synced: 13 Apr 2025
https://github.com/subhangisati/langchat-explorer
"LangChat Explorer: Your intuitive document companion. Effortlessly explore vast information with natural language conversations. Simplify queries, gain insights, and embark on a seamless journey of knowledge discovery. Unleash the power of language with LangChat Explorer."
api deep-learning document-retrieval generative-ai llms machine-learning pdf-document-processor python3 q-and-a-bot
Last synced: 04 Apr 2026
https://github.com/timothyckl/iota
a minimal local embedding database.
document-retrieval embeddings python vector-database vector-search
Last synced: 14 Jan 2026
https://github.com/mixpeek/multimodal-benchmarks
Open evaluation suite for multimodal retrieval systems with benchmarks for financial documents, medical devices, and educational content
benchmark document-retrieval embeddings evaluation hybrid-search information-retrieval multimodal-retrieval nlp ocr rag semantic-search table-extraction vector-search
Last synced: 12 Mar 2026
https://github.com/md-emon-hasan/retrieval-augmented-generation-rag
RAG enhances LLMs by retrieving relevant external knowledge before generating responses, improving accuracy and reducing hallucinations.
ai-chatbot chromadb custom-llm document-retrieval embedding-models faiss huggingface-rag knowledge-augmented-llm knowledge-graph langchain-rag llm-applications llm-retrieval multi-modal-rag prompt-engineering rag-pipeline retrieval-augmented-generation retrieval-qa semantic-search text-embedding vector-search
Last synced: 02 Mar 2025
https://github.com/eztakesin/llm-mcp-gateway-rs
Rust LLM gateway compatible with the OpenAI Responses API, integrating MCP (local Docs/PDF KB + extensible tool/database MCP servers) with a tool-loop, and compatible with Big-AGI.
ai-gateway axum database document-retrieval knowledge-base llm mcp model-context-protocol openai-api openai-compatible oracle pdf rag responses-api reverse-proxy rust sqlite sse stdio tokio
Last synced: 09 Jun 2026
https://github.com/aadityarajgupta/aethercare_platform
AetherCare is an AI-powered healthcare platform that leverages Generative AI to assist users with medical inquiries, symptom-based disease prediction, hospital location services, and a knowledge repository for healthcare education. This project aims to enhance accessibility to healthcare information.
ai-chatbot blog-article document-retrieval flask generative-ai healthcare hospital-finder langchain-python machine-learning pinecone speech-to-text svc-model transformers
Last synced: 18 Sep 2025
https://github.com/lh0x00/embs
embs is a Python toolkit for retrieving documents (via Docsifer), generating embeddings (via Lightweight Embeddings API), and ranking texts with an optional caching system.
docsifer document-retrieval embeddings embs markitdown openai rag ranking
Last synced: 08 May 2026
https://github.com/phreakyphoenix/graphlab-ml-projects
These individual projects are part of the coursework for the Coursera University of Washington course. I've also tried some fun new experiments with the data. Have fun checking them out!
coursera deep-learning document-retrieval image-classification image-retrieval machine-learning prediction-model python27 sentiment-analysis song-recommender
Last synced: 29 Apr 2026
https://github.com/aadityarajgupta/aethercare_chatbot
This repository contains a healthcare-based chatbot project that integrates advanced generative AI techniques with document retrieval for answering medical queries. It leverages vector-based search for relevant information retrieval and uses transformer-based models for generating responses.
ai-chatbot document-retrieval flask generative-ai healthcare langchain-python machine-learning nlp pinecone transformers
Last synced: 29 Apr 2026
https://github.com/huacenxu/embedding-models-for-ai-retrieval
This project develops a domain-specific embedding model to enhance document retrieval in AI-powered search systems. It incorporates techniques like synthetic data generation, model fine-tuning, and vector search using FAISS, evaluated with MRR@5 for performance.
document-retrieval embedding-models faiss machine-learning mrr nlp reallifeproject semantic-search
Last synced: 12 Mar 2025
https://github.com/prasannaghimiree/ai-powered-documentation-retrieval-assistant-for-open-source-projects
An AI tool that uses the Ollama Mistral LLM for understanding and summarizing code, with Ollama all-MiniLM-v3 embeddings stored in a ChromaDB vector database. It provides automatic documentation summaries, answers questions, and generates contribution guides to simplify onboarding and boost productivity for open-source developers.
chat-app chat-application chatapp chatbot chatbots chatgpt document-retrieval langchain langchain-python mistral mistral-7b ollama ollama-api rag
Last synced: 09 Apr 2026