An open API service indexing awesome lists of open source software.

Projects in Awesome Lists tagged with document-retrieval

A curated list of projects in awesome lists tagged with document-retrieval.

https://github.com/chroma-core/chroma

The AI-native open-source embedding database

document-retrieval embeddings llms

Last synced: 05 May 2026

https://github.com/mintplex-labs/vector-admin

The universal tool suite for vector database management. Manage Pinecone, Chroma, Qdrant, Weaviate and more vector databases with ease.

ai ai-agents aitools chroma database-management document-retrieval embeddings flowise langchain langchain-js llms pinecone qdrant vector-data-management vector-database vector-database-embedding vector-search vectordatabase vectorspace weaviate

Last synced: 31 Oct 2025

https://github.com/Mintplex-Labs/vector-admin

The universal tool suite for vector database management. Manage Pinecone, Chroma, Qdrant, Weaviate and more vector databases with ease.

ai ai-agents aitools chroma database-management document-retrieval embeddings flowise langchain langchain-js llms pinecone qdrant vector-data-management vector-database vector-database-embedding vector-search vectordatabase vectorspace weaviate

Last synced: 24 Mar 2025

https://github.com/redis-developer/redis-arxiv-search

Vector search demo with the arXiv paper dataset, RedisVL, HuggingFace, OpenAI, Cohere, FastAPI, React, and Redis.

arxiv arxiv-papers cohere document-retrieval document-search huggingface machine-learning nlp openai react redis vector-database vector-search

Last synced: 17 Sep 2025

https://github.com/grafana/vectorapi

pgvector + embeddings API

document-retrieval embeddings llms pgvector

Last synced: 19 Oct 2025

https://github.com/syed007hassan/document-querying-with-vectordb

Document Querying with LLMs - Google PaLM API: Semantic Search With LLM Embeddings

chroma document-retrieval embeddings palm-api pdf-encoding vectordb

Last synced: 10 Apr 2025

https://github.com/xmpuspus/kb-arena

Benchmark 7 retrieval strategies on your own docs — naive vector, contextual, QnA pairs, knowledge graph, RAPTOR, PageIndex, and hybrid. Find which KB architecture fits your data.

benchmark chromadb cli document-retrieval evaluation graphrag hybrid-search knowledge-graph llm neo4j python rag rag-evaluation retrieval retrieval-augmented-generation vector-search

Last synced: 02 May 2026

https://github.com/aniketwdubey/chatpdf

This project is a Document Retrieval application that utilizes Retrieval-Augmented Generation (RAG) techniques to enable users to interact with uploaded PDF documents. By leveraging a Large Language Model (LLM), users can ask questions about the content of the documents and receive accurate answers based on the information retrieved.

chat-application document-retrieval fastapi huggingface large-language-models llm python rag retrieval-augmented-generation

Last synced: 11 Apr 2025

https://github.com/debanjansarkar/askdoc

The "ASKDOC" project combines Langchain, Azure, OpenAI models, and Python to deliver an intelligent question-answering system that scans your PDF documents and answers queries based on their contents. It can be queried in natural language.

artificial-intelligence azure-openai azure-openai-api chatbot document-retrieval faiss langchain langchain-python natural-language-processing natural-language-understanding pdf-document-query python3

Last synced: 10 Oct 2025

https://github.com/boudinfl/redefining-absent-keyphrases

Code and dataset for the paper "Redefining Absent Keyphrases and their Effect on Retrieval Effectiveness"

absent-keyphrases digital-library document-retrieval information-retrieval keyphrase-generation retrieval-effectiveness

Last synced: 13 Apr 2025

https://github.com/subhangisati/langchat-explorer

"LangChat Explorer: Your intuitive document companion. Effortlessly explore vast information with natural language conversations. Simplify queries, gain insights, and embark on a seamless journey of knowledge discovery. Unleash the power of language with LangChat Explorer."

api deep-learning document-retrieval generative-ai llms machine-learning pdf-document-processor python3 q-and-a-bot

Last synced: 04 Apr 2026

https://github.com/timothyckl/iota

A minimal local embedding database.

document-retrieval embeddings python vector-database vector-search

Last synced: 14 Jan 2026

https://github.com/mixpeek/multimodal-benchmarks

Open evaluation suite for multimodal retrieval systems with benchmarks for financial documents, medical devices, and educational content

benchmark document-retrieval embeddings evaluation hybrid-search information-retrieval multimodal-retrieval nlp ocr rag semantic-search table-extraction vector-search

Last synced: 12 Mar 2026

https://github.com/eztakesin/llm-mcp-gateway-rs

Rust LLM gateway compatible with the OpenAI Responses API, integrating MCP (local Docs/PDF KB + extensible tool/database MCP servers) with a tool-loop, and compatible with Big-AGI.

ai-gateway axum database document-retrieval knowledge-base llm mcp model-context-protocol openai-api openai-compatible oracle pdf rag responses-api reverse-proxy rust sqlite sse stdio tokio

Last synced: 09 Jun 2026

https://github.com/aadityarajgupta/aethercare_platform

AetherCare is an AI-powered healthcare platform that leverages Generative AI to assist users with medical inquiries, symptom-based disease prediction, hospital location services, and a knowledge repository for healthcare education. This project aims to enhance accessibility to healthcare information.

ai-chatbot blog-article document-retrieval flask generative-ai healthcare hospital-finder langchain-python machine-learning pinecone speech-to-text svc-model transformers

Last synced: 18 Sep 2025

https://github.com/lh0x00/embs

embs is a Python toolkit for retrieving documents (via Docsifer), generating embeddings (via Lightweight Embeddings API), and ranking texts with an optional caching system.

docsifer document-retrieval embeddings embs markitdown openai rag ranking

Last synced: 08 May 2026

https://github.com/phreakyphoenix/graphlab-ml-projects

These individual projects are part of the coursework for the Coursera University of Washington course. I've also tried some fun new experiments with the data. Have fun checking them out!

coursera deep-learning document-retrieval image-classification image-retrieval machine-learning prediction-model python27 sentiment-analysis song-recommender

Last synced: 29 Apr 2026

https://github.com/aadityarajgupta/aethercare_chatbot

This repository contains a healthcare-based chatbot project that integrates advanced generative AI techniques with document retrieval for answering medical queries. It leverages vector-based search for relevant information retrieval and uses transformer-based models for generating responses.

ai-chatbot document-retrieval flask generative-ai healthcare langchain-python machine-learning nlp pinecone transformers

Last synced: 29 Apr 2026

https://github.com/huacenxu/embedding-models-for-ai-retrieval

This project develops a domain-specific embedding model to enhance document retrieval in AI-powered search systems. It incorporates techniques like synthetic data generation, model fine-tuning, and vector search using FAISS, evaluated with MRR@5 for performance.

document-retrieval embedding-models faiss machine-learning mrr nlp reallifeproject semantic-search

Last synced: 12 Mar 2025

https://github.com/prasannaghimiree/ai-powered-documentation-retrieval-assistant-for-open-source-projects

An AI tool that uses the Ollama Mistral LLM for understanding and summarizing code, with Ollama all-MiniLM-v3 embeddings stored in a ChromaDB vector database. It provides automatic documentation summaries, answers questions, and generates contribution guides to simplify onboarding and boost productivity for open-source developers.

chat-app chat-application chatapp chatbot chatbots chatgpt document-retrieval langchain langchain-python mistral mistral-7b ollama ollama-api rag

Last synced: 09 Apr 2026