An open API service indexing awesome lists of open source software.

Projects in Awesome Lists tagged with rag-evaluation

A curated list of projects in awesome lists tagged with rag-evaluation.

https://github.com/marker-inc-korea/autorag

AutoRAG: An Open-Source Framework for Retrieval-Augmented Generation (RAG) Evaluation & Optimization with AutoML-Style Automation

analysis automl benchmarking document-parser embeddings evaluation llm llm-evaluation llm-ops open-source ops optimization pipeline python qa rag rag-evaluation retrieval-augmented-generation

Last synced: 03 Apr 2026

https://github.com/agenta-ai/agenta

The open-source LLMOps platform: prompt playground, prompt management, LLM evaluation, and LLM observability all in one place.

agents evaluation llm-as-a-judge llm-evaluation llm-framework llm-monitoring llm-observability llm-platform llm-playground llm-tools llmops observability prompt-engineering prompt-management rag-evaluation

Last synced: 11 Mar 2026

https://github.com/Agenta-AI/agenta

The all-in-one LLM developer platform: prompt management, evaluation, human feedback, and deployment all in one place.

human-annotation langchain large-language-models llama-index llm llm-evaluation llm-framework llm-tools llmops llms prompt-engineering prompt-management prompt-toolkit rag rag-evaluation

Last synced: 13 Mar 2025

https://github.com/oztrkoguz/rag-framework-evaluation

This project aims to compare different Retrieval-Augmented Generation (RAG) frameworks in terms of speed and performance.

autogen autogen-rag crewai crewai-rag langchain langchain-rag llamaindex llamaindex-rag rag rag-evaluation swarms swarms-rag

Last synced: 20 Mar 2025

https://github.com/xmpuspus/kb-arena

Benchmark 7 retrieval strategies on your own docs — naive vector, contextual, QnA pairs, knowledge graph, RAPTOR, PageIndex, and hybrid. Find which KB architecture fits your data.

benchmark chromadb cli document-retrieval evaluation graphrag hybrid-search knowledge-graph llm neo4j python rag rag-evaluation retrieval retrieval-augmented-generation vector-search

Last synced: 02 May 2026

https://github.com/shaadclt/evalrag

A comprehensive evaluation toolkit for assessing Retrieval-Augmented Generation (RAG) outputs using linguistic, semantic, and fairness metrics.

rag rag-evaluation

Last synced: 22 Jul 2025

https://github.com/kaos599/betterrag

BetterRAG: Powerful RAG evaluation toolkit for LLMs. Measure, analyze, and optimize how your AI processes text chunks with precision metrics. Perfect for RAG systems, document processing, and embedding quality assessment.

chunking-optimization embeddings embeddings-extraction embeddings-optimization evaluation evaluation-framework optimization rag rag-application rag-evaluation rag-optimization

Last synced: 05 May 2026

https://github.com/unshdee/proofrag

Point your agent at your docs and your RAG app; get a golden test set + an LLM-as-judge & retrieval scorecard, in one command.

agent-skills ci claude claude-code codex evaluation llm llm-as-judge python rag rag-evaluation retrieval

Last synced: 01 Jun 2026

https://github.com/keitabroadwater/llm-eval-lab

A web sandbox for hands-on learning of LLM and RAG evaluation.

evaluation-framework fastapi gpt4 llm-evaluation llmops nextjs rag-evaluation ragas

Last synced: 19 Apr 2026

https://github.com/alexmartin1722/mirage

A framework for evaluating any-modality-to-text generation and multimodal RAG.

multimodal multimodal-rag multimodal-summarization rag rag-evaluation

Last synced: 14 May 2026