https://github.com/IAAR-Shanghai/CRUD_RAG

CRUD-RAG: A Comprehensive Chinese Benchmark for Retrieval-Augmented Generation of Large Language Models

benchmark large-language-models retrieval-augmented-generation

Last synced: about 1 year ago

Awesome-LLMs-Datasets - https://github.com/IAAR-Shanghai/CRUD_RAG
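StarryDivineSky - IAAR-Shanghai/CRUD_RAG - RAG: a comprehensive Chinese benchmark for retrieval-augmented generation of large language models. The project fully supports evaluation of Chinese RAG systems, including native Chinese datasets, evaluation tasks, and baseline models; it covers the CRUD (Create, Read, Update, Delete) operations, which are used to assess a RAG system's ability to add, reduce, and correct information and to answer questions based on retrieved information; it contains 36,166 test samples, the largest number among available Chinese RAG test sets; it supports multiple evaluation metrics, including ROUGE, BLEU, bertScore, and RAGQuestEval, and provides one-click evaluation. (A01_Text Generation_Text Dialogue / Large Language Dialogue Models and Data)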
