{"id":13628911,"url":"https://github.com/chen700564/RGB","last_synced_at":"2025-04-17T04:32:41.643Z","repository":{"id":192966526,"uuid":"686916495","full_name":"chen700564/RGB","owner":"chen700564","description":null,"archived":false,"fork":false,"pushed_at":"2024-05-17T07:48:34.000Z","size":12053,"stargazers_count":276,"open_issues_count":16,"forks_count":25,"subscribers_count":1,"default_branch":"master","last_synced_at":"2024-11-08T19:42:46.493Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/chen700564.png","metadata":{"files":{"readme":"readme.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE.txt","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-09-04T08:12:45.000Z","updated_at":"2024-11-08T01:44:00.000Z","dependencies_parsed_at":"2024-01-14T13:19:08.664Z","dependency_job_id":"f1be8339-9b7c-451f-a4ee-26a58711a3cd","html_url":"https://github.com/chen700564/RGB","commit_stats":null,"previous_names":["chen700564/rgb"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/chen700564%2FRGB","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/chen700564%2FRGB/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/chen700564%2FRGB/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/chen700564%2FRGB/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/chen700564","download_url":"https://codeload.github.com/chen700564/RGB/tar.gz/refs/heads/master","host":{
"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":249316005,"owners_count":21249873,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-08-01T22:00:59.387Z","updated_at":"2025-04-17T04:32:36.633Z","avatar_url":"https://github.com/chen700564.png","language":"Python","funding_links":[],"categories":["Retrieval Augmented Generation (RAG) Datasets \u003ca id=\"retrieval-augmented-generation-rag-datasets\"\u003e\u003c/a\u003e","Datasets-or-Benchmark","A01_文本生成_文本对话","Evaluation Frameworks","Evaluation Metrics and Benchmarks"],"sub_categories":["Evaluation Datasets \u003ca id=\"evaluation02\"\u003e\u003c/a\u003e","RAG检索增强生成评估","大语言对话模型及数据","Vector Store Tutorials","Comparison Guides"],"readme":"# RGB\n\n- An implementation for [Benchmarking Large Language Models in Retrieval-Augmented Generation](https://arxiv.org/abs/2309.01431) \n\n## News\n\n- \\[2024/03\\]  We refine the retrieved documents and some answers of `en.json` and `zh.json`, and name the new data files as `en_refine.json` and `zh_refine.json`. 
\n\n## Quick links\n\n* [Environment](#Environment)\n* [Retrieval-Augmented Generation Benchmark](#Retrieval-Augmented-Generation-Benchmark)\n* [Evaluation](#Evaluation)\n* [License](#License)\n\n### Environment\n\n```bash\nconda create -n rgb python=3.10.0\nconda activate rgb\nbash env.sh\n```\n\n### Retrieval-Augmented Generation Benchmark\n\nThe data is in `data/`:\n\n```text\ndata/\n├── en.json\n├── en_refine.json\n├── en_int.json\n├── en_fact.json\n├── zh.json\n├── zh_refine.json\n├── zh_int.json\n└── zh_fact.json\n```\n\nTo evaluate Information Integration, use `zh_int` for Chinese questions or `en_int` for English questions.\n\nTo evaluate Counterfactual Robustness, use `zh_fact` for Chinese questions or `en_fact` for English questions.\n\n#### The refined data\n\nWe refined the retrieved documents and some answers in `en.json` and `zh.json`, and saved the new data files as `en_refine.json` and `zh_refine.json`. The changes are:\n\n+ Removing incorrect positive and negative documents.\n\n+ Adding some positive documents.\n\n+ Correcting some inaccurate answers.\n\n### Evaluation\n\nTo evaluate ChatGPT, run:\n\n```bash\npython evalue.py \\\n--dataset en \\\n--modelname chatgpt \\\n--temp 0.2 \\\n--noise_rate 0.6 \\\n--api_key YourAPIKEY \\\n--passage_num 5\n```\n\nTo evaluate other models, run:\n\n```bash\npython evalue.py \\\n--dataset en \\\n--modelname chatglm2-6b \\\n--temp 0.2 \\\n--noise_rate 0.6 \\\n--plm THUDM/chatglm-6b \\\n--passage_num 5\n```\n\nChange `modelname` and `plm` for different models, where `plm` is the path to the model.\n\n`temp` is the temperature of the model.\n\n`noise_rate` is the rate of noisy documents in the input.\n\n`passage_num` is the number of documents provided to the LLM (default is 5).\n\nThe outputs are:\n\n+ all_rate: the accuracy (noise_rate\u003c1) or the rejection rate (noise_rate=1)\n+ fact_check_rate: the error detection rate (ED)\n\n---\n\nTo evaluate rejection using ChatGPT, you should first run 
`evalue.py` with `noise_rate=1` to obtain the generation results, and then run:\n\n```bash\npython reject_evalue.py \\\n--dataset en \\\n--modelname chatglm2-6b \\\n--api_key YourAPIKEY\n```\n\nThe `reject_rate` in the output is the rejection rate (Rej\\*).\n\n---\n\nTo evaluate counterfactual robustness using ChatGPT, you should first run `evalue.py` with `dataset=en_fact` or `dataset=zh_fact` to obtain the generation results, and then run:\n\n```bash\npython fact_evalue.py \\\n--dataset en_fact \\\n--modelname chatglm2-6b \\\n--api_key YourAPIKEY\n```\n\nThe `reject_rate` in the output is the error detection rate (ED\\*). The `correct_rate` in the output is the error correction rate (CR).\n\n## License\n\nThe code and data are released under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International Public License for noncommercial use only. Any commercial use requires formal permission first.\n\nShield: [![CC BY-NC-SA 4.0][cc-by-nc-sa-shield]][cc-by-nc-sa]\n\nThis work is licensed under a\n[Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License][cc-by-nc-sa].\n\n[![CC BY-NC-SA 4.0][cc-by-nc-sa-image]][cc-by-nc-sa]\n\n[cc-by-nc-sa]: http://creativecommons.org/licenses/by-nc-sa/4.0/\n[cc-by-nc-sa-image]: https://licensebuttons.net/l/by-nc-sa/4.0/88x31.png\n[cc-by-nc-sa-shield]: https://img.shields.io/badge/License-CC%20BY--NC--SA%204.0-lightgrey.svg\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fchen700564%2FRGB","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fchen700564%2FRGB","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fchen700564%2FRGB/lists"}