{"id":13526195,"url":"https://github.com/Felixgithub2017/CG-Eval","last_synced_at":"2025-04-01T06:31:25.531Z","repository":{"id":185759196,"uuid":"671356531","full_name":"Felixgithub2017/CG-Eval","owner":"Felixgithub2017","description":"Chinese Generation Evaluation","archived":false,"fork":false,"pushed_at":"2023-08-14T06:54:50.000Z","size":471,"stargazers_count":12,"open_issues_count":2,"forks_count":0,"subscribers_count":2,"default_branch":"main","last_synced_at":"2024-11-02T10:34:12.770Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":null,"has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Felixgithub2017.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null}},"created_at":"2023-07-27T06:19:28.000Z","updated_at":"2024-07-25T23:48:52.000Z","dependencies_parsed_at":"2024-01-13T22:55:04.159Z","dependency_job_id":"bbbb58c1-dbfe-428f-8b15-57f3f71a5bd1","html_url":"https://github.com/Felixgithub2017/CG-Eval","commit_stats":null,"previous_names":["felixgithub2017/cg-eval"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Felixgithub2017%2FCG-Eval","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Felixgithub2017%2FCG-Eval/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Felixgithub2017%2FCG-Eval/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Felixgithub2017%2FCG-Eval/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Felixgithub2017","download_url":"https://codeload.github.com/Fel
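ixgithub2017/CG-Eval/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":246596947,"owners_count":20802930,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-08-01T06:01:26.287Z","updated_at":"2025-04-01T06:31:24.859Z","avatar_url":"https://github.com/Felixgithub2017.png","language":null,"readme":"# CG-Eval\n![image](https://github.com/Felixgithub2017/CG-Eval/assets/26135691/713b8c0b-6c60-46fc-a4d0-fdd7902ab8a9)\n\n## Introduction to the Evaluation Dataset\nCG-Eval is a benchmark for the generation capabilities of Chinese large language models, jointly developed by the Besteasy (甲骨易) AI Research Institute and LanguageX AI Lab. In this test, the Chinese large language model under evaluation must give accurate and relevant answers to 11,000 questions of different types across 55 sub-subjects in six major subject categories: Science and Engineering, Humanities and Social Sciences, Mathematical Calculation, the Medical Practitioner Qualification Examination, the Judicial Examination, and the Certified Public Accountant Examination. We designed a composite scoring system. For non-calculation questions, every term-definition question and short-answer question has a standard reference answer; each answer is scored against multiple criteria and the scores are combined as a weighted sum. For calculation questions, we extract the final result and the solution process and score them together.\n\nThe dataset contains the following fields:\n大科目类别,子科目名称,题目类型, 题目编号,题目文本,题目答案的汉字长度,题目prompt\n\nIn English, these are: major subject category, sub-subject name, question type, question number, question text, length of the reference answer in Chinese characters, and question prompt.\n\n## Paper and Dataset Download\nCG-Eval paper: https://arxiv.org/abs/2308.04823\u003cbr\u003e\nCG-Eval test dataset download: https://huggingface.co/datasets/Besteasy/CG-Eval\u003cbr\u003e\nCG-Eval automated evaluation site: http://cgeval.besteasy.com/\u003cbr\u003e\n\n## Evaluation Method\nAfter downloading the dataset, query the model with the prompt in the `题目prompt` column and add a `回答` (answer) column to the CSV file to hold the model's reply. Make sure each answer stays aligned with its prompt, question number, and subject name. After collecting all answers, submit the CSV file to the evaluation website: \nhttp://cgeval.besteasy.com/\n\nThe CSV file you submit should have the following fields:\n\n大科目类别,子科目名称,题目类型, 题目编号,题目文本,题目答案的汉字长度,题目prompt,回答\n\nThe website calculates the score automatically, and you can choose whether to sync the score to the leaderboard.\n\n## Citation\nIf you find the code and test set useful in your research, please consider citing:\n```\n@misc{zeng2023evaluating,\ntitle={Evaluating the Generation Capabilities of Large Chinese Language Models},\nauthor={Hui Zeng and Jingyuan Xue and Meng Hao and Chen Sun and Bin Ning and Na 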
Zhang},\nyear={2023},\neprint={2308.04823},\narchivePrefix={arXiv},\nprimaryClass={cs.CL}\n}\n```\n\n## License\nThe CG-Eval dataset is licensed under a [Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License](http://creativecommons.org/licenses/by-nc-sa/4.0/).\n","funding_links":[],"categories":["A01_文本生成_文本对话","📏 评测基准"],"sub_categories":["大语言对话模型及数据","🧩 领域模型"],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FFelixgithub2017%2FCG-Eval","html_url":"https://awesome.ecosyste.ms/projects/github.com%2FFelixgithub2017%2FCG-Eval","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FFelixgithub2017%2FCG-Eval/lists"}