{"id":13631850,"url":"https://github.com/WisdomShell/codeshell","last_synced_at":"2025-04-17T22:32:16.233Z","repository":{"id":200338778,"uuid":"694958306","full_name":"WisdomShell/codeshell","owner":"WisdomShell","description":"A series of code large language models developed by PKU-KCL","archived":false,"fork":false,"pushed_at":"2024-07-18T10:20:27.000Z","size":1700,"stargazers_count":1613,"open_issues_count":46,"forks_count":120,"subscribers_count":25,"default_branch":"main","last_synced_at":"2024-10-29T15:36:06.974Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"http://se.pku.edu.cn/kcl","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/WisdomShell.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"License.pdf","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-09-22T03:43:58.000Z","updated_at":"2024-10-24T08:30:58.000Z","dependencies_parsed_at":"2024-02-28T10:41:42.031Z","dependency_job_id":"1dcd46e8-4f40-409a-b2a0-5a0e8594bda6","html_url":"https://github.com/WisdomShell/codeshell","commit_stats":null,"previous_names":["wisdomshell/codeshell"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/WisdomShell%2Fcodeshell","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/WisdomShell%2Fcodeshell/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/WisdomShell%2Fcodeshell/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/WisdomShell%2Fcodeshell/manifests","owner_url":"https://repos.ecosyst
e.ms/api/v1/hosts/GitHub/owners/WisdomShell","download_url":"https://codeload.github.com/WisdomShell/codeshell/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":223768663,"owners_count":17199357,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-08-01T22:02:40.916Z","updated_at":"2024-11-08T23:31:44.154Z","avatar_url":"https://github.com/WisdomShell.png","language":"Python","readme":"\n\u003cp align=\"center\"\u003e\n    \u003cimg src=\"https://cdn-uploads.huggingface.co/production/uploads/6489a27bd0b2fd1f3297e5ca/3LQsqRzluBhBN2DipN6Ox.png\" width=\"400\"/\u003e\n\u003cp\u003e\n\n\u003cp align=\"center\"\u003e\n  🤗 \u003ca href=\"https://huggingface.co/WisdomShell\" target=\"_blank\"\u003eHugging Face\u003c/a\u003e • 🤖 \u003ca href=\"https://modelscope.cn/organization/WisdomShell\" target=\"_blank\"\u003eModelScope\u003c/a\u003e • ⭕️ \u003ca href=\"https://www.wisemodel.cn/models/WisdomShell/CodeShell-7B\" target=\"_blank\"\u003eWiseModel\u003c/a\u003e • 🌐 \u003ca href=\"http://se.pku.edu.cn/kcl/\" target=\"_blank\"\u003ePKU-KCL\u003c/a\u003e \n\u003c/p\u003e\n\n\u003cdiv align=\"center\"\u003e\n\n[![license](https://img.shields.io/github/license/modelscope/modelscope.svg)](https://github.com/WisdomShell/codeshell/blob/main/License.pdf)\n\u003ch4 align=\"center\"\u003e\n    \u003cp\u003e\u003ca href=\"https://github.com/WisdomShell/codeshell/blob/main/README.md\"\u003e\u003cb\u003e中文\u003c/b\u003e\u003c/a\u003e|\u003ca 
href=\"https://github.com/WisdomShell/codeshell/blob/main/README_EN.md\"\u003eEnglish\u003c/a\u003e\u003c/p\u003e\n\u003c/h4\u003e\n\u003c/div\u003e\n\n## Introduction\n\nCodeShell is a multilingual code foundation LLM developed by the [Knowledge Computing Lab of Peking University](http://se.pku.edu.cn/kcl/) together with the AI team of Sichuan Tianfu Bank. CodeShell has 7 billion parameters, was trained on 500 billion tokens, and has a context window of 8192. On the authoritative code-evaluation benchmarks HumanEval and MBPP, CodeShell achieves the best performance among models of the same scale. We also provide deployment solutions and IDE plugins that accompany CodeShell; see the [CodeShell](https://github.com/WisdomShell/codeshell) repository. For the convenience of users in China, corresponding versions have also been uploaded to [ModelScope](https://modelscope.cn/organization/WisdomShell) and [WiseModel](https://www.wisemodel.cn/models/WisdomShell/CodeShell-7B/).\n\nThe models open-sourced in this release are:\n\n- \u003ca href=\"https://huggingface.co/WisdomShell/CodeShell\" target=\"_blank\"\u003e\u003cb\u003eCodeShell Base\u003c/b\u003e\u003c/a\u003e: the CodeShell foundation model, with strong base coding ability.\n- \u003ca href=\"https://huggingface.co/WisdomShell/CodeShell-Chat\" target=\"_blank\"\u003e\u003cb\u003eCodeShell Chat\u003c/b\u003e\u003c/a\u003e: the CodeShell chat model, with excellent performance on downstream tasks such as code Q\u0026A and code completion.\n- \u003ca href=\"https://huggingface.co/WisdomShell/CodeShell-Chat-int4\" target=\"_blank\"\u003e\u003cb\u003eCodeShell Chat 4bit\u003c/b\u003e\u003c/a\u003e: a 4-bit quantized version of the CodeShell chat model that preserves model quality while using less memory and running faster.\n- \u003ca href=\"https://github.com/WisdomShell/llama_cpp_for_codeshell\" target=\"_blank\"\u003e\u003cb\u003eCodeShell CPP\u003c/b\u003e\u003c/a\u003e: a C/C++ version of the CodeShell chat model, for developers whose personal computers have no GPU. Note that the C/C++ version also supports quantization, so CodeShell can run on machines with as little as 8 GB of RAM.\n\n## Main Characteristics of CodeShell\n\n- **Strong performance**: CodeShell achieves the best results among 7B code foundation models on HumanEval and MBPP.\n- **Complete ecosystem**: besides the model itself, IDE plugins (VS Code and JetBrains) are open-sourced, forming a full open-source stack.\n- **Lightweight deployment**: local C++ deployment is supported, providing a fast, lightweight local coding-assistant solution.\n- **Comprehensive evaluation**: a multi-task evaluation suite supporting full project context and covering common development activities such as code generation, defect detection and repair, and test-case generation (to be open-sourced soon).\n- **Efficient training**: built on an efficient data-governance pipeline, CodeShell reaches excellent performance after training on only 500 billion tokens from a complete cold start.\n\n## Performance\n\nWe evaluated the model on the two most popular code benchmarks, HumanEval and MBPP. Compared with CodeLlama and StarCoder, the two strongest 7B code LLMs to date, CodeShell achieves the best overall results. Detailed results are listed below.\n\n| Task | CodeShell-7b | CodeLlama-7b | Starcoder-7b |\n| ------- | --------- | 
--------- | --------- |\n| humaneval\t | **34.32** | 29.44 | 27.80 |\n| mbpp\t\t | **38.65** | 37.60 | 34.16 |\n| multiple-js\t | **33.17** | 31.30 | 27.02 |\n| multiple-java\t | **30.43** | 29.24 | 24.30 |\n| multiple-cpp\t | **28.21** | 27.33 | 23.04 |\n| multiple-swift | 24.30 | **25.32** | 15.70 |\n| multiple-php\t | **30.87** | 25.96 | 22.11 |\n| multiple-d\t | 8.85 | **11.60** | 8.08 |\n| multiple-jl\t | 22.08 | **25.28** | 22.96 |\n| multiple-lua\t | 22.39 | **30.50** | 22.92 |\n| multiple-r\t | **20.52** | 18.57 | 14.29 |\n| multiple-rkt\t | **17.20** | 12.55 | 10.43 |\n| multiple-rs\t | 24.55 | **25.90** | 22.82 |\n\n## Requirements\n\n- python 3.8 and above\n- pytorch 2.0 and above are recommended\n- transformers 4.32 and above\n- CUDA 11.8 and above are recommended (this is for GPU users, flash-attention users, etc.)\n\n## Quickstart\n\nThe CodeShell series has been uploaded to \u003ca href=\"https://huggingface.co/WisdomShell/CodeShell\" target=\"_blank\"\u003eHugging Face\u003c/a\u003e, so developers can load CodeShell and CodeShell-Chat directly through Transformers.\n\nBefore starting, make sure your environment is set up correctly, the required packages are installed, and the requirements of the previous section are met. You can install the dependencies with:\n\n```\npip install -r requirements.txt\n```\n\nYou can then use CodeShell through Transformers.\n\n### Code Generation\n\nDevelopers can use CodeShell to generate code quickly and speed up development.\n\n```python\nimport torch\nfrom transformers import AutoModelForCausalLM, AutoTokenizer\n\ndevice = 'cuda' if torch.cuda.is_available() else 'cpu'\ntokenizer = AutoTokenizer.from_pretrained(\"WisdomShell/CodeShell-7B\")\nmodel = AutoModelForCausalLM.from_pretrained(\"WisdomShell/CodeShell-7B\", trust_remote_code=True, torch_dtype=torch.bfloat16).to(device)\ninputs = tokenizer('def merge_sort():', return_tensors='pt').to(device)\noutputs = model.generate(**inputs)\nprint(tokenizer.decode(outputs[0]))\n```\n\n- Fill in the Middle\n\nCodeShell supports the Fill-in-the-Middle (FIM) mode, which better matches real software-development workflows.\n\n```python\ninput_text = \"\u003cfim_prefix\u003edef print_hello_world():\\n    \u003cfim_suffix\u003e\\n    print('Hello world!')\u003cfim_middle\u003e\"\ninputs = 
tokenizer(input_text, return_tensors='pt').to(device)\noutputs = model.generate(**inputs)\nprint(tokenizer.decode(outputs[0]))\n```\n\n- Code Q\u0026A\n\nCodeShell also open-sources the code-assistant model CodeShell-7B-Chat; developers can interact with it using the code below.\n\n```python\nmodel = AutoModelForCausalLM.from_pretrained('WisdomShell/CodeShell-7B-Chat', trust_remote_code=True, torch_dtype=torch.bfloat16).to(device)\ntokenizer = AutoTokenizer.from_pretrained('WisdomShell/CodeShell-7B-Chat')\n\nhistory = []\nquery = '你是谁?'\nresponse = model.chat(query, history, tokenizer)\nprint(response)\nhistory.append((query, response))\n\nquery = '用Python写一个HTTP server'\nresponse = model.chat(query, history, tokenizer)\nprint(response)\nhistory.append((query, response))\n```\n\nDevelopers can also interact with CodeShell-7B-Chat through the VS Code and JetBrains plugins; for details see the [VS Code plugin repository](https://github.com/WisdomShell/codeshell-vscode) and the [IntelliJ plugin repository](https://github.com/WisdomShell/codeshell-intellij).\n\n- Model Quantization\n\nCodeShell supports 4-bit/8-bit quantization. After 4-bit quantization the model occupies roughly 6 GB of GPU memory, so CodeShell can be used on GPUs with limited memory.\n\n```python\nmodel = AutoModelForCausalLM.from_pretrained('WisdomShell/CodeShell-7B-Chat-int4', trust_remote_code=True).to(device)\ntokenizer = AutoTokenizer.from_pretrained('WisdomShell/CodeShell-7B-Chat-int4')\n```\n\n- CodeShell in C/C++\n\nSince most personal computers have no GPU, CodeShell also provides C/C++ inference support. Developers can compile and use it in their local environment; see the [CodeShell C/C++ local version](https://github.com/WisdomShell/llama_cpp_for_codeshell).\n\n## Demo\n\nWe provide four kinds of demos: Web UI, command line, API, and IDE.\n\n### Web UI\n\nStart the web service with the command below; once running, it can be accessed at `http://127.0.0.1:8000`.\n\n```\npython demos/web_demo.py\n```\n\n### CLI Demo\n\nWe also provide an interactive command-line demo, which can be run with:\n\n```\npython demos/cli_demo.py\n```\n\n### API\n\nCodeShell also offers an OpenAI-API-compatible deployment.\n\n```\npython demos/openai_api.py\n```\n\nOnce the server is started, you can interact with CodeShell over HTTP.\n\n```\ncurl http://127.0.0.1:8000/v1/chat/completions \\\n  -H \"Content-Type: application/json\" \\\n  -d '{\n    \"model\": \"CodeShell-7B-Chat\",\n    \"messages\": [\n      {\n        \"role\": \"user\",\n        \"content\": \"你好\"\n      }\n    
]\n  }'\n```\n\n### IDE\n\nFinally, CodeShell provides IDE integration: developers can perform code completion, code Q\u0026A, and related operations from within their IDE. The IDE plugins are also released, and developers can install and use them locally. Plugin questions are welcome in the [VS Code plugin repository](https://github.com/WisdomShell/codeshell-vscode) and the [IntelliJ plugin repository](https://github.com/WisdomShell/codeshell-intellij).\n\n## Model Details\n\nCodeShell uses GPT-2 as its base architecture and adopts techniques such as Grouped-Query Attention and RoPE relative position encoding.\n\n### Hyper-parameter\n\n| Hyper-parameter | Value |\n|---|---|\n| n_layer | 42 |\n| n_embd | 4096 |\n| n_inner | 16384 |\n| n_head | 32 |\n| num_query_groups | 8 |\n| seq-length | 8192 |\n| vocab_size | 70144 |\n\n### Data\n\nCodeShell was trained on GitHub data crawled by the team, the Stack and StarCoder datasets open-sourced by BigCode, and a small amount of high-quality Chinese and English data. On top of these raw datasets, CodeShell deduplicated the data with MinHash and filtered it using KenLM together with a high-quality data-selection model, yielding a high-quality pre-training dataset.\n\n### Tokenizer\n\nCodeShell optimizes the StarCoder vocabulary by removing rarely used tokens and adding Chinese tokens, which significantly improves the compression ratio for Chinese text and lays the groundwork for training the Chat version.\n\n| Tokenizer | Size | Chinese | English | Code | Total |\n|---|---|---|---|---|---|\n| Starcoder | 49152 | 1.22 | 3.47 | 3.30 | 2.66 |\n| CodeShell | 70020 | 1.50 | 3.47 | 3.30 | 2.95 |\n\n## License\n\nCommunity use of the CodeShell model must comply with the [CodeShell Model License Agreement](https://github.com/WisdomShell/codeshell/blob/main/License.pdf) and the [Apache 2.0 License](https://www.apache.org/licenses/LICENSE-2.0). The CodeShell model may be used for commercial purposes, but if you plan to use CodeShell or its derivatives commercially, you must confirm that your organization meets the following conditions:\n\n1. The daily active users (DAU) of the affiliated party's services or products may not exceed 1 million.\n2. The affiliated party may not be a software service provider or cloud service provider.\n3. There must be no possibility of the affiliated party sublicensing the granted commercial license to a third party without permission.\n\nIf the above conditions are met, send an email to codeshell.opensource@gmail.com with the application materials required by the CodeShell Model License Agreement. After approval, you will be granted a worldwide, non-exclusive, non-transferable, non-sublicensable commercial copyright license.\n\n## Star History\n\n[![Star History Chart](https://api.star-history.com/svg?repos=WisdomShell/codeshell\u0026type=Date)](https://star-history.com/#WisdomShell/codeshell\u0026Date)\n\n","funding_links":[],"categories":["Python","A01_文本生成_文本对话","大模型列表","App"],"sub_categories":["大语言对话模型及数据"],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FWisdomShell%2Fcodeshell","html_url":"https://awesome.ecosyste.ms/projects/github.com%2FWisdomShell%2Fcodeshell","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FWisdomShell%2Fcodeshell/lists"}
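The embedded README's Fill-in-the-Middle example wraps the code before and after the cursor in the special tokens `<fim_prefix>`, `<fim_suffix>`, and `<fim_middle>`. As a minimal sketch of how such a prompt is assembled before tokenization (pure string handling, no model download required; `build_fim_prompt` is a hypothetical helper name, not part of the CodeShell API):

```python
# Special FIM tokens as shown in the README's input_text example.
FIM_PREFIX = "<fim_prefix>"
FIM_SUFFIX = "<fim_suffix>"
FIM_MIDDLE = "<fim_middle>"


def build_fim_prompt(prefix: str, suffix: str) -> str:
    """Arrange the code before/after the cursor so the model generates the middle.

    Hypothetical helper: the README builds this string inline rather than via a function.
    """
    return f"{FIM_PREFIX}{prefix}{FIM_SUFFIX}{suffix}{FIM_MIDDLE}"


# Reproduces the README's example: the model is asked to fill in the body
# between the function signature and the trailing print statement.
prompt = build_fim_prompt("def print_hello_world():\n    ", "\n    print('Hello world!')")
print(prompt)
```

Passing the resulting string to the tokenizer, exactly as in the README's FIM snippet, asks the model to generate the missing middle span; the completion ends where the suffix resumes.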