{"id":13454887,"url":"https://github.com/deepseek-ai/DeepSeek-Coder-V2","last_synced_at":"2025-03-24T07:32:13.744Z","repository":{"id":244790866,"uuid":"814950657","full_name":"deepseek-ai/DeepSeek-Coder-V2","owner":"deepseek-ai","description":"DeepSeek-Coder-V2: Breaking the Barrier of Closed-Source Models in Code Intelligence","archived":false,"fork":false,"pushed_at":"2024-09-24T12:09:45.000Z","size":2109,"stargazers_count":4100,"open_issues_count":40,"forks_count":512,"subscribers_count":53,"default_branch":"main","last_synced_at":"2025-01-30T00:42:59.500Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":null,"has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/deepseek-ai.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE-CODE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":"supported_langs.txt","governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-06-14T03:39:37.000Z","updated_at":"2025-01-30T00:28:21.000Z","dependencies_parsed_at":"2024-10-28T22:41:49.692Z","dependency_job_id":null,"html_url":"https://github.com/deepseek-ai/DeepSeek-Coder-V2","commit_stats":null,"previous_names":["deepseek-ai/deepseek-coder-v2"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/deepseek-ai%2FDeepSeek-Coder-V2","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/deepseek-ai%2FDeepSeek-Coder-V2/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/deepseek-ai%2FDeepSeek-Coder-V2/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/deepseek-ai%2FDeepSeek-Coder-V2/ma
nifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/deepseek-ai","download_url":"https://codeload.github.com/deepseek-ai/DeepSeek-Coder-V2/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":245227546,"owners_count":20580896,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-07-31T08:00:58.984Z","updated_at":"2025-03-24T07:32:12.963Z","avatar_url":"https://github.com/deepseek-ai.png","language":null,"funding_links":[],"categories":[":star: Best Gen AI Papers List (June 2024)","Others","Open LLM","Uncategorized","A01_文本生成_文本对话","Repos","Code Models","miscellaneous","🔧 Utilities \u0026 Miscellaneous"],"sub_categories":["Uncategorized","大语言对话模型及数据"],"readme":"\u003c!-- markdownlint-disable first-line-h1 --\u003e\n\u003c!-- markdownlint-disable html --\u003e\n\u003c!-- markdownlint-disable no-duplicate-header --\u003e\n\n\u003cdiv align=\"center\"\u003e\n  \u003cimg src=\"https://github.com/deepseek-ai/DeepSeek-V2/blob/main/figures/logo.svg?raw=true\" width=\"60%\" alt=\"DeepSeek-V2\" /\u003e\n\u003c/div\u003e\n\u003chr\u003e\n\u003cdiv align=\"center\" style=\"line-height: 1;\"\u003e\n  \u003ca href=\"https://www.deepseek.com/\" target=\"_blank\" style=\"margin: 2px;\"\u003e\n    \u003cimg alt=\"Homepage\" src=\"https://github.com/deepseek-ai/DeepSeek-V2/blob/main/figures/badge.svg?raw=true\" style=\"display: inline-block; vertical-align: middle;\"/\u003e\n  \u003c/a\u003e\n  \u003ca href=\"https://chat.deepseek.com/\" target=\"_blank\" style=\"margin: 2px;\"\u003e\n    
\u003cimg alt=\"Chat\" src=\"https://img.shields.io/badge/🤖%20Chat-DeepSeek%20V2-536af5?color=536af5\u0026logoColor=white\" style=\"display: inline-block; vertical-align: middle;\"/\u003e\n  \u003c/a\u003e\n  \u003ca href=\"https://huggingface.co/deepseek-ai\" target=\"_blank\" style=\"margin: 2px;\"\u003e\n    \u003cimg alt=\"Hugging Face\" src=\"https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-DeepSeek%20AI-ffc107?color=ffc107\u0026logoColor=white\" style=\"display: inline-block; vertical-align: middle;\"/\u003e\n  \u003c/a\u003e\n\u003c/div\u003e\n\n\u003cdiv align=\"center\" style=\"line-height: 1;\"\u003e\n  \u003ca href=\"https://discord.gg/Tc7c45Zzu5\" target=\"_blank\" style=\"margin: 2px;\"\u003e\n    \u003cimg alt=\"Discord\" src=\"https://img.shields.io/badge/Discord-DeepSeek%20AI-7289da?logo=discord\u0026logoColor=white\u0026color=7289da\" style=\"display: inline-block; vertical-align: middle;\"/\u003e\n  \u003c/a\u003e\n  \u003ca href=\"https://github.com/deepseek-ai/DeepSeek-V2/blob/main/figures/qr.jpeg?raw=true\" target=\"_blank\" style=\"margin: 2px;\"\u003e\n    \u003cimg alt=\"Wechat\" src=\"https://img.shields.io/badge/WeChat-DeepSeek%20AI-brightgreen?logo=wechat\u0026logoColor=white\" style=\"display: inline-block; vertical-align: middle;\"/\u003e\n  \u003c/a\u003e\n  \u003ca href=\"https://twitter.com/deepseek_ai\" target=\"_blank\" style=\"margin: 2px;\"\u003e\n    \u003cimg alt=\"Twitter Follow\" src=\"https://img.shields.io/badge/Twitter-deepseek_ai-white?logo=x\u0026logoColor=white\" style=\"display: inline-block; vertical-align: middle;\"/\u003e\n  \u003c/a\u003e\n\u003c/div\u003e\n\u003cdiv align=\"center\" style=\"line-height: 1;\"\u003e\n  \u003ca href=\"https://github.com/deepseek-ai/DeepSeek-V2/blob/main/LICENSE-CODE\" style=\"margin: 2px;\"\u003e\n    \u003cimg alt=\"Code License\" src=\"https://img.shields.io/badge/Code_License-MIT-f5de53?\u0026color=f5de53\" style=\"display: inline-block; vertical-align: 
middle;\"/\u003e\n  \u003c/a\u003e\n  \u003ca href=\"https://github.com/deepseek-ai/DeepSeek-V2/blob/main/LICENSE-MODEL\" style=\"margin: 2px;\"\u003e\n    \u003cimg alt=\"Model License\" src=\"https://img.shields.io/badge/Model_License-Model_Agreement-f5de53?\u0026color=f5de53\" style=\"display: inline-block; vertical-align: middle;\"/\u003e\n  \u003c/a\u003e\n\u003c/div\u003e\n\u003cp align=\"center\"\u003e\n  \u003ca href=\"#2-model-downloads\"\u003eModel Download\u003c/a\u003e |\n  \u003ca href=\"#3-evaluation-results\"\u003eEvaluation Results\u003c/a\u003e |\n  \u003ca href=\"#5-api-platform\"\u003eAPI Platform\u003c/a\u003e |\n  \u003ca href=\"#6-how-to-run-locally\"\u003eHow to Use\u003c/a\u003e |\n  \u003ca href=\"#7-license\"\u003eLicense\u003c/a\u003e |\n  \u003ca href=\"#8-citation\"\u003eCitation\u003c/a\u003e\n\u003c/p\u003e\n\n\n\n\u003cp align=\"center\"\u003e\n  \u003ca href=\"https://arxiv.org/pdf/2406.11931\"\u003e\u003cb\u003ePaper Link\u003c/b\u003e👁️\u003c/a\u003e\n\u003c/p\u003e\n\n\n# DeepSeek-Coder-V2: Breaking the Barrier of Closed-Source Models in Code Intelligence\n\n## 1. Introduction\nWe present DeepSeek-Coder-V2, an open-source Mixture-of-Experts (MoE) code language model that achieves performance comparable to GPT4-Turbo in code-specific tasks. Specifically, DeepSeek-Coder-V2 is further pre-trained from an intermediate checkpoint of DeepSeek-V2 with additional 6 trillion tokens. Through this continued pre-training, DeepSeek-Coder-V2 substantially enhances the coding and mathematical reasoning capabilities of DeepSeek-V2, while maintaining comparable performance in general language tasks. Compared to DeepSeek-Coder-33B, DeepSeek-Coder-V2 demonstrates significant advancements in various aspects of code-related tasks, as well as reasoning and general capabilities. Additionally, DeepSeek-Coder-V2 expands its support for programming languages from 86 to 338, while extending the context length from 16K to 128K. 
\n\n\u003cp align=\"center\"\u003e\n  \u003cimg width=\"100%\" src=\"figures/performance.png\"\u003e\n\u003c/p\u003e\n\n\nIn standard benchmark evaluations, DeepSeek-Coder-V2 achieves superior performance compared to closed-source models such as GPT4-Turbo, Claude 3 Opus, and Gemini 1.5 Pro in coding and math benchmarks. The list of supported programming languages can be found [here](supported_langs.txt).\n\n## 2. Model Downloads\n\nWe release DeepSeek-Coder-V2 to the public in 16B and 236B parameter sizes, built on the [DeepSeekMoE](https://arxiv.org/pdf/2401.06066) framework, with only 2.4B and 21B active parameters respectively. Each size is available as a base model and an instruct model.\n\n\u003cdiv align=\"center\"\u003e\n\n|            **Model**            | **#Total Params** | **#Active Params** | **Context Length** |                         **Download**                         |\n| :-----------------------------: | :---------------: | :----------------: | :----------------: | :----------------------------------------------------------: |\n|   DeepSeek-Coder-V2-Lite-Base   |        16B        |        2.4B        |        128k        | [🤗 HuggingFace](https://huggingface.co/deepseek-ai/DeepSeek-Coder-V2-Lite-Base) |\n| DeepSeek-Coder-V2-Lite-Instruct |        16B        |        2.4B        |        128k        | [🤗 HuggingFace](https://huggingface.co/deepseek-ai/DeepSeek-Coder-V2-Lite-Instruct) |\n|     DeepSeek-Coder-V2-Base      |       236B        |        21B         |        128k        | [🤗 HuggingFace](https://huggingface.co/deepseek-ai/DeepSeek-Coder-V2-Base) |\n|   DeepSeek-Coder-V2-Instruct    |       236B        |        21B         |        128k        | [🤗 HuggingFace](https://huggingface.co/deepseek-ai/DeepSeek-Coder-V2-Instruct) |\n\n\u003c/div\u003e\n\n\n\n## 3. 
Evaluation Results\n### 3.1 Code Generation\n\n\n|  | #TP | #AP | HumanEval | MBPP+ | LiveCodeBench | USACO |\n|:------------|:--------:|:--------:|:--------:|:--------:|:--------:|:-----------:|\n| **Closed-Source Models** |  |  |  |  |  |  |\n| **Gemini-1.5-Pro**                  |  -   |  -   |   83.5    | **74.6**  |   34.1   |   4.9    |\n| **Claude-3-Opus**                   |  -   |  -   |   84.2    |   72.0    |   34.6   |   7.8    |\n| **GPT-4-Turbo-1106**                |  -   |  -   |   87.8    |   69.3    |   37.1   |   11.1   |\n| **GPT-4-Turbo-0409**                |  -   |  -   |   88.2    |   72.2    | **45.7** |   12.3   |\n| **GPT-4o-0513**                     |  -   |  -   | **91.0** |   73.5    |   43.4   | **18.8** |\n| **Open-Source Models**              |      |      |           |           |          |          |\n| **CodeStral**                       | 22B  | 22B  |   78.1    |   68.2    |   31.0   |   4.6    |\n| **DeepSeek-Coder-Instruct**         | 33B  | 33B  |   79.3    |   70.1   |   22.5   |   4.2    |\n| **Llama3-Instruct**                 | 70B  | 70B  |   81.1    |   68.8   |   28.7   |   3.3    |\n| **DeepSeek-Coder-V2-Lite-Instruct** | 16B | 2.4B | 81.1 | 68.8 | 24.3 | 6.5 |\n| **DeepSeek-Coder-V2-Instruct** | 236B | 21B  | **90.2** | **76.2** | **43.4** | **12.1** |\n\n### 3.2 Code Completion\n\n\n| Model                           | #TP  | #AP  | RepoBench (Python) | RepoBench (Java) | HumanEval FIM |\n| :------------------------------ | :--: | :--: | :----------------: | :--------------: | :-----------: |\n| **CodeStral**                   | 22B  | 22B  |      **46.1**      |     **45.7**     |     83.0      |\n| **DeepSeek-Coder-Base**         |  7B  |  7B  |        36.2        |       43.3       |     86.1      |\n| **DeepSeek-Coder-Base**         | 33B  | 33B  |        39.1        |       44.8       |   **86.4**    |\n| **DeepSeek-Coder-V2-Lite-Base** | 16B  | 2.4B |        38.9        |       43.3       |   **86.4**    
|\n\n### 3.3 Code Fixing\n\n\n|                                     | #TP  | #AP  | Defects4J | SWE-Bench |  Aider   |\n| ----------------------------------- | :--: | :--: | :-------: | :-------: | :------: |\n| **Closed-Source Models**            |      |      |           |           |          |\n| **Gemini-1.5-Pro**                  |  -   |  -   |   18.6    |   19.3    |   57.1   |\n| **Claude-3-Opus**                   |  -   |  -   |   25.5    |   11.7    |   68.4   |\n| **GPT-4-Turbo-1106**                |  -   |  -   |   22.8    |   22.7    |   65.4   |\n| **GPT-4-Turbo-0409**                |  -   |  -   |   24.3    |   18.3    |   63.9   |\n| **GPT-4o-0513**                     |  -   |  -   | **26.1**  | **26.7**  | **72.9** |\n| **Open-Source Models**              |      |      |           |           |          |\n| **CodeStral**                       | 22B  | 22B  |   17.8    |    2.7    |   51.1   |\n| **DeepSeek-Coder-Instruct**         | 33B  | 33B  |   11.3    |    0.0    |   54.5   |\n| **Llama3-Instruct**                 | 70B  | 70B  |   16.2    |     -     |   49.2   |\n| **DeepSeek-Coder-V2-Lite-Instruct** | 16B  | 2.4B |    9.2    |    0.0    |   44.4   |\n| **DeepSeek-Coder-V2-Instruct**      | 236B | 21B  | **21.0**  | **12.7**  | **73.7** |\n\n### 3.4 Mathematical Reasoning\n\n\n|                                     | #TP  | #AP  |  GSM8K   |   MATH   | AIME 2024 | Math Odyssey |\n| ----------------------------------- | :--: | :--: | :------: | :------: | :-------: | :----------: |\n| **Closed-Source Models**            |      |      |          |          |           |              |\n| **Gemini-1.5-Pro**                  |  -   |  -   |   90.8   |   67.7   |   2/30    |     45.0     |\n| **Claude-3-Opus**                   |  -   |  -   |   95.0   |   60.1   |   2/30    |     40.6     |\n| **GPT-4-Turbo-1106**                |  -   |  -   |   91.4   |   64.3   |   1/30    |     49.1     |\n| **GPT-4-Turbo-0409**                |  -   |  
-   |   93.7   |   73.4   | **3/30**  |     46.8     |\n| **GPT-4o-0513**                     |  -   |  -   | **95.8** | **76.6** |   2/30    |   **53.2**   |\n| **Open-Source Models**              |      |      |          |          |           |              |\n| **Llama3-Instruct**                 | 70B  | 70B  |   93.0   |   50.4   |   1/30    |     27.9     |\n| **DeepSeek-Coder-V2-Lite-Instruct** | 16B  | 2.4B |   86.4   |   61.8   |   0/30    |     44.4     |\n| **DeepSeek-Coder-V2-Instruct**      | 236B | 21B  | **94.9** | **75.7** | **4/30**  |   **53.7**   |\n\n### 3.5 General Natural Language\n\n|      Benchmark       | Domain  | DeepSeek-V2-Lite Chat | DeepSeek-Coder-V2-Lite Instruct | DeepSeek-V2 Chat | DeepSeek-Coder-V2 Instruct |\n| :------------------: | :-----: | :-------------------: | :-----------------------------: | :--------------: | :------------------------: |\n|       **BBH**        | English |         48.1          |              61.2               |       79.7       |          **83.9**          |\n|       **MMLU**       | English |         55.7          |              60.1               |       78.1       |          **79.2**          |\n|     **ARC-Easy**     | English |         86.1          |              88.9               |     **98.1**     |            97.4            |\n|  **ARC-Challenge**   | English |         73.4          |              77.4               |       92.3       |          **92.8**          |\n|     **TriviaQA**     | English |         65.2          |              59.5               |     **86.7**     |            82.3            |\n| **NaturalQuestions** | English |         35.5          |              30.8               |     **53.4**     |            47.5            |\n|     **AGIEval**      | English |         42.8          |              28.7               |     **61.4**     |             60             |\n|     **CLUEWSC**      | Chinese |         80.0          |              76.5               |     **89.9**   
  |            85.9            |\n|      **C-Eval**      | Chinese |         60.1          |              61.6               |       78.0       |          **79.4**          |\n|      **CMMLU**       | Chinese |         62.5          |              62.7               |     **81.6**     |            80.9            |\n|    **Arena-Hard**    |    -    |         11.4          |              38.1               |       41.6       |          **65.0**          |\n|  **AlpacaEval 2.0**  |    -    |         16.9          |              17.7               |     **38.9**     |            36.9            |\n|     **MT-Bench**     |    -    |         7.37          |              7.81               |     **8.97**     |            8.77            |\n|    **AlignBench**    |    -    |         6.02          |              6.83               |     **7.91**     |            7.84            |\n\n### 3.6 Context Window\n\n\u003cp align=\"center\"\u003e\n  \u003cimg width=\"80%\" src=\"figures/long_context.png\"\u003e\n\u003c/p\u003e\n\n\nEvaluation results on the ``Needle In A Haystack`` (NIAH) tests. DeepSeek-Coder-V2 performs well across all context window lengths up to **128K**. \n\n## 4. Chat Website\n\nYou can chat with DeepSeek-Coder-V2 on DeepSeek's official website: [coder.deepseek.com](https://coder.deepseek.com/sign_in)\n\n## 5. API Platform\nWe also provide an OpenAI-compatible API on the DeepSeek Platform: [platform.deepseek.com](https://platform.deepseek.com/), with pay-as-you-go pricing at the rates shown below.\n\n\u003cp align=\"center\"\u003e\n  \u003cimg width=\"40%\" src=\"figures/model_price.jpg\"\u003e\n\u003c/p\u003e\n\n\n## 6. How to run locally\n**Here, we provide some examples of how to use the DeepSeek-Coder-V2-Lite model. 
If you want to utilize DeepSeek-Coder-V2 in BF16 format for inference, 80GB*8 GPUs are required.**\n\n### Inference with Huggingface's Transformers\nYou can directly employ [Huggingface's Transformers](https://github.com/huggingface/transformers) for model inference.\n\n#### Code Completion\n```python\nfrom transformers import AutoTokenizer, AutoModelForCausalLM\nimport torch\ntokenizer = AutoTokenizer.from_pretrained(\"deepseek-ai/DeepSeek-Coder-V2-Lite-Base\", trust_remote_code=True)\nmodel = AutoModelForCausalLM.from_pretrained(\"deepseek-ai/DeepSeek-Coder-V2-Lite-Base\", trust_remote_code=True, torch_dtype=torch.bfloat16).cuda()\ninput_text = \"#write a quick sort algorithm\"\ninputs = tokenizer(input_text, return_tensors=\"pt\").to(model.device)\noutputs = model.generate(**inputs, max_length=128)\nprint(tokenizer.decode(outputs[0], skip_special_tokens=True))\n```\n\n#### Code Insertion\n```python\nfrom transformers import AutoTokenizer, AutoModelForCausalLM\nimport torch\ntokenizer = AutoTokenizer.from_pretrained(\"deepseek-ai/DeepSeek-Coder-V2-Lite-Base\", trust_remote_code=True)\nmodel = AutoModelForCausalLM.from_pretrained(\"deepseek-ai/DeepSeek-Coder-V2-Lite-Base\", trust_remote_code=True, torch_dtype=torch.bfloat16).cuda()\ninput_text = \"\"\"\u003c｜fim▁begin｜\u003edef quick_sort(arr):\n    if len(arr) \u003c= 1:\n        return arr\n    pivot = arr[0]\n    left = []\n    right = []\n\u003c｜fim▁hole｜\u003e\n        if arr[i] \u003c pivot:\n            left.append(arr[i])\n        else:\n            right.append(arr[i])\n    return quick_sort(left) + [pivot] + quick_sort(right)\u003c｜fim▁end｜\u003e\"\"\"\ninputs = tokenizer(input_text, return_tensors=\"pt\").to(model.device)\noutputs = model.generate(**inputs, max_length=128)\nprint(tokenizer.decode(outputs[0], skip_special_tokens=True)[len(input_text):])\n```\n\n#### Chat Completion\n\n```python\nfrom transformers import AutoTokenizer, AutoModelForCausalLM\nimport torch\ntokenizer = 
AutoTokenizer.from_pretrained(\"deepseek-ai/DeepSeek-Coder-V2-Lite-Instruct\", trust_remote_code=True)\nmodel = AutoModelForCausalLM.from_pretrained(\"deepseek-ai/DeepSeek-Coder-V2-Lite-Instruct\", trust_remote_code=True, torch_dtype=torch.bfloat16).cuda()\nmessages=[\n    { 'role': 'user', 'content': \"write a quick sort algorithm in python.\"}\n]\ninputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors=\"pt\").to(model.device)\n# tokenizer.eos_token_id is the id of the \u003c｜end▁of▁sentence｜\u003e token\n# do_sample=False selects greedy decoding, so sampling parameters such as top_k and top_p are not needed\noutputs = model.generate(inputs, max_new_tokens=512, do_sample=False, num_return_sequences=1, eos_token_id=tokenizer.eos_token_id)\n# Decode only the newly generated tokens, skipping the prompt\nprint(tokenizer.decode(outputs[0][len(inputs[0]):], skip_special_tokens=True))\n```\n\n\n\nThe complete chat template can be found in `tokenizer_config.json` in the Hugging Face model repository.\n\nAn example of the chat template is as follows:\n\n```bash\n\u003c｜begin▁of▁sentence｜\u003eUser: {user_message_1}\n\nAssistant: {assistant_message_1}\u003c｜end▁of▁sentence｜\u003eUser: {user_message_2}\n\nAssistant:\n```\n\nYou can also add an optional system message:\n\n```bash\n\u003c｜begin▁of▁sentence｜\u003e{system_message}\n\nUser: {user_message_1}\n\nAssistant: {assistant_message_1}\u003c｜end▁of▁sentence｜\u003eUser: {user_message_2}\n\nAssistant:\n```\n\nIn the last round of dialogue, note that \"Assistant:\" has no space after the colon. 
Adding a space might cause the following issues on the 16B-Lite model:\n- English questions receiving Chinese responses.\n- Responses containing garbled text.\n- Responses repeating excessively.\n\nOlder versions of Ollama had this bug (see https://github.com/deepseek-ai/DeepSeek-Coder-V2/issues/12), but it has been fixed in the latest version.\n\n\n### Inference with SGLang (recommended)\n[SGLang](https://github.com/sgl-project/sglang) currently supports MLA optimizations, FP8 (W8A8), FP8 KV Cache, and Torch Compile, offering the best latency and throughput among open-source frameworks. Here are some example commands to launch an OpenAI API-compatible server:\n\n```bash\n# BF16, tensor parallelism = 8\npython3 -m sglang.launch_server --model deepseek-ai/DeepSeek-Coder-V2-Instruct --tp 8 --trust-remote-code\n\n# BF16, w/ torch.compile (The compilation can take several minutes)\npython3 -m sglang.launch_server --model deepseek-ai/DeepSeek-Coder-V2-Lite-Instruct --trust-remote-code --enable-torch-compile\n\n# FP8, tensor parallelism = 8, FP8 KV cache\npython3 -m sglang.launch_server --model neuralmagic/DeepSeek-Coder-V2-Instruct-FP8 --tp 8 --trust-remote-code --kv-cache-dtype fp8_e5m2\n```\n\nAfter launching the server, you can query it with OpenAI API\n\n```\nimport openai\nclient = openai.Client(\n    base_url=\"http://127.0.0.1:30000/v1\", api_key=\"EMPTY\")\n\n# Chat completion\nresponse = client.chat.completions.create(\n    model=\"default\",\n    messages=[\n        {\"role\": \"system\", \"content\": \"You are a helpful AI assistant\"},\n        {\"role\": \"user\", \"content\": \"List 3 countries and their capitals.\"},\n    ],\n    temperature=0,\n    max_tokens=64,\n)\nprint(response)\n```\n\n\n### Inference with vLLM (recommended)\nTo utilize [vLLM](https://github.com/vllm-project/vllm) for model inference, please merge this Pull Request into your vLLM codebase: https://github.com/vllm-project/vllm/pull/4650.\n\n```python\nfrom transformers import 
AutoTokenizer\nfrom vllm import LLM, SamplingParams\n\nmax_model_len, tp_size = 8192, 1\nmodel_name = \"deepseek-ai/DeepSeek-Coder-V2-Lite-Instruct\"\ntokenizer = AutoTokenizer.from_pretrained(model_name)\nllm = LLM(model=model_name, tensor_parallel_size=tp_size, max_model_len=max_model_len, trust_remote_code=True, enforce_eager=True)\nsampling_params = SamplingParams(temperature=0.3, max_tokens=256, stop_token_ids=[tokenizer.eos_token_id])\n\nmessages_list = [\n    [{\"role\": \"user\", \"content\": \"Who are you?\"}],\n    [{\"role\": \"user\", \"content\": \"write a quick sort algorithm in python.\"}],\n    [{\"role\": \"user\", \"content\": \"Write a piece of quicksort code in C++.\"}],\n]\n\nprompt_token_ids = [tokenizer.apply_chat_template(messages, add_generation_prompt=True) for messages in messages_list]\n\noutputs = llm.generate(prompt_token_ids=prompt_token_ids, sampling_params=sampling_params)\n\ngenerated_text = [output.outputs[0].text for output in outputs]\nprint(generated_text)\n```\n\n\n\n## 7. License\n\nThis code repository is licensed under [the MIT License](LICENSE-CODE). The use of DeepSeek-Coder-V2 Base/Instruct models is subject to [the Model License](LICENSE-MODEL). DeepSeek-Coder-V2 series (including Base and Instruct) supports commercial use.\n\n## 8. Citation\n```latex\n@article{zhu2024deepseek,\n  title={DeepSeek-Coder-V2: Breaking the Barrier of Closed-Source Models in Code Intelligence},\n  author={Zhu, Qihao and Guo, Daya and Shao, Zhihong and Yang, Dejian and Wang, Peiyi and Xu, Runxin and Wu, Y and Li, Yukun and Gao, Huazuo and Ma, Shirong and others},\n  journal={arXiv preprint arXiv:2406.11931},\n  year={2024}\n}\n```\n\n## 9. 
Contact\nIf you have any questions, please raise an issue or contact us at [service@deepseek.com](mailto:service@deepseek.com).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdeepseek-ai%2FDeepSeek-Coder-V2","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdeepseek-ai%2FDeepSeek-Coder-V2","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdeepseek-ai%2FDeepSeek-Coder-V2/lists"}