Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
List: Awesome-Code-LLM
👨‍💻 An awesome and curated list of best code-LLM for research.
https://github.com/huybery/Awesome-Code-LLM
- Host: GitHub
- URL: https://github.com/huybery/Awesome-Code-LLM
- Owner: huybery
- License: mit
- Created: 2023-07-05T06:42:09.000Z (11 months ago)
- Default Branch: main
- Last Pushed: 2024-02-21T03:18:42.000Z (3 months ago)
- Last Synced: 2024-03-11T22:04:16.267Z (3 months ago)
- Topics: awesome, code-generation, large-language-models
- Size: 248 KB
- Stars: 521
- Watchers: 23
- Forks: 27
- Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE
Lists
- Awesome-LLM - Awesome-Code-LLM - An awesome and curated list of best code-LLM for research. (Other Papers)
- ultimate-awesome - Awesome-Code-LLM - 👨‍💻 An awesome and curated list of best code-LLM for research. (Other Lists / Julia Lists)
- Awesome-DevAI - Awesome-Code-LLM
- awesome-llmops - Awesome-Code-LLM - An awesome and curated list of best code-LLM for research. (Awesome Lists / Profiling)
- awesome-stars - Awesome-Code-LLM - An awesome and curated list of best code-LLM for research. (Others)
- awesome - huybery/Awesome-Code-LLM - 👨‍💻 An awesome and curated list of best code-LLM for research. (miscellaneous)
README
![](code-banner.png)
## 🧵 Table of Contents
- [🧵 Table of Contents](#-table-of-contents)
- [🚀 Leaderboard](#-leaderboard)
- [💡 Evaluation Toolkit](#-evaluation-toolkit)
- [📚 Paper](#-paper)
- [▶️ Pre-Training](#️-pre-training)
- [▶️ Instruction Tuning](#️-instruction-tuning)
- [▶️ Alignment with Feedback](#️-alignment-with-feedback)
- [▶️ Prompting](#️-prompting)
- [▶️ Evaluation \& Benchmark](#️-evaluation--benchmark)
- [▶️ Using LLMs while coding](#️-using-llms-while-coding)
- [🙌 Contributors](#-contributors)
- [Cite as](#cite-as)
- [Acknowledgement](#acknowledgement)
- [Star History](#star-history)

## 🚀 Leaderboard
Central Leaderboard (sorted by HumanEval Pass@1)
| Model | Params | HumanEval | MBPP | HF | Source |
| ------------------------ | ------ | --------- | ---- | ------------------------------------------------------------- | ------------------------------------------------------- |
| GPT-4 + Reflexion | ? | 91.0 | 77.1 | | [paper](https://arxiv.org/abs/2303.11366) |
| GPT-4 (latest) | ? | 84.1 | 80.0 | | [github](https://github.com/deepseek-ai/DeepSeek-Coder) |
| DeepSeek-Coder-Instruct | 33B | 79.3 | 70.0 | [ckpt](https://hf.co/deepseek-ai/deepseek-coder-33b-instruct) | [github](https://github.com/deepseek-ai/DeepSeek-Coder) |
| DeepSeek-Coder-Instruct  | 7B     | 78.6      | 65.4 | [ckpt](https://hf.co/deepseek-ai/deepseek-coder-6.7b-instruct) | [github](https://github.com/deepseek-ai/DeepSeek-Coder) |
| GPT-3.5-Turbo (latest) | ? | 76.2 | 70.8 | | [github](https://github.com/deepseek-ai/DeepSeek-Coder) |
| Code-Llama | 34B | 62.2 | 61.2 | | [paper](https://arxiv.org/abs/2308.12950) |
| Pangu-Coder2 | 15B | 61.6 | | | [paper](https://arxiv.org/abs/2307.14936) |
| WizardCoder-15B | 15B | 57.3 | 51.8 | [ckpt](https://hf.co/WizardLM/WizardCoder-15B-V1.0) | [paper](https://arxiv.org/abs/2306.08568) |
| Code-Davinci-002 | ? | 47.0 | | | [paper](https://arxiv.org/abs/2107.03374) |
| StarCoder-15B (Prompted) | 15B | 40.8 | 49.5 | [ckpt](https://hf.co/bigcode/starcoder) | [paper](https://arxiv.org/abs/2305.06161) |
| PaLM 2-S                 | ?      | 37.6      | 50.0 |                                                               | [paper](https://arxiv.org/abs/2305.10403)                |
| PaLM-Coder-540B | 540B | 36.0 | 47.0 | | [paper](https://arxiv.org/abs/2204.02311) |
| InstructCodeT5+ | 16B | 35.0 | | | [paper](https://arxiv.org/abs/2305.07922) |
| StarCoder-15B | 15B | 33.6 | 52.7 | [ckpt](https://hf.co/bigcode/starcoder) | [paper](https://arxiv.org/abs/2305.06161) |
| Code-Cushman-001 | ? | 33.5 | 45.9 | | [paper](https://arxiv.org/abs/2107.03374) |
| CodeT5+ | 16B | 30.9 | | | [paper](https://arxiv.org/abs/2305.07922) |
| LLaMA2-70B | 70B | 29.9 | | [ckpt](https://hf.co/meta-llama/Llama-2-70b-hf) | [paper](https://arxiv.org/abs/2307.09288) |
| CodeGen-16B-Mono | 16B | 29.3 | 35.3 | | [paper](https://arxiv.org/abs/2203.13474) |
| PaLM-540B | 540B | 26.2 | 36.8 | | [paper](https://arxiv.org/abs/2204.02311) |
| LLaMA-65B | 65B | 23.7 | 37.7 | | [paper](https://arxiv.org/abs/2302.13971) |
| CodeGeeX | 13B | 22.9 | 24.4 | | [paper](https://arxiv.org/abs/2303.17568) |
| LLaMA-33B | 33B | 21.7 | 30.2 | | [paper](https://arxiv.org/abs/2302.13971) |
| CodeGen-16B-Multi | 16B | 18.3 | 20.9 | | [paper](https://arxiv.org/abs/2203.13474) |
| AlphaCode                | 1.1B   | 17.1      |      |                                                               | [paper](https://arxiv.org/abs/2203.07814)                |
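
The HumanEval and MBPP columns report pass@1 scores. For reference, here is a minimal NumPy sketch of the unbiased pass@k estimator from the Codex paper (Chen et al., 2021, listed under Pre-Training below), where `n` samples are drawn per problem and `c` of them pass the unit tests:

```python
import numpy as np

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator (Chen et al., 2021): the probability
    that at least one of k samples, chosen from n generations of which
    c are correct, passes all unit tests."""
    if n - c < k:
        return 1.0  # every size-k subset must contain a correct sample
    return 1.0 - np.prod(1.0 - k / np.arange(n - c + 1, n + 1))

# With a single sample per problem, pass@1 is just the raw pass rate:
assert pass_at_k(n=1, c=1, k=1) == 1.0
assert pass_at_k(n=1, c=0, k=1) == 0.0
```
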
| Leaderboard | Access |
| :----------------------------------: | ----------------------------------------------------------------------------------|
| Big Code Models Leaderboard | [[Source](https://huggingface.co/spaces/bigcode/bigcode-models-leaderboard)] |
| BIRD | [[Source](https://bird-bench.github.io)] |
| CanAiCode Leaderboard | [[Source](https://huggingface.co/spaces/mike-ravkine/can-ai-code-results)] |
| Coding LLMs Leaderboard | [[Source](https://leaderboard.tabbyml.com)] |
| CRUXEval Leaderboard | [[Source](https://crux-eval.github.io/leaderboard.html)] |
| EvalPlus | [[Source](https://evalplus.github.io/leaderboard.html)] |
| HumanEval.jl | [[Source](https://github.com/01-ai/HumanEval.jl)] |
| InfiCoder-Eval | [[Source](https://infi-coder.github.io/inficoder-eval)] |
| InterCode | [[Source](https://intercode-benchmark.github.io)] |
| Program Synthesis Models Leaderboard | [[Source](https://accubits.com/open-source-program-synthesis-models-leaderboard)] |
| Spider | [[Source](https://yale-lily.github.io/spider)] |

## 💡 Evaluation Toolkit
- [bigcode-evaluation-harness](https://github.com/bigcode-project/bigcode-evaluation-harness): A framework for the evaluation of autoregressive code generation language models.
- [code-eval](https://github.com/abacaj/code-eval): A framework for the evaluation of autoregressive code generation language models on HumanEval.
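
Both toolkits follow the same generate-then-execute pattern. As a minimal illustration, the sketch below scores completions with OpenAI's reference `human-eval` package (https://github.com/openai/human-eval), following the usage shown in that repository's README; `generate_one` is a hypothetical stand-in for the model under test:

```python
# Minimal sketch using OpenAI's reference human-eval package,
# installed per https://github.com/openai/human-eval.
# `generate_one` is a hypothetical stand-in for your model.
from human_eval.data import read_problems, write_jsonl

def generate_one(prompt: str) -> str:
    raise NotImplementedError("call the model under test here")

problems = read_problems()  # task_id -> {prompt, entry_point, test, ...}
samples = [
    {"task_id": task_id, "completion": generate_one(problems[task_id]["prompt"])}
    for task_id in problems
]
write_jsonl("samples.jsonl", samples)
# Then score from the shell (this executes untrusted code; use a sandbox):
#   evaluate_functional_correctness samples.jsonl
```
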
## 📚 Paper

### ▶️ Pre-Training
1. **Evaluating Large Language Models Trained on Code** `Preprint`
   [[Paper](https://arxiv.org/abs/2107.03374)] *Mark Chen, Jerry Tworek, Heewoo Jun, Qiming Yuan, Henrique Ponde de Oliveira Pinto, et al.* 2021.07
2. **CodeGen: An Open Large Language Model for Code with Multi-Turn Program Synthesis** `ICLR23`
   [[Paper](https://arxiv.org/abs/2203.13474)] *Erik Nijkamp, Bo Pang, Hiroaki Hayashi, Lifu Tu, Huan Wang, Yingbo Zhou, Silvio Savarese, Caiming Xiong.* 2022.03
3. **ERNIE-Code: Beyond English-Centric Cross-lingual Pretraining for Programming Languages** `ACL23 (Findings)`
   [[Paper](https://aclanthology.org/2023.findings-acl.676.pdf)][[Repo](https://github.com/PaddlePaddle/PaddleNLP/tree/develop/model_zoo/ernie-code)] *Yekun Chai, Shuohuan Wang, Chao Pang, Yu Sun, Hao Tian, and Hua Wu.* 2022.12
4. **SantaCoder: don't reach for the stars!** `Preprint`
   [[Paper](https://arxiv.org/abs/2301.03988)] *Loubna Ben Allal, Raymond Li, Denis Kocetkov, Chenghao Mou, Christopher Akiki, Carlos Munoz Ferrandis, Niklas Muennighoff, et al.* 2023.01
5. **CodeGeeX: A Pre-Trained Model for Code Generation with Multilingual Evaluations on HumanEval-X** `Preprint`
   [[Paper](https://arxiv.org/abs/2303.17568)] *Qinkai Zheng, Xiao Xia, Xu Zou, Yuxiao Dong, Shan Wang, Yufei Xue, Zihan Wang, Lei Shen, Andi Wang, Yang Li, Teng Su, Zhilin Yang, Jie Tang.* 2023.03
6. **CodeGen2: Lessons for Training LLMs on Programming and Natural Languages** `ICLR23`
   [[Paper](https://arxiv.org/abs/2305.02309)] *Erik Nijkamp, Hiroaki Hayashi, Caiming Xiong, Silvio Savarese, Yingbo Zhou.* 2023.05
7. **StarCoder: may the source be with you!** `Preprint`
   [[Paper](https://arxiv.org/abs/2305.06161)] *Raymond Li, Loubna Ben Allal, Yangtian Zi, Niklas Muennighoff, Denis Kocetkov, Chenghao Mou, et al.* 2023.05
8. **CodeT5+: Open Code Large Language Models for Code Understanding and Generation** `Preprint`
   [[Paper](https://arxiv.org/abs/2305.07922)] *Yue Wang, Hung Le, Akhilesh Deepak Gotmare, Nghi D.Q. Bui, Junnan Li, Steven C.H. Hoi.* 2023.05
9. **Textbooks Are All You Need** `Preprint`
   [[Paper](https://arxiv.org/abs/2306.11644)] *Suriya Gunasekar, Yi Zhang, Jyoti Aneja, Caio César Teodoro Mendes, Allie Del Giorno, Sivakanth Gopi, et al.* 2023.06
10. **Code Llama: Open Foundation Models for Code** `Preprint`
    [[Paper](https://arxiv.org/abs/2308.12950)] *Baptiste Rozière, Jonas Gehring, Fabian Gloeckle, Sten Sootla, Itai Gat, et al.* 2023.08

### ▶️ Instruction Tuning
1. **WizardCoder: Empowering Code Large Language Models with Evol-Instruct** `Preprint`
   [[Paper](https://arxiv.org/abs/2306.08568)] *Ziyang Luo, Can Xu, Pu Zhao, Qingfeng Sun, Xiubo Geng, Wenxiang Hu, Chongyang Tao, Jing Ma, Qingwei Lin, Daxin Jiang.* 2023.06
2. **OctoPack: Instruction Tuning Code Large Language Models** `Preprint`
   [[Paper](https://arxiv.org/abs/2308.07124)][[Repo](https://github.com/bigcode-project/octopack)] *Niklas Muennighoff, Qian Liu, Armel Zebaze, Qinkai Zheng, Binyuan Hui, Terry Yue Zhuo, Swayam Singh, Xiangru Tang, Leandro von Werra, Shayne Longpre.* 2023.08

### ▶️ Alignment with Feedback
1. **CodeRL: Mastering Code Generation through Pretrained Models and Deep Reinforcement Learning** `NeurIPS22`
   [[Paper](https://arxiv.org/abs/2207.01780)] *Hung Le, Yue Wang, Akhilesh Deepak Gotmare, Silvio Savarese, Steven C.H. Hoi.* 2022.07
2. **Execution-based Code Generation using Deep Reinforcement Learning** `TMLR23`
   [[Paper](https://arxiv.org/abs/2301.13816)] *Parshin Shojaee, Aneesh Jain, Sindhu Tipirneni, Chandan K. Reddy.* 2023.01
3. **RLTF: Reinforcement Learning from Unit Test Feedback** `Preprint`
   [[Paper](https://arxiv.org/abs/2307.04349)] *Jiate Liu, Yiqin Zhu, Kaiwen Xiao, Qiang Fu, Xiao Han, Wei Yang, Deheng Ye.* 2023.07
4. **PanGu-Coder2: Boosting Large Language Models for Code with Ranking Feedback** `Preprint`
   [[Paper](https://arxiv.org/abs/2307.14936)] *Bo Shen, Jiaxin Zhang, Taihong Chen, Daoguang Zan, Bing Geng, An Fu, Muhan Zeng, Ailun Yu, Jichuan Ji, Jingyang Zhao, Yuenan Guo, Qianxiang Wang.* 2023.07

### ▶️ Prompting
1. **CodeT: Code Generation with Generated Tests** `ICLR23`
   [[Paper](https://arxiv.org/abs/2207.10397)] *Bei Chen, Fengji Zhang, Anh Nguyen, Daoguang Zan, Zeqi Lin, Jian-Guang Lou, Weizhu Chen.* 2022.07
2. **Coder Reviewer Reranking for Code Generation** `ICML23`
   [[Paper](https://arxiv.org/abs/2211.16490)] *Tianyi Zhang, Tao Yu, Tatsunori B. Hashimoto, Mike Lewis, Wen-tau Yih, Daniel Fried, Sida I. Wang.* 2022.11
3. **LEVER: Learning to Verify Language-to-Code Generation with Execution** `ICML23`
   [[Paper](https://arxiv.org/abs/2302.08468)] *Ansong Ni, Srini Iyer, Dragomir Radev, Ves Stoyanov, Wen-tau Yih, Sida I. Wang, Xi Victoria Lin.* 2023.02
4. **Teaching Large Language Models to Self-Debug** `Preprint` (loop sketched after this list)
   [[Paper](https://arxiv.org/abs/2304.05128)] *Xinyun Chen, Maxwell Lin, Nathanael Schärli, Denny Zhou.* 2023.04
5. **Demystifying GPT Self-Repair for Code Generation** `Preprint`
   [[Paper](https://arxiv.org/abs/2306.09896)] *Theo X. Olausson, Jeevana Priya Inala, Chenglong Wang, Jianfeng Gao, Armando Solar-Lezama.* 2023.06
6. **SelfEvolve: A Code Evolution Framework via Large Language Models** `Preprint`
   [[Paper](https://arxiv.org/abs/2306.02907)] *Shuyang Jiang, Yuhao Wang, Yu Wang.* 2023.06
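
A common thread in the self-debugging and self-repair papers above is an execute-and-refine loop: sample a program, run it against unit tests, and feed any failure back to the model for a revision. The sketch below illustrates only that general pattern; the `llm` callable and the prompt wording are assumptions, not any paper's exact method.

```python
# Hypothetical execute-and-refine ("self-debug") loop. `llm` is an
# assumed text-completion callable; prompts are illustrative only.
from typing import Callable

def self_debug(llm: Callable[[str], str], task: str, tests: str,
               max_rounds: int = 3) -> str:
    code = llm(f"Write a Python function for this task:\n{task}")
    for _ in range(max_rounds):
        try:
            exec(code + "\n" + tests, {})  # run candidate against the tests
            return code  # all tests passed
        except Exception as err:
            # Feed the failure back and ask for a revision.
            code = llm(
                f"Task:\n{task}\n\nCode:\n{code}\n\n"
                f"It fails with {err!r}. Return a fixed version."
            )
    return code  # best effort after max_rounds
```
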
### ▶️ Evaluation & Benchmark

1. **Measuring Coding Challenge Competence With APPS** `NeurIPS21`
   > Named APPS
   [[Paper](https://arxiv.org/abs/2105.09938)][[Repo](https://github.com/hendrycks/apps)] *Dan Hendrycks, Steven Basart, Saurav Kadavath, Mantas Mazeika, Akul Arora, Ethan Guo, Collin Burns, Samir Puranik, Horace He, Dawn Song, Jacob Steinhardt.* 2021.05
2. **Program Synthesis with Large Language Models** `Preprint`
   > Named MBPP
   [[Paper](https://arxiv.org/abs/2108.07732)] *Jacob Austin, Augustus Odena, Maxwell Nye, Maarten Bosma, Henryk Michalewski, David Dohan, Ellen Jiang, Carrie Cai, Michael Terry, Quoc Le, Charles Sutton.* 2021.08
3. **DS-1000: A Natural and Reliable Benchmark for Data Science Code Generation** `ICML23`
   [[Paper](https://arxiv.org/abs/2211.11501)] *Yuhang Lai, Chengxi Li, Yiming Wang, Tianyi Zhang, Ruiqi Zhong, Luke Zettlemoyer, Scott Wen-tau Yih, Daniel Fried, Sida Wang, Tao Yu.* 2022.11
4. **RepoBench: Benchmarking Repository-Level Code Auto-Completion Systems** `Preprint`
   [[Paper](https://arxiv.org/abs/2306.03091)] *Tianyang Liu, Canwen Xu, Julian McAuley.* 2023.06
5. **Can ChatGPT replace StackOverflow? A Study on Robustness and Reliability of Large Language Model Code Generation** `Preprint`
   [[Paper](https://arxiv.org/abs/2308.10335)] *Li Zhong, Zilong Wang.* 2023.08
### ▶️ Using LLMs while coding
1. **Awesome-DevAI: A list of resources about using LLMs while building software** `Awesome`
   [[Repo](https://github.com/continuedev/Awesome-DevAI)] *Ty Dunn, Nate Sesti.* 2023.10
## 🙌 Contributors
This is an active repository and your contributions are always welcome! If you have any questions about this opinionated list, do not hesitate to contact me at `[email protected]`.
## Cite as
```
@software{awesome-code-llm,
author = {Binyuan Hui},
title = {An awesome and curated list of best code-LLM for research},
howpublished = {\url{https://github.com/huybery/Awesome-Code-LLM}},
year = 2023,
}
```

## Acknowledgement
This project is inspired by [Awesome-LLM](https://github.com/Hannibal046/Awesome-LLM).
## Star History
[![Star History Chart](https://api.star-history.com/svg?repos=huybery/Awesome-Code-LLM&type=Date)](https://star-history.com/#huybery/Awesome-Code-LLM&Date)
**[⬆ Back to ToC](#-table-of-contents)**