https://github.com/huybery/Awesome-Code-LLM

👨‍💻 An awesome, curated list of the best code LLMs for research.
# 👨‍💻 Awesome Code LLM




![](code-banner.png)

## 🧵 Table of Contents

- [🧵 Table of Contents](#-table-of-contents)
- [🚀 Leaderboard](#-leaderboard)
- [💡 Evaluation Toolkit](#-evaluation-toolkit)
- [📚 Paper](#-paper)
  - [▶️ Pre-Training](#️-pre-training)
  - [▶️ Instruction Tuning](#️-instruction-tuning)
  - [▶️ Alignment with Feedback](#️-alignment-with-feedback)
  - [▶️ Prompting](#️-prompting)
  - [▶️ Evaluation & Benchmark](#️-evaluation--benchmark)
  - [▶️ Using LLMs while coding](#️-using-llms-while-coding)
- [🙌 Contributors](#-contributors)
- [Cite as](#cite-as)
- [Acknowledgement](#acknowledgement)
- [Star History](#star-history)

## 🚀 Leaderboard

Central Leaderboard (sorted by HumanEval Pass@1, in %)

| Model | Params | HumanEval | MBPP | HF | Source |
| ------------------------ | ------ | --------- | ---- | ------------------------------------------------------------- | ------------------------------------------------------- |
| GPT-4 + Reflexion | ? | 91.0 | 77.1 | | [paper](https://arxiv.org/abs/2303.11366) |
| GPT-4 (latest) | ? | 84.1 | 80.0 | | [github](https://github.com/deepseek-ai/DeepSeek-Coder) |
| DeepSeek-Coder-Instruct | 33B | 79.3 | 70.0 | [ckpt](https://hf.co/deepseek-ai/deepseek-coder-33b-instruct) | [github](https://github.com/deepseek-ai/DeepSeek-Coder) |
| DeepSeek-Coder-Instruct | 6.7B | 78.6 | 65.4 | [ckpt](https://hf.co/deepseek-ai/deepseek-coder-6.7b-instruct) | [github](https://github.com/deepseek-ai/DeepSeek-Coder) |
| GPT-3.5-Turbo (latest) | ? | 76.2 | 70.8 | | [github](https://github.com/deepseek-ai/DeepSeek-Coder) |
| Code-Llama | 34B | 62.2 | 61.2 | | [paper](https://arxiv.org/abs/2308.12950) |
| PanGu-Coder2 | 15B | 61.6 | | | [paper](https://arxiv.org/abs/2307.14936) |
| WizardCoder-15B | 15B | 57.3 | 51.8 | [ckpt](https://hf.co/WizardLM/WizardCoder-15B-V1.0) | [paper](https://arxiv.org/abs/2306.08568) |
| Code-Davinci-002 | ? | 47.0 | | | [paper](https://arxiv.org/abs/2107.03374) |
| StarCoder-15B (Prompted) | 15B | 40.8 | 49.5 | [ckpt](https://hf.co/bigcode/starcoder) | [paper](https://arxiv.org/abs/2305.06161) |
| PaLM 2-S | ? | 37.6 | 50.0 | | [paper](https://arxiv.org/abs/2305.10403) |
| PaLM-Coder-540B | 540B | 36.0 | 47.0 | | [paper](https://arxiv.org/abs/2204.02311) |
| InstructCodeT5+ | 16B | 35.0 | | | [paper](https://arxiv.org/abs/2305.07922) |
| StarCoder-15B | 15B | 33.6 | 52.7 | [ckpt](https://hf.co/bigcode/starcoder) | [paper](https://arxiv.org/abs/2305.06161) |
| Code-Cushman-001 | ? | 33.5 | 45.9 | | [paper](https://arxiv.org/abs/2107.03374) |
| CodeT5+ | 16B | 30.9 | | | [paper](https://arxiv.org/abs/2305.07922) |
| LLaMA2-70B | 70B | 29.9 | | [ckpt](https://hf.co/meta-llama/Llama-2-70b-hf) | [paper](https://arxiv.org/abs/2307.09288) |
| CodeGen-16B-Mono | 16B | 29.3 | 35.3 | | [paper](https://arxiv.org/abs/2203.13474) |
| PaLM-540B | 540B | 26.2 | 36.8 | | [paper](https://arxiv.org/abs/2204.02311) |
| LLaMA-65B | 65B | 23.7 | 37.7 | | [paper](https://arxiv.org/abs/2302.13971) |
| CodeGeeX | 13B | 22.9 | 24.4 | | [paper](https://arxiv.org/abs/2303.17568) |
| LLaMA-33B | 33B | 21.7 | 30.2 | | [paper](https://arxiv.org/abs/2302.13971) |
| CodeGen-16B-Multi | 16B | 18.3 | 20.9 | | [paper](https://arxiv.org/abs/2203.13474) |
| AlphaCode | 1.1B | 17.1 | | | [paper](https://arxiv.org/abs/2203.07814) |
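
The Pass@1 column above is a special case of the pass@k metric defined in the Codex paper (*Evaluating Large Language Models Trained on Code*, listed under Pre-Training). As a minimal sketch, assuming you have drawn `n` samples per problem and counted `c` that pass the unit tests, the unbiased estimator `1 - C(n-c, k) / C(n, k)` could be computed as below. The function names `pass_at_k` and `mean_pass_at_k` are illustrative and do not come from any of the listed toolkits.

```python
from math import comb
from typing import Iterable, Tuple


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate for one problem.

    n: number of samples generated for the problem
    c: number of those samples that passed all tests
    k: sample budget being evaluated (1 <= k <= n)
    """
    if not 0 <= c <= n:
        raise ValueError("c must satisfy 0 <= c <= n")
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    # With fewer than k failing samples, every size-k subset contains a pass.
    if n - c < k:
        return 1.0
    # Probability that a random size-k subset contains no passing sample,
    # subtracted from 1.
    return 1.0 - comb(n - c, k) / comb(n, k)


def mean_pass_at_k(results: Iterable[Tuple[int, int]], k: int) -> float:
    """Average pass@k over (n, c) pairs, one pair per benchmark problem."""
    scores = [pass_at_k(n, c, k) for n, c in results]
    if not scores:
        raise ValueError("results must not be empty")
    return sum(scores) / len(scores)
```

For `k = 1` the estimate reduces to `c / n`, so Pass@1 is simply the fraction of samples that pass, averaged over problems. Reported scores also depend on sampling settings such as temperature, so numbers from different sources are not always directly comparable.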

| Leaderboard | Access |
| :----------------------------------: | ----------------------------------------------------------------------------------|
| Big Code Models Leaderboard | [[Source](https://huggingface.co/spaces/bigcode/bigcode-models-leaderboard)] |
| BIRD | [[Source](https://bird-bench.github.io)] |
| CanAiCode Leaderboard | [[Source](https://huggingface.co/spaces/mike-ravkine/can-ai-code-results)] |
| Coding LLMs Leaderboard | [[Source](https://leaderboard.tabbyml.com)] |
| CRUXEval Leaderboard | [[Source](https://crux-eval.github.io/leaderboard.html)] |
| EvalPlus | [[Source](https://evalplus.github.io/leaderboard.html)] |
| HumanEval.jl | [[Source](https://github.com/01-ai/HumanEval.jl)] |
| InfiCoder-Eval | [[Source](https://infi-coder.github.io/inficoder-eval)] |
| InterCode | [[Source](https://intercode-benchmark.github.io)] |
| Program Synthesis Models Leaderboard | [[Source](https://accubits.com/open-source-program-synthesis-models-leaderboard)] |
| Spider | [[Source](https://yale-lily.github.io/spider)] |

## 💡 Evaluation Toolkit

- [bigcode-evaluation-harness](https://github.com/bigcode-project/bigcode-evaluation-harness): A framework for the evaluation of autoregressive code generation language models.
- [code-eval](https://github.com/abacaj/code-eval): A framework for the evaluation of autoregressive code generation language models on HumanEval.
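
Toolkits like these score a model by executing its generated code against unit tests. The sketch below illustrates that general idea only; it is not the API of either project, and the helper name `passes_tests` and its parameters are hypothetical. It runs a candidate solution followed by its tests in a separate Python process and treats a zero exit code as a pass.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def passes_tests(candidate: str, test_code: str, timeout: float = 10.0) -> bool:
    """Return True if `candidate` followed by `test_code` exits with code 0.

    A separate process contains crashes and infinite loops, but it is NOT
    a security sandbox: do not run untrusted model output this way without
    further isolation (container, VM, restricted user).
    """
    program = candidate + "\n\n" + test_code + "\n"
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "candidate_test.py"
        path.write_text(program, encoding="utf-8")
        try:
            proc = subprocess.run(
                [sys.executable, str(path)],
                capture_output=True,
                timeout=timeout,
                cwd=tmp,
            )
        except subprocess.TimeoutExpired:
            # A candidate that does not finish in time counts as a failure.
            return False
    return proc.returncode == 0


if __name__ == "__main__":
    tests = "assert add(1, 2) == 3\nassert add(-1, 1) == 0"
    good = "def add(a, b):\n    return a + b"
    bad = "def add(a, b):\n    return a - b"
    print(passes_tests(good, tests))
    print(passes_tests(bad, tests))
```

Counting how many of a problem's samples pass such a check gives the per-problem pass count that a pass@k estimate is built from.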

## 📚 Paper

### ▶️ Pre-Training

1. **Evaluating Large Language Models Trained on Code** `Preprint`

[[Paper](https://arxiv.org/abs/2107.03374)] *Mark Chen, Jerry Tworek, Heewoo Jun, Qiming Yuan, Henrique Ponde de Oliveira Pinto, et al.* 2021.07

2. **CodeGen: An Open Large Language Model for Code with Multi-Turn Program Synthesis** `ICLR23`

[[Paper](https://arxiv.org/abs/2203.13474)] *Erik Nijkamp, Bo Pang, Hiroaki Hayashi, Lifu Tu, Huan Wang, Yingbo Zhou, Silvio Savarese, Caiming Xiong.* 2022.03

3. **ERNIE-Code: Beyond English-Centric Cross-lingual Pretraining for Programming Languages** `ACL23 (Findings)`

[[Paper](https://aclanthology.org/2023.findings-acl.676.pdf)][[Repo](https://github.com/PaddlePaddle/PaddleNLP/tree/develop/model_zoo/ernie-code)] *Yekun Chai, Shuohuan Wang, Chao Pang, Yu Sun, Hao Tian, Hua Wu.* 2022.12

4. **SantaCoder: don't reach for the stars!** `Preprint`

[[Paper](https://arxiv.org/abs/2301.03988)] *Loubna Ben Allal, Raymond Li, Denis Kocetkov, Chenghao Mou, Christopher Akiki, Carlos Munoz Ferrandis, Niklas Muennighoff, et al.* 2023.01

5. **CodeGeeX: A Pre-Trained Model for Code Generation with Multilingual Evaluations on HumanEval-X** `Preprint`

[[Paper](https://arxiv.org/abs/2303.17568)] *Qinkai Zheng, Xiao Xia, Xu Zou, Yuxiao Dong, Shan Wang, Yufei Xue, Zihan Wang, Lei Shen, Andi Wang, Yang Li, Teng Su, Zhilin Yang, Jie Tang.* 2023.03

6. **CodeGen2: Lessons for Training LLMs on Programming and Natural Languages** `Preprint`

[[Paper](https://arxiv.org/abs/2305.02309)] *Erik Nijkamp, Hiroaki Hayashi, Caiming Xiong, Silvio Savarese, Yingbo Zhou.* 2023.05

7. **StarCoder: may the source be with you!** `Preprint`

[[Paper](https://arxiv.org/abs/2305.06161)] *Raymond Li, Loubna Ben Allal, Yangtian Zi, Niklas Muennighoff, Denis Kocetkov, Chenghao Mou, et al.* 2023.05

8. **CodeT5+: Open Code Large Language Models for Code Understanding and Generation** `Preprint`

[[Paper](https://arxiv.org/abs/2305.07922)] *Yue Wang, Hung Le, Akhilesh Deepak Gotmare, Nghi D.Q. Bui, Junnan Li, Steven C.H. Hoi.* 2023.05

9. **Textbooks Are All You Need** `Preprint`

[[Paper](https://arxiv.org/abs/2306.11644)] *Suriya Gunasekar, Yi Zhang, Jyoti Aneja, Caio César Teodoro Mendes, Allie Del Giorno, Sivakanth Gopi, et al.* 2023.06

10. **Code Llama: Open Foundation Models for Code** `Preprint`

[[Paper](https://arxiv.org/abs/2308.12950)] *Baptiste Rozière, Jonas Gehring, Fabian Gloeckle, Sten Sootla, Itai Gat, et al.* 2023.08

### ▶️ Instruction Tuning

1. **WizardCoder: Empowering Code Large Language Models with Evol-Instruct** `Preprint`

[[Paper](https://arxiv.org/abs/2306.08568)] *Ziyang Luo, Can Xu, Pu Zhao, Qingfeng Sun, Xiubo Geng, Wenxiang Hu, Chongyang Tao, Jing Ma, Qingwei Lin, Daxin Jiang.* 2023.06

2. **OctoPack: Instruction Tuning Code Large Language Models** `Preprint`

[[Paper](https://arxiv.org/abs/2308.07124)][[Repo](https://github.com/bigcode-project/octopack)] *Niklas Muennighoff, Qian Liu, Armel Zebaze, Qinkai Zheng, Binyuan Hui, Terry Yue Zhuo, Swayam Singh, Xiangru Tang, Leandro von Werra, Shayne Longpre.* 2023.08

### ▶️ Alignment with Feedback

1. **CodeRL: Mastering Code Generation through Pretrained Models and Deep Reinforcement Learning** `NeurIPS22`

[[Paper](https://arxiv.org/abs/2207.01780)] *Hung Le, Yue Wang, Akhilesh Deepak Gotmare, Silvio Savarese, Steven C.H. Hoi.* 2022.07

2. **Execution-based Code Generation using Deep Reinforcement Learning** `TMLR23`

[[Paper](https://arxiv.org/abs/2301.13816)] *Parshin Shojaee, Aneesh Jain, Sindhu Tipirneni, Chandan K. Reddy.* 2023.01

3. **RLTF: Reinforcement Learning from Unit Test Feedback** `Preprint`

[[Paper](https://arxiv.org/abs/2307.04349)] *Jiate Liu, Yiqin Zhu, Kaiwen Xiao, Qiang Fu, Xiao Han, Wei Yang, Deheng Ye.* 2023.07

4. **PanGu-Coder2: Boosting Large Language Models for Code with Ranking Feedback** `Preprint`

[[Paper](https://arxiv.org/abs/2307.14936)] *Bo Shen, Jiaxin Zhang, Taihong Chen, Daoguang Zan, Bing Geng, An Fu, Muhan Zeng, Ailun Yu, Jichuan Ji, Jingyang Zhao, Yuenan Guo, Qianxiang Wang.* 2023.07

### ▶️ Prompting

1. **CodeT: Code Generation with Generated Tests** `ICLR23`

[[Paper](https://arxiv.org/abs/2207.10397)] *Bei Chen, Fengji Zhang, Anh Nguyen, Daoguang Zan, Zeqi Lin, Jian-Guang Lou, Weizhu Chen.* 2022.07

2. **Coder Reviewer Reranking for Code Generation** `ICML23`

[[Paper](https://arxiv.org/abs/2211.16490)] *Tianyi Zhang, Tao Yu, Tatsunori B Hashimoto, Mike Lewis, Wen-tau Yih, Daniel Fried, Sida I Wang.* 2022.11

3. **LEVER: Learning to Verify Language-to-Code Generation with Execution** `ICML23`

[[Paper](https://arxiv.org/abs/2302.08468)] *Ansong Ni, Srini Iyer, Dragomir Radev, Ves Stoyanov, Wen-tau Yih, Sida I. Wang, Xi Victoria Lin.* 2023.02

4. **Teaching Large Language Models to Self-Debug** `Preprint`

[[Paper](https://arxiv.org/abs/2304.05128)] *Xinyun Chen, Maxwell Lin, Nathanael Schärli, Denny Zhou.* 2023.04

5. **Demystifying GPT Self-Repair for Code Generation** `Preprint`

[[Paper](https://arxiv.org/abs/2306.09896)] *Theo X. Olausson, Jeevana Priya Inala, Chenglong Wang, Jianfeng Gao, Armando Solar-Lezama.* 2023.06

6. **SelfEvolve: A Code Evolution Framework via Large Language Models** `Preprint`

[[Paper](https://arxiv.org/abs/2306.02907)] *Shuyang Jiang, Yuhao Wang, Yu Wang.* 2023.06

### ▶️ Evaluation & Benchmark

1. **Measuring Coding Challenge Competence With APPS** `NeurIPS21`

> Named APPS

[[Paper](https://arxiv.org/abs/2105.09938)][[Repo](https://github.com/hendrycks/apps)] *Dan Hendrycks, Steven Basart, Saurav Kadavath, Mantas Mazeika, Akul Arora, Ethan Guo, Collin Burns, Samir Puranik, Horace He, Dawn Song, Jacob Steinhardt.* 2021.05

2. **Program Synthesis with Large Language Models** `Preprint`

> Named MBPP

[[Paper](https://arxiv.org/abs/2108.07732)] *Jacob Austin, Augustus Odena, Maxwell Nye, Maarten Bosma, Henryk Michalewski, David Dohan, Ellen Jiang, Carrie Cai, Michael Terry, Quoc Le, Charles Sutton.* 2021.08

3. **DS-1000: A Natural and Reliable Benchmark for Data Science Code Generation** `ICML23`

[[Paper](https://arxiv.org/abs/2211.11501)] *Yuhang Lai, Chengxi Li, Yiming Wang, Tianyi Zhang, Ruiqi Zhong, Luke Zettlemoyer, Scott Wen-tau Yih, Daniel Fried, Sida Wang, Tao Yu.* 2022.11

4. **RepoBench: Benchmarking Repository-Level Code Auto-Completion Systems** `Preprint`

[[Paper](https://arxiv.org/abs/2306.03091)] *Tianyang Liu, Canwen Xu, Julian McAuley.* 2023.06

5. **Can ChatGPT replace StackOverflow? A Study on Robustness and Reliability of Large Language Model Code Generation** `Preprint`

[[Paper](https://arxiv.org/abs/2308.10335)] *Li Zhong, Zilong Wang.* 2023.08

### ▶️ Using LLMs while coding

1. **Awesome-DevAI: A list of resources about using LLMs while building software** `Awesome`

[[Repo](https://github.com/continuedev/Awesome-DevAI)] *Ty Dunn, Nate Sesti.* 2023.10

## 🙌 Contributors




This is an active repository, and your contributions are always welcome! If you have any questions about this opinionated list, do not hesitate to contact me at `[email protected]`.

## Cite as

```
@software{awesome-code-llm,
author = {Binyuan Hui},
title = {An awesome and curated list of best code-LLM for research},
howpublished = {\url{https://github.com/huybery/Awesome-Code-LLM}},
year = 2023,
}
```

## Acknowledgement

This project is inspired by [Awesome-LLM](https://github.com/Hannibal046/Awesome-LLM).

## Star History

[![Star History Chart](https://api.star-history.com/svg?repos=huybery/Awesome-Code-LLM&type=Date)](https://star-history.com/#huybery/Awesome-Code-LLM&Date)

**[⬆ Back to ToC](#-table-of-contents)**