Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/salesforce/CodeGen
CodeGen is a family of open-source models for program synthesis. Trained on TPU-v4. Competitive with OpenAI Codex.
codex generativemodel languagemodel llm programsynthesis tpu-acceleration
Last synced: 5 days ago
CodeGen is a family of open-source models for program synthesis. Trained on TPU-v4. Competitive with OpenAI Codex.
- Host: GitHub
- URL: https://github.com/salesforce/CodeGen
- Owner: salesforce
- License: apache-2.0
- Created: 2022-03-28T20:48:29.000Z (over 2 years ago)
- Default Branch: main
- Last Pushed: 2024-03-17T22:00:24.000Z (8 months ago)
- Last Synced: 2024-10-29T15:35:06.910Z (6 days ago)
- Topics: codex, generativemodel, languagemodel, llm, programsynthesis, tpu-acceleration
- Language: Python
- Homepage:
- Size: 1.35 MB
- Stars: 4,918
- Watchers: 81
- Forks: 380
- Open Issues: 43
Metadata Files:
- Readme: README.md
- Contributing: CONTRIBUTING.md
- License: LICENSE.txt
- Code of conduct: CODE_OF_CONDUCT.md
- Codeowners: CODEOWNERS
- Security: SECURITY.md
Awesome Lists containing this project
- awesome-ai-coding - CodeGen 350M/2B/6B/16B
- awesome-coding-assistants - CodeGen
- awesome-llmops - CodeGen - A family of open-source models for program synthesis. Trained on TPU-v4. Competitive with OpenAI Codex. | ![GitHub Badge](https://img.shields.io/github/stars/salesforce/CodeGen.svg?style=flat-square) | (Code AI / Vector search)
- ai-game-devtools - CodeGen - A family of open-source models for program synthesis. Trained on TPU-v4. Competitive with OpenAI Codex. | [arXiv](https://arxiv.org/abs/2203.13474) | | Code | (Code / Tool (AI LLM))
- awesome-code-ai - Salesforce CodeGen (open-source) (Code completion LLMs)
- StarryDivineSky - salesforce/CodeGen - A family of open-source models for program synthesis, trained on TPU-v4 and competitive with OpenAI Codex. (Text generation, text dialogue / large language dialogue models and data)
- my-awesome - salesforce/CodeGen - tpu-acceleration pushed_at:2024-03 star:4.9k fork:0.4k CodeGen is a family of open-source models for program synthesis. Trained on TPU-v4. Competitive with OpenAI Codex. (Python)
README
# CodeGen
Official release for the **CodeGen1** and **CodeGen2** models (`350M`, `1B`, `3B`, `7B`, `16B`) for **Program Synthesis** by [Salesforce AI Research](https://www.salesforceairesearch.com/).
## News
**July 2023**
[**CodeGen2.5**](https://github.com/salesforce/CodeGen/tree/main/codegen25) released, outperforming 16B-parameter models with only 7B parameters.
**May 2023**
**CodeGen2.0** released with strong infill sampling capability.
**March 2022**
**CodeGen1.0** released, on par with OpenAI Codex at the time.
## Publications
[CodeGen: An Open Large Language Model for Code with Multi-Turn Program Synthesis](https://arxiv.org/abs/2203.13474)
[Erik Nijkamp](https://enijkamp.github.io/)\*, [Bo Pang](https://scholar.google.com/citations?user=s9fNEVEAAAAJ&hl=en)\*, [Hiroaki Hayashi](https://hiroakih.me/)\*, [Lifu Tu](https://home.ttic.edu/~lifu/), [Huan Wang](https://scholar.google.com/citations?user=7NpTttkAAAAJ&hl=en), [Yingbo Zhou](https://scholar.google.com/citations?user=H_6RQ7oAAAAJ&hl=en), [Silvio Savarese](https://scholar.google.com/citations?user=ImpbxLsAAAAJ&hl=en), and [Caiming Xiong](https://scholar.google.com/citations?user=vaSdahkAAAAJ&hl=en)
ICLR, 2023

[CodeGen2: Lessons for Training LLMs on Programming and Natural Languages](https://arxiv.org/abs/2305.02309)
[Erik Nijkamp](https://enijkamp.github.io/)\*, [Hiroaki Hayashi](https://hiroakih.me/)\*, [Caiming Xiong](https://scholar.google.com/citations?user=vaSdahkAAAAJ&hl=en), [Silvio Savarese](https://scholar.google.com/citations?user=ImpbxLsAAAAJ&hl=en), and [Yingbo Zhou](https://scholar.google.com/citations?user=H_6RQ7oAAAAJ&hl=en)
ICLR, 2023

## Usage
The models are available on the [Hugging Face Hub](https://huggingface.co/models?search=salesforce+codegen).
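Checkpoints come in several variants (for example `mono`, `multi`, and `nl` for CodeGen1). As a convenience beyond the original README, here is a minimal sketch for enumerating them programmatically, assuming the `huggingface_hub` client library is installed:

```python
# Sketch (not from the README): list CodeGen checkpoints published on the Hub.
from huggingface_hub import HfApi

for model in HfApi().list_models(author="Salesforce", search="codegen"):
    print(model.id)  # e.g. Salesforce/codegen-2B-mono, Salesforce/codegen2-7B, ...
```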
**CodeGen1.0**
```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("Salesforce/codegen-2B-mono")
model = AutoModelForCausalLM.from_pretrained("Salesforce/codegen-2B-mono")
inputs = tokenizer("# this function prints hello world", return_tensors="pt")
sample = model.generate(**inputs, max_length=128)
print(tokenizer.decode(sample[0], truncate_before_pattern=[r"\n\n^#", "^'''", "\n\n\n"]))
```
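CodeGen1 was trained for multi-turn program synthesis, so a completion can be refined by appending a follow-up comment to the previous turn's output and generating again. A hedged sketch of that loop, reusing the decoding call above (the 350M checkpoint and the `complete` helper are illustrative choices, not from the README):

```python
# Sketch of multi-turn prompting: generate from a comment, then refine by
# extending the accepted context with a follow-up comment.
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("Salesforce/codegen-350M-mono")
model = AutoModelForCausalLM.from_pretrained("Salesforce/codegen-350M-mono")

def complete(prompt: str, max_length: int = 128) -> str:
    inputs = tokenizer(prompt, return_tensors="pt")
    sample = model.generate(**inputs, max_length=max_length)
    return tokenizer.decode(
        sample[0], truncate_before_pattern=[r"\n\n^#", "^'''", "\n\n\n"]
    )

# Turn 1: describe the function to synthesize.
context = complete("# write a function that returns the nth fibonacci number")
# Turn 2: append a follow-up instruction to the prior turn and generate again.
context = complete(context + "\n\n# now print the 10th fibonacci number", max_length=256)
print(context)
```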
**CodeGen2.0**

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("Salesforce/codegen2-7B")
model = AutoModelForCausalLM.from_pretrained("Salesforce/codegen2-7B", trust_remote_code=True, revision="main")
inputs = tokenizer("# this function prints hello world", return_tensors="pt")
sample = model.generate(**inputs, max_length=128)
print(tokenizer.decode(sample[0], truncate_before_pattern=[r"\n\n^#", "^'''", "\n\n\n"]))
```
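CodeGen2's headline feature is infill sampling. The sentinel-token format below follows the CodeGen2 model card (an assumption beyond this README): `<mask_1>` marks the span to fill, `<|endoftext|>` closes the visible context, and `<sep><mask_1>` asks the model to emit the missing span. A sketch:

```python
# Sketch of infill sampling with CodeGen2, per the model card's sentinel format.
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("Salesforce/codegen2-7B")
model = AutoModelForCausalLM.from_pretrained("Salesforce/codegen2-7B", trust_remote_code=True, revision="main")

# The function body between prefix and suffix is what the model should fill in.
prefix = "def hello_world():\n    "
suffix = "    return name"
text = prefix + "<mask_1>" + suffix + "<|endoftext|>" + "<sep>" + "<mask_1>"

inputs = tokenizer(text, return_tensors="pt")
sample = model.generate(**inputs, max_length=128)
# Text generated after the prompt is the proposed infill
# (slicing by character length is approximate).
print(tokenizer.decode(sample[0])[len(text):])
```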
**CodeGen2.5**

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("Salesforce/codegen25-7b-mono", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("Salesforce/codegen25-7b-mono")
inputs = tokenizer("# this function prints hello world", return_tensors="pt")
sample = model.generate(**inputs, max_length=128)
print(tokenizer.decode(sample[0]))
```
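For the larger checkpoints, it is common to load the weights in half precision on a GPU and enable sampling for more varied completions. A minimal sketch under those assumptions (the device, dtype, and sampling parameters are illustrative, not prescribed by the README):

```python
# Sketch (not from the README): half-precision GPU loading plus sampled decoding.
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

device = "cuda" if torch.cuda.is_available() else "cpu"
tokenizer = AutoTokenizer.from_pretrained("Salesforce/codegen25-7b-mono", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    "Salesforce/codegen25-7b-mono",
    torch_dtype=torch.float16 if device == "cuda" else torch.float32,
).to(device)

inputs = tokenizer("# this function prints hello world", return_tensors="pt").to(device)
sample = model.generate(**inputs, do_sample=True, temperature=0.2, top_p=0.95, max_length=128)
print(tokenizer.decode(sample[0]))
```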
## Training

The Jaxformer library for data pre-processing, training, and fine-tuning the CodeGen models can be found here:
https://github.com/salesforce/jaxformer
## Citation
If you find our code or paper useful, please cite the paper:
```bibtex
@article{nijkamp2022codegen,
  title={CodeGen: An Open Large Language Model for Code with Multi-Turn Program Synthesis},
  author={Nijkamp, Erik and Pang, Bo and Hayashi, Hiroaki and Tu, Lifu and Wang, Huan and Zhou, Yingbo and Savarese, Silvio and Xiong, Caiming},
  journal={ICLR},
  year={2023}
}

@article{nijkamp2023codegen2,
  title={CodeGen2: Lessons for Training LLMs on Programming and Natural Languages},
  author={Nijkamp, Erik and Hayashi, Hiroaki and Xiong, Caiming and Savarese, Silvio and Zhou, Yingbo},
  journal={ICLR},
  year={2023}
}
```