Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/salesforce/xgen
Salesforce open-source LLMs with 8k sequence length.
- Host: GitHub
- URL: https://github.com/salesforce/xgen
- Owner: salesforce
- License: apache-2.0
- Created: 2023-06-23T01:55:52.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2023-12-20T21:09:56.000Z (11 months ago)
- Last Synced: 2024-08-01T11:12:19.018Z (3 months ago)
- Topics: language-model, large-language-models, llm, nlp
- Language: Python
- Homepage:
- Size: 60.5 KB
- Stars: 713
- Watchers: 12
- Forks: 36
- Open Issues: 9
- Metadata Files:
- Readme: README.md
- License: LICENSE
- Code of conduct: CODE_OF_CONDUCT.md
- Codeowners: CODEOWNERS
- Security: SECURITY.md
Awesome Lists containing this project
- Awesome_Multimodel_LLM - XGen - Salesforce open-source LLMs with 8k sequence length. (Open Source LLM)
- awesome-rainmana - salesforce/xgen - Salesforce open-source LLMs with 8k sequence length. (Python)
README
# XGen
Official research release for the family of **XGen** models (`7B`) by Salesforce AI Research:
*Title*: [Long Sequence Modeling with XGen: A 7B LLM Trained on 8K Input Sequence Length](https://arxiv.org/abs/2309.03450)
*Authors*: [Erik Nijkamp](https://eriknijkamp.com)\*, Tian Xie\*, [Hiroaki Hayashi](https://hiroakih.me/)\*, [Bo Pang](https://scholar.google.com/citations?user=s9fNEVEAAAAJ&hl=en)\*, Congying Xia\*, Chen Xing, Jesse Vig, Semih Yavuz, Philippe Laban, Ben Krause, Senthil Purushwalkam, Tong Niu, Wojciech Kryscinski, Lidiya Murakhovs'ka, Prafulla Kumar Choubey, Alex Fabbri, Ye Liu, Rui Meng, Lifu Tu, Meghana Bhat, [Chien-Sheng Wu](https://jasonwu0731.github.io/), Silvio Savarese, [Yingbo Zhou](https://scholar.google.com/citations?user=H_6RQ7oAAAAJ&hl=en), [Shafiq Rayhan Joty](https://raihanjoty.github.io/), [Caiming Xiong](http://cmxiong.com/).
(* indicates equal contribution)
Correspondence to: [Shafiq Rayhan Joty](mailto:[email protected]), [Caiming Xiong](mailto:[email protected])
## Models
Model cards are published on the HuggingFace Hub:
* [XGen-7B-4K-Base](https://huggingface.co/Salesforce/xgen-7b-4k-base) with support for 4K sequence length.
* [XGen-7B-8K-Base](https://huggingface.co/Salesforce/xgen-7b-8k-base) with support for 8K sequence length.
* [XGen-7B-8k-Inst](https://huggingface.co/Salesforce/xgen-7b-8k-inst) with instruction finetuning (for research purposes only).

The tokenization uses the OpenAI Tiktoken package, which can be installed via `pip`:
```sh
pip install tiktoken
```

The models can be used as auto-regressive samplers as follows:
```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("Salesforce/xgen-7b-8k-base", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("Salesforce/xgen-7b-8k-base", torch_dtype=torch.bfloat16)
inputs = tokenizer("The world is", return_tensors="pt")
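# The 8K checkpoints are trained on inputs of up to 8K tokens; for long prompts it can
# help to check the tokenized length before generating, e.g.:
#     assert inputs["input_ids"].shape[1] <= 8192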
sample = model.generate(**inputs, max_length=128)
print(tokenizer.decode(sample[0]))
```
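
The snippet above uses the 8K base model. For the instruction-finetuned checkpoint listed earlier, a prompt template is typically required; the sketch below assumes a Vicuna-style `### Human:` / `### Assistant:` format rather than a documented template, so check the XGen-7B-8k-Inst model card before relying on it:

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

# Assumed usage of the instruction-finetuned checkpoint; the "### Human:" /
# "### Assistant:" template below is a guess and may not match the format
# used during finetuning -- consult the model card on the HuggingFace Hub.
tokenizer = AutoTokenizer.from_pretrained("Salesforce/xgen-7b-8k-inst", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("Salesforce/xgen-7b-8k-inst", torch_dtype=torch.bfloat16)

prompt = "### Human: Summarize the idea of long-sequence modeling in one sentence.\n### Assistant:"
inputs = tokenizer(prompt, return_tensors="pt")
sample = model.generate(**inputs, max_new_tokens=128, do_sample=True, top_p=0.9)
print(tokenizer.decode(sample[0], skip_special_tokens=True))
```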
## Citation

```bibtex
@misc{XGen,
title={Long Sequence Modeling with XGen: A 7B LLM Trained on 8K Input Sequence Length},
author={Erik Nijkamp and Tian Xie and Hiroaki Hayashi and Bo Pang and Congying Xia and Chen Xing and Jesse Vig and Semih Yavuz and Philippe Laban and Ben Krause and Senthil Purushwalkam and Tong Niu and Wojciech Kryscinski and Lidiya Murakhovs'ka and Prafulla Kumar Choubey and Alex Fabbri and Ye Liu and Rui Meng and Lifu Tu and Meghana Bhat and Chien-Sheng Wu and Silvio Savarese and Yingbo Zhou and Shafiq Rayhan Joty and Caiming Xiong},
howpublished={ArXiv},
year={2023},
url={https://arxiv.org/abs/2309.03450}
}
```