[ICLR 2024] Lemur: Open Foundation Models for Language Agents
https://github.com/OpenLemur/Lemur
# [Lemur: Open Foundation Models for Language Agents](https://arxiv.org/abs/2310.06830)


Lemur is an openly accessible language model optimized for both natural language and coding capabilities to serve as the backbone of versatile language agents.
As language models continue to evolve from conversational chatbots to functional agents that can act in the real world, they need both strong language understanding and the ability to execute actions. Lemur balances natural language and coding skills to enable agents to follow instructions, reason about tasks, and take grounded actions.



Please refer to our paper and code for more details:
- [[Paper](https://arxiv.org/abs/2310.06830)] Lemur: Harmonizing Natural Language and Code for Language Agents
- [[Blog](https://www.xlang.ai/blog/openlemur)] Introducing Lemur: Open Foundation Models for Language Agents

## 🔥 News
* **[18 October, 2023]:** 🎉 We open-sourced [code for OpenAgents](https://github.com/xlang-ai/OpenAgents): An Open Platform for Language Agents in the Wild.
* **[11 October, 2023]:** 🎉 We released the research paper and codebase. We will continue updating this repository.
* **[23 August, 2023]:** 🎉 We released the weights of [`OpenLemur/lemur-70b-v1`](https://huggingface.co/OpenLemur/lemur-70b-v1) and [`OpenLemur/lemur-70b-chat-v1`](https://huggingface.co/OpenLemur/lemur-70b-chat-v1)! Check them out on the [HuggingFace Hub](https://huggingface.co/OpenLemur).

## Models
We released our models on the HuggingFace Hub:
* [OpenLemur/lemur-70b-v1](https://huggingface.co/OpenLemur/lemur-70b-v1)
* [OpenLemur/lemur-70b-chat-v1](https://huggingface.co/OpenLemur/lemur-70b-chat-v1)

## Table of Contents
- [Lemur: Open Foundation Models for Language Agents](#lemur-open-foundation-models-for-language-agents)
- [🔥 News](#-news)
- [Models](#models)
- [Table of Contents](#table-of-contents)
- [Why Lemur?](#why-lemur)
- [Quickstart](#quickstart)
- [Setup](#setup)
- [Lemur-70B](#lemur-70b)
- [Lemur-70B-Chat](#lemur-70b-chat)
- [Training](#training)
- [Evaluation](#evaluation)
- [Foundational Abilities](#foundational-abilities)
- [Interactive Agent Skills](#interactive-agent-skills)
- [Deploy](#deploy)
- [MINT](#mint)
- [WebArena](#webarena)
- [InterCode](#intercode)
- [Citation](#citation)
- [Acknowledgements](#acknowledgements)

## Why Lemur?
Most existing open-source models specialize in either natural language or code. Lemur combines both strengths by:

- Pretraining Llama-2-70B on a 90B-token corpus with a 10:1 ratio of code to text, yielding Lemur-70B-v1
- Instruction-tuning Lemur-70B-v1 on 300K examples covering both text and code, yielding Lemur-70B-Chat-v1

This two-stage training produces state-of-the-art performance averaged across diverse language and coding benchmarks, surpassing other available open-source models and narrowing the gap between open-source and commercial models on agent abilities.
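As a quick sanity check on the numbers above, a 10:1 code-to-text mix over the 90B-token pretraining corpus works out to roughly 82B code tokens and 8B text tokens (illustrative arithmetic only, not figures from the paper):

```python
# Approximate token split implied by the 10:1 code-to-text ratio
# over the 90B-token pretraining corpus.
total_tokens_b = 90          # corpus size, in billions of tokens
code_ratio, text_ratio = 10, 1

code_tokens_b = total_tokens_b * code_ratio / (code_ratio + text_ratio)
text_tokens_b = total_tokens_b * text_ratio / (code_ratio + text_ratio)

print(f"code: ~{code_tokens_b:.1f}B tokens, text: ~{text_tokens_b:.1f}B tokens")
```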

## Quickstart

### Setup
First, create a conda environment and install PyTorch with the CUDA toolchain:

```bash
conda create -n xchat python=3.10
conda activate xchat
conda install pytorch==2.0.1 pytorch-cuda=11.8 -c pytorch -c nvidia
conda install -c "nvidia/label/cuda-11.8.0" cuda-nvcc
```
Then, install the xchat package:
```bash
git clone [email protected]:OpenLemur/Lemur.git
cd Lemur
pip install -e .
```

### Lemur-70B
The base model lemur-70b-v1 can be used as follows:

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("OpenLemur/lemur-70b-v1")
# 8-bit loading reduces memory usage; drop `load_in_8bit=True` for full precision
model = AutoModelForCausalLM.from_pretrained("OpenLemur/lemur-70b-v1", device_map="auto", load_in_8bit=True)

# Text Generation Example
prompt = "The world is "
inputs = tokenizer(prompt, return_tensors="pt")
output = model.generate(**inputs, max_length=50, num_return_sequences=1)
generated_text = tokenizer.decode(output[0], skip_special_tokens=True)
print(generated_text)

# Code Generation Example
prompt = """
def factorial(n):
    if n == 0:
        return 1
"""
inputs = tokenizer(prompt, return_tensors="pt")
output = model.generate(**inputs, max_length=200, num_return_sequences=1)
generated_code = tokenizer.decode(output[0], skip_special_tokens=True)
print(generated_code)
```
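For comparison, a correct completion of the factorial prompt above would look like the following (written by hand, not model output):

```python
def factorial(n):
    """Return n! for a non-negative integer n."""
    if n == 0:
        return 1
    return n * factorial(n - 1)

print(factorial(5))  # → 120
```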

### Lemur-70B-Chat
We instruction-finetuned the lemur-70b-v1 model with the ChatML format to obtain lemur-70b-chat-v1, which can be used as follows:

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("OpenLemur/lemur-70b-chat-v1")
model = AutoModelForCausalLM.from_pretrained("OpenLemur/lemur-70b-chat-v1", device_map="auto", load_in_8bit=True)

# Text Generation Example
prompt = """<|im_start|>system
You are a helpful, respectful, and honest assistant.
<|im_end|>
<|im_start|>user
What's a lemur's favorite fruit?<|im_end|>
<|im_start|>assistant
"""
inputs = tokenizer(prompt, return_tensors="pt")
output = model.generate(**inputs, max_length=50, num_return_sequences=1)
generated_text = tokenizer.decode(output[0], skip_special_tokens=True)
print(generated_text)

# Code Generation Example
prompt = """<|im_start|>system
Below is an instruction that describes a task. Write a response that appropriately completes the request.
<|im_end|>
<|im_start|>user
Write a Python function to merge two sorted lists into one sorted list without using any built-in sort functions.<|im_end|>
<|im_start|>assistant
"""
inputs = tokenizer(prompt, return_tensors="pt")
output = model.generate(**inputs, max_length=200, num_return_sequences=1)
generated_code = tokenizer.decode(output[0], skip_special_tokens=True)
print(generated_code)
```
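The ChatML prompts above follow a fixed pattern, so they can be assembled with a small helper. This function is a hypothetical convenience for illustration, not part of the xchat package, and exact whitespace around `<|im_end|>` may matter for your use case:

```python
def build_chatml_prompt(messages):
    """Assemble a ChatML prompt from (role, content) pairs and
    open an assistant turn for the model to complete."""
    parts = [f"<|im_start|>{role}\n{content}<|im_end|>" for role, content in messages]
    parts.append("<|im_start|>assistant\n")
    return "\n".join(parts)

prompt = build_chatml_prompt([
    ("system", "You are a helpful, respectful, and honest assistant."),
    ("user", "What's a lemur's favorite fruit?"),
])
print(prompt)
```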

## Training



## Evaluation
We evaluated Lemur across:
- 8 language and code datasets (e.g., MMLU, BBH, GSM8K, HumanEval, and Spider) to validate balanced capabilities
- 13 interactive agent datasets to test skills such as using tools, adapting to feedback from environments or humans, and exploring partially observable digital or physical environments





### Foundational Abilities
We build the evaluation suite on top of [open-instruct](https://github.com/allenai/open-instruct). We will keep adding more tasks and models.

Currently, we support the following tasks:
- [✅] [MMLU](./scripts/eval/mmlu.sh)
- [✅] [BBH](./scripts/eval/bbh.sh)
- [✅] [GSM8K](./scripts/eval/gsm8k.sh)
- [✅] [HumanEval](./scripts/eval/human_eval.sh)
- [✅] [MBPP](./scripts/eval/mbpp.sh)
- [🚧] [Spider]()
- [🚧] [MultiPL-E]()
- [🚧] [DS-1000]()
- [🚧] ...

### Interactive Agent Skills
We use the evaluation frameworks provided by [MINT](https://github.com/xingyaoww/mint-bench), [InterCode](https://github.com/princeton-nlp/intercode), and [WebArena](https://github.com/web-arena-x/webarena) to evaluate interactive agent skills.

#### Deploy
We use vLLM to serve the Lemur model. However, the official FastChat codebase does not yet support Lemur-Chat, so we provide a Docker setup to serve Lemur with vLLM. See [vllm_lemur.sh](./scripts/deploy/vllm_lemur.sh) for details.

```bash
bash scripts/deploy/vllm_lemur.sh
```
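vLLM exposes an OpenAI-compatible HTTP API, so once the server from the script above is running, a completion request can be composed and sent as JSON. The endpoint URL and port below are assumptions; adjust them to match your `vllm_lemur.sh` configuration:

```python
import json

# Hypothetical endpoint; adjust host/port to match vllm_lemur.sh
API_URL = "http://localhost:8000/v1/completions"
payload = {
    "model": "OpenLemur/lemur-70b-chat-v1",
    "prompt": "<|im_start|>user\nHello!<|im_end|>\n<|im_start|>assistant\n",
    "max_tokens": 128,
    "temperature": 0.7,
}
body = json.dumps(payload)
# POST `body` to API_URL with Content-Type: application/json
# (e.g. via urllib.request or the `requests` library) and parse the response.
print(body)
```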

#### MINT
We [forked the MINT](https://github.com/OpenLemur/mint-bench) codebase to share the configs we used. See [this config folder](https://github.com/OpenLemur/mint-bench/tree/main/configs) for details, and serve the model with the [`vllm_lemur.sh`](./scripts/deploy/vllm_lemur.sh) script.

#### WebArena
We [forked the WebArena](https://github.com/OpenLemur/webarena) codebase to enable vLLM-based evaluation; please refer to the fork for instructions on running the WebArena evaluation.

#### InterCode
We [forked the InterCode](https://github.com/OpenLemur/intercode) codebase and modified it to enable Lemur evaluation. See [this script folder](https://github.com/OpenLemur/intercode/tree/master/scripts) for details, and serve the model with `text-generation-inference` via the [`tgi_lemur.sh`](scripts/deploy/tgi_lemur.sh) script.

## Citation
If you find our work helpful, please cite us:
```
@misc{xu2023lemur,
  title={Lemur: Harmonizing Natural Language and Code for Language Agents},
  author={Yiheng Xu and Hongjin Su and Chen Xing and Boyu Mi and Qian Liu and Weijia Shi and Binyuan Hui and Fan Zhou and Yitao Liu and Tianbao Xie and Zhoujun Cheng and Siheng Zhao and Lingpeng Kong and Bailin Wang and Caiming Xiong and Tao Yu},
  year={2023},
  eprint={2310.06830},
  archivePrefix={arXiv},
  primaryClass={cs.CL}
}
```

## Acknowledgements

The Lemur project is an open collaborative research effort between [XLang Lab](https://www.xlang.ai/) and [Salesforce Research](https://www.salesforceairesearch.com/). We thank the following institutions for their gift support:

- Salesforce Research
- Google Research
- Amazon AWS