An open API service indexing awesome lists of open source software.

https://github.com/tensorzero/llmgym


https://github.com/tensorzero/llmgym

Last synced: 3 months ago
JSON representation

Awesome Lists containing this project

README

        

> [!IMPORTANT]
>
> **This repository is still under active development. Expect breaking changes.**

# LLM Gym

LLM Gym is a unified environment interface for developing and benchmarking LLM applications that learn from feedback. Think [gym](https://gymnasium.farama.org/) for LLM agents.

As the space of benchmarks rapidly grows, fair and comprehensive comparisons are getting trickier, so we aim to make that easier for you. The vision is an intuitive interface for a suite of environments you can seamlessly swap out for research and development purposes.

## Installation

Follow these steps to set up the development environment for LLM Gym using uv for virtual environment management and Hatch (with Hatchling) for building and packaging.

### Prerequisites

- Python 3.10 (or a compatible version, e.g., >=3.10, <4.0)
- [uv](https://docs.astral.sh/uv/getting-started/installation/) – an extremely fast Python package manager and virtual environment tool

### Steps

#### 1. Clone the Repository
Clone the repository to your local machine:
```bash
git clone [email protected]:tensorzero/gym-scratchpad.git
cd llmgym
```

#### 2. Create and Activate a Virtual Environment
Use uv to create a virtual environment. This command will create a new environment (by default in the .venv directory) using Python 3.10:
```bash
uv venv --python 3.10
```
Activate the virtual environment:
```bash
source .venv/bin/activate
```

#### 3. Install Project Dependencies
Install the project in editable mode along with its development dependencies:
```bash
uv pip install -e .
```

#### 4. Verify the Installation
To ensure everything is set up correctly, you can run the tests or simply import the package in Python.

Run tests:
```bash
uv run pytest
```

Import the package in Python:
```bash
python
>>> import llmgym
>>> llmgym.__version__
'0.0.0'
```

## Setting Environment Variables

To set the `OPENAI_API_KEY` environment variable, run the following command:
```bash
export OPENAI_API_KEY="your_openai_api_key"
```

We recommend using [direnv](https://direnv.net/) and creating a local `.envrc` file to manage environment variables. For example, the `.envrc` file might look like this:
```bash
export OPENAI_API_KEY="your_openai_api_key"
```

and then run `direnv allow` to load the environment variables.

## Quickstart

Start ipython with async support.
```bash
ipython --async=True
```
Run an episode of the 21-questions environment.
```python
import logging

import llmgym
from llmgym.logs import get_logger
from llmgym.agents import OpenAIAgent

logger = get_logger("llmgym")
logger.setLevel(logging.INFO)

env = llmgym.make("21_questions_v0")

agent = llmgym.agents.OpenAIAgent(
model_name="gpt-4o-mini",
function_configs=env.functions,
tool_configs=env.tools,
)
# Get default horizon
max_steps = env.horizon

# Reset the environment
reset_data = await env.reset()
obs = reset_data.observation

# Run the episode
for _step in range(max_steps):
# Get action from agent
action = await agent.act(obs)

# Step the environment
step_data = await env.step(action)
obs = step_data.observation

# Check if the episode is done
done = step_data.terminated or step_data.truncated
if done:
break
env.close()
```

This can also be run in the [Quickstart Notebook](examples/quickstart.ipynb).

## Tutorial

For a full tutorial, see the [Tutorial Notebook](examples/tutorial.ipynb).

To see how to run multiple episodes concurrently, see the [Tau Bench](examples/tau_bench.ipynb) or [21 Questions](examples/21_questions.ipynb) notebooks.

For a supervised finetuning example, see the [Supervised Finetuning Notebook](examples/supervised_fine_tuning.ipynb).