https://github.com/sierra-research/tau-bench
Code and Data for Tau-Bench
Last synced: 2 months ago
- Host: GitHub
- URL: https://github.com/sierra-research/tau-bench
- Owner: sierra-research
- License: mit
- Created: 2024-06-06T17:11:36.000Z (8 months ago)
- Default Branch: main
- Last Pushed: 2024-10-17T17:22:07.000Z (3 months ago)
- Last Synced: 2024-10-20T01:58:25.201Z (3 months ago)
- Language: Python
- Size: 927 KB
- Stars: 112
- Watchers: 8
- Forks: 13
- Open Issues: 5
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
- AwesomeResponsibleAI - τ-bench: A Benchmark for Tool-Agent-User Interaction in Real-World Domains
README
# τ-bench: A Benchmark for Tool-Agent-User Interaction in Real-World Domains
Paper: https://arxiv.org/abs/2406.12045
## Setup
1. Clone this repository:
```bash
git clone https://github.com/sierra-research/tau-bench && cd ./tau-bench
```
2. Install from source (which also installs required packages):
```bash
pip install -e .
```
3. Set up your OpenAI / Anthropic / Google / Mistral / AnyScale API keys as environment variables.
```bash
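# export so that child processes such as python can read the keys
export OPENAI_API_KEY=...
export ANTHROPIC_API_KEY=...
export GEMINI_API_KEY=...
export MISTRAL_API_KEY=...
export ANYSCALE_API_KEY=...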
```
## Run
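Before starting a run, it can help to confirm which of the API keys from the setup step are actually visible to Python. The following is a minimal sketch, not part of the repository: `missing_api_keys` is a hypothetical helper, and it assumes that only the key for the provider you call needs to be set.

```python
import os

# Variable names taken from the setup step; the helper itself is hypothetical.
API_KEY_VARS = [
    "OPENAI_API_KEY",
    "ANTHROPIC_API_KEY",
    "GEMINI_API_KEY",
    "MISTRAL_API_KEY",
    "ANYSCALE_API_KEY",
]


def missing_api_keys(environ=os.environ):
    """Return the key names that are unset or empty in `environ`."""
    return [name for name in API_KEY_VARS if not environ.get(name)]


# Report the keys that the current process cannot see.
for name in missing_api_keys():
    print(f"{name} is not set")
```

A key listed as missing is only a problem if you plan to use a model from that provider (an assumption, as noted above).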
Run a function calling agent on the τ-retail environment:
```bash
python run.py --env retail --model gpt-4o --max_concurrency 10
```
Set `--max_concurrency` according to your API rate limit.
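To illustrate what a concurrency cap does in general, here is a small sketch that runs simulated API calls through a bounded thread pool and records the peak number in flight. It is an assumption-laden illustration, not how `run.py` is implemented: `run_with_cap` and `fake_api_call` are hypothetical names, and the sleep stands in for a real request.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def run_with_cap(tasks, max_concurrency):
    """Run zero-argument callables with at most `max_concurrency` in flight.

    Returns (results in input order, peak number of simultaneous tasks).
    """
    lock = threading.Lock()
    in_flight = 0
    peak = 0

    def tracked(task):
        nonlocal in_flight, peak
        with lock:
            in_flight += 1
            peak = max(peak, in_flight)
        try:
            return task()
        finally:
            with lock:
                in_flight -= 1

    # The pool size is the cap: no more than max_concurrency tasks run at once.
    with ThreadPoolExecutor(max_workers=max_concurrency) as pool:
        results = list(pool.map(tracked, tasks))
    return results, peak


def fake_api_call(i):
    """Hypothetical stand-in for a rate-limited API request."""
    time.sleep(0.01)
    return i * 2


tasks = [lambda i=i: fake_api_call(i) for i in range(20)]
results, peak = run_with_cap(tasks, max_concurrency=4)
print(f"completed {len(results)} calls, peak concurrency {peak}")
```

A lower cap keeps fewer requests in flight at once, which trades throughput for a smaller chance of hitting the provider's rate limit.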