https://github.com/noahshinn/reflexion

[NeurIPS 2023] Reflexion: Language Agents with Verbal Reinforcement Learning
https://github.com/noahshinn/reflexion

ai artificial-intelligence llm

Last synced: 16 days ago
JSON representation

[NeurIPS 2023] Reflexion: Language Agents with Verbal Reinforcement Learning

Host: GitHub
URL: https://github.com/noahshinn/reflexion
Owner: noahshinn
License: mit
Created: 2023-03-22T06:38:53.000Z (about 2 years ago)
Default Branch: main
Last Pushed: 2025-01-14T07:54:02.000Z (5 months ago)
Last Synced: 2025-05-06T01:02:00.046Z (25 days ago)
Topics: ai, artificial-intelligence, llm
Language: Python
Homepage:
Size: 8.68 MB
Stars: 2,701
Watchers: 29
Forks: 258
Open Issues: 18
Metadata Files:
- Readme: README.md
- License: LICENSE

Awesome Lists containing this project

awesome-technostructure - noahshinn/reflexion
awesome-technostructure - noahshinn/reflexion

README

# [NeurIPS 2023] Reflexion: Language Agents with Verbal Reinforcement Learning

This repo holds the code, demos, and log files for [Reflexion: Language Agents with Verbal Reinforcement Learning](https://arxiv.org/abs/2303.11366) by Noah Shinn, Federico Cassano, Edward Berman, Ashwin Gopinath, Karthik Narasimhan, Shunyu Yao.

![Reflexion RL diagram](./figures/reflexion_rl.png)

![Reflexion tasks](./figures/reflexion_tasks.png)

We have released the LeetcodeHardGym [here](https://github.com/GammaTauAI/leetcode-hard-gym)

## To Run: reasoning (HotPotQA)

We have provided a set of notebooks to easily run, explore, and interact with the results of the reasoning experiments. Each experiment consists of a random sample of 100 questions from the HotPotQA distractor dataset. Each question in the sample is attempted by an agent with a specific type and reflexion strategy.

### Setup

To get started:

1. Clone this repo and move to the HotPotQA directory:

```bash
git clone https://github.com/noahshinn/reflexion && cd ./hotpotqa_runs
```

2. Install the module dependencies into your environment:

```bash
pip install -r requirements.txt
```

3. Set `OPENAI_API_KEY` environment variable to your OpenAI API key:

```bash
export OPENAI_API_KEY=
```

#### Agent Types

Agent type is determined by the notebook you choose to run. The available agent types include:

- `ReAct` - ReAct Agent

- `CoT_context` - CoT Agent given supporting context about the question

- `CoT_no_context` - CoT Agent given no supporting context about the question

The notebook for each agent type is located in the `./hotpot_runs/notebooks` directory.

#### Reflexion Strategies

Each notebook allows you to specify the reflexion strategy to be used by the agents. The available reflexion strategies, which are defined in an `Enum`, include:

- `ReflexionStrategy.NONE` - The agent is not given any information about its last attempt.

- `ReflexionStrategy.LAST_ATTEMPT` - The agent is given its reasoning trace from its last attempt on the question as context.

- `ReflexionStrategy.REFLEXION` - The agent is given its self-reflection on the last attempt as context.

- `ReflexionStrategy.LAST_ATTEMPT_AND_REFLEXION` - The agent is given both its reasoning trace and self-reflection on the last attempt as context.

### To Run: decision-making (AlfWorld)

Clone this repo and move to the AlfWorld directory

```bash
git clone https://github.com/noahshinn/reflexion && cd ./alfworld_runs
```

Specify the run parameters in `./run_reflexion.sh`.
`num_trials`: number of iterative learning steps
`num_envs`: number of task-environment pairs per trial
`run_name`: the name for this run
`use_memory`: use persisting memory to store self-reflections (turn off to run a baseline run)
`is_resume`: use logging directory to resume a previous run
`resume_dir`: the logging directory from which to resume the previous run
`start_trial_num`: if resume run, then the trial number of which to start

Run the trial

```bash
./run_reflexion.sh
```

The logs will be sent to `./root/`.

### Another Note

Due to the nature of these experiments, it may not be feasible for individual developers to rerun the results as GPT-4 has limited access and significant API charges. All runs from the paper and additional results are logged in `./alfworld_runs/root` for decision-making, `./hotpotqa_runs/root` for reasoning, and `./programming_runs/root` for programming

### Other Notes

Check out the original implementation [here](https://github.com/noahshinn/reflexion-draft)

Read one of the original blog posts [here](https://nanothoughts.substack.com/p/reflecting-on-reflexion)

Check out an [Appl](https://github.com/appl-team/appl) implementation [here](https://github.com/appl-team/reppl/tree/main/reflexion).

Check out an interesting type-prediction implementation here: [OpenTau](https://github.com/GammaTauAI/opentau)

For all questions, contact [[email protected]]([email protected])

### Cite

```bibtex
@misc{shinn2023reflexion,
title={Reflexion: Language Agents with Verbal Reinforcement Learning},
author={Noah Shinn and Federico Cassano and Edward Berman and Ashwin Gopinath and Karthik Narasimhan and Shunyu Yao},
year={2023},
eprint={2303.11366},
archivePrefix={arXiv},
primaryClass={cs.AI}
}
```

ecosyste.ms

Data

Tools

Indexes

Applications

Experiments

Awesome

https://github.com/noahshinn/reflexion

Awesome Lists containing this project

README