{"id":13577221,"url":"https://github.com/noahshinn/reflexion","last_synced_at":"2025-05-14T03:11:29.522Z","repository":{"id":147157257,"uuid":"617325145","full_name":"noahshinn/reflexion","owner":"noahshinn","description":"[NeurIPS 2023] Reflexion: Language Agents with Verbal Reinforcement Learning","archived":false,"fork":false,"pushed_at":"2025-01-14T07:54:02.000Z","size":9105,"stargazers_count":2701,"open_issues_count":18,"forks_count":258,"subscribers_count":29,"default_branch":"main","last_synced_at":"2025-05-06T01:02:00.046Z","etag":null,"topics":["ai","artificial-intelligence","llm"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/noahshinn.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-03-22T06:38:53.000Z","updated_at":"2025-05-05T09:10:24.000Z","dependencies_parsed_at":null,"dependency_job_id":"f947cf08-78e1-4db1-8c03-f43b53fb0e28","html_url":"https://github.com/noahshinn/reflexion","commit_stats":{"total_commits":164,"total_committers":8,"mean_commits":20.5,"dds":0.5731707317073171,"last_synced_commit":"d15acda1c81d464d9a81648d7f29fb951e326c70"},"previous_names":["noahshinn024/reflexion","noahshinn/reflexion"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/noahshinn%2Freflexion","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/noahshinn%2Freflexion/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/noahshinn%2Freflexion/
releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/noahshinn%2Freflexion/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/noahshinn","download_url":"https://codeload.github.com/noahshinn/reflexion/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":253736100,"owners_count":21955782,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ai","artificial-intelligence","llm"],"created_at":"2024-08-01T15:01:19.411Z","updated_at":"2025-05-14T03:11:24.507Z","avatar_url":"https://github.com/noahshinn.png","language":"Python","funding_links":[],"categories":["Python","[:robot: machine-learning]([robot-machine-learning)](\u003chttps://github.com/stars/ketsapiwiq/lists/robot-machine-learning\u003e))","2. Long-CoT的能力介绍"],"sub_categories":["2.2 适度反思（Feasible Reflection）"],"readme":"# [NeurIPS 2023] Reflexion: Language Agents with Verbal Reinforcement Learning\n\nThis repo holds the code, demos, and log files for [Reflexion: Language Agents with Verbal Reinforcement Learning](https://arxiv.org/abs/2303.11366) by Noah Shinn, Federico Cassano, Edward Berman, Ashwin Gopinath, Karthik Narasimhan, Shunyu Yao. \n\n![Reflexion RL diagram](./figures/reflexion_rl.png)\n\n![Reflexion tasks](./figures/reflexion_tasks.png)\n\nWe have released the LeetcodeHardGym [here](https://github.com/GammaTauAI/leetcode-hard-gym)\n\n## To Run: reasoning (HotPotQA)\n\nWe have provided a set of notebooks to easily run, explore, and interact with the results of the reasoning experiments. 
Each experiment consists of a random sample of 100 questions from the HotPotQA distractor dataset. Each question in the sample is attempted by an agent with a specific type and reflexion strategy.\n\n### Setup\n\nTo get started:\n\n1. Clone this repo and move to the HotPotQA directory:\n\n```bash\ngit clone https://github.com/noahshinn/reflexion \u0026\u0026 cd ./reflexion/hotpotqa_runs\n```\n\n2. Install the module dependencies into your environment:\n\n```bash\npip install -r requirements.txt\n```\n\n3. Set the `OPENAI_API_KEY` environment variable to your OpenAI API key:\n\n```bash\nexport OPENAI_API_KEY=\u003cyour key\u003e\n```\n\n#### Agent Types\n\nThe agent type is determined by the notebook you choose to run. The available agent types are:\n\n- `ReAct` - ReAct Agent\n\n- `CoT_context` - CoT Agent given supporting context about the question\n\n- `CoT_no_context` - CoT Agent given no supporting context about the question\n\nThe notebook for each agent type is located in the `./hotpotqa_runs/notebooks` directory.\n\n#### Reflexion Strategies\n\nEach notebook lets you specify the reflexion strategy used by the agents. The available reflexion strategies, which are defined in an `Enum`, are:\n\n- `ReflexionStrategy.NONE` - The agent is not given any information about its last attempt.\n\n- `ReflexionStrategy.LAST_ATTEMPT` - The agent is given its reasoning trace from its last attempt on the question as context.\n\n- `ReflexionStrategy.REFLEXION` - The agent is given its self-reflection on the last attempt as context.
\n\n- `ReflexionStrategy.LAST_ATTEMPT_AND_REFLEXION` - The agent is given both its reasoning trace and its self-reflection on the last attempt as context.\n\n### To Run: decision-making (AlfWorld)\n\nClone this repo and move to the AlfWorld directory:\n\n```bash\ngit clone https://github.com/noahshinn/reflexion \u0026\u0026 cd ./reflexion/alfworld_runs\n```\n\nSpecify the run parameters in `./run_reflexion.sh`:\n\n- `num_trials`: number of iterative learning steps\n\n- `num_envs`: number of task-environment pairs per trial\n\n- `run_name`: the name for this run\n\n- `use_memory`: use persistent memory to store self-reflections (turn off for a baseline run)\n\n- `is_resume`: use a logging directory to resume a previous run\n\n- `resume_dir`: the logging directory from which to resume the previous run\n\n- `start_trial_num`: when resuming a run, the trial number at which to start\n\nRun the trial:\n\n```bash\n./run_reflexion.sh\n```\n\nThe logs will be written to `./root/\u003crun_name\u003e`.\n\n### Another Note\n\nIt may not be feasible for individual developers to rerun these experiments, because GPT-4 access is limited and its API charges are significant. 
All runs from the paper and additional results are logged in `./alfworld_runs/root` for decision-making, `./hotpotqa_runs/root` for reasoning, and `./programming_runs/root` for programming.\n\n### Other Notes\n\nCheck out the original implementation [here](https://github.com/noahshinn/reflexion-draft)\n\nRead one of the original blog posts [here](https://nanothoughts.substack.com/p/reflecting-on-reflexion)\n\nCheck out an [Appl](https://github.com/appl-team/appl) implementation [here](https://github.com/appl-team/reppl/tree/main/reflexion).\n\nCheck out an interesting type-prediction implementation here: [OpenTau](https://github.com/GammaTauAI/opentau)\n\nFor all questions, contact [noahrshinn@gmail.com](mailto:noahrshinn@gmail.com)\n\n### Cite\n\n```bibtex\n@misc{shinn2023reflexion,\n      title={Reflexion: Language Agents with Verbal Reinforcement Learning}, \n      author={Noah Shinn and Federico Cassano and Edward Berman and Ashwin Gopinath and Karthik Narasimhan and Shunyu Yao},\n      year={2023},\n      eprint={2303.11366},\n      archivePrefix={arXiv},\n      primaryClass={cs.AI}\n}\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fnoahshinn%2Freflexion","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fnoahshinn%2Freflexion","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fnoahshinn%2Freflexion/lists"}