{"id":13434610,"url":"https://github.com/posgnu/rci-agent","last_synced_at":"2026-03-05T16:14:37.872Z","repository":{"id":170475449,"uuid":"622581880","full_name":"posgnu/rci-agent","owner":"posgnu","description":"A codebase for \"Language Models can Solve Computer Tasks\"","archived":false,"fork":false,"pushed_at":"2024-05-01T03:07:29.000Z","size":2243,"stargazers_count":233,"open_issues_count":2,"forks_count":32,"subscribers_count":8,"default_branch":"main","last_synced_at":"2025-03-18T01:35:51.956Z","etag":null,"topics":["large-language-models","prompting","reasoning"],"latest_commit_sha":null,"homepage":"https://posgnu.github.io/rci-web/","language":"HTML","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/posgnu.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-04-02T14:38:43.000Z","updated_at":"2025-03-06T04:52:21.000Z","dependencies_parsed_at":"2024-10-27T17:12:25.336Z","dependency_job_id":"771bd099-c86f-4ff5-9f10-4e01385df143","html_url":"https://github.com/posgnu/rci-agent","commit_stats":null,"previous_names":["posgnu/rci-agent"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/posgnu%2Frci-agent","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/posgnu%2Frci-agent/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/posgnu%2Frci-agent/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/posgnu%2Frci-agent/manifests","owner_url":"https://repos.ecosyste.ms/a
pi/v1/hosts/GitHub/owners/posgnu","download_url":"https://codeload.github.com/posgnu/rci-agent/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248881462,"owners_count":21176858,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["large-language-models","prompting","reasoning"],"created_at":"2024-07-31T03:00:18.839Z","updated_at":"2026-03-05T16:14:37.832Z","avatar_url":"https://github.com/posgnu.png","language":"HTML","readme":"# RCI Agent for MiniWoB++\nWelcome to the codebase for our paper, \"Language Models can Solve Computer Tasks\". Here you will find the implementation of our RCI agent, which uses a pre-trained language model, guided by natural language, to execute computer tasks in the [MiniWoB++ benchmark](http://miniwob.farama.org/). The agent employs a simple RCI prompting scheme that allows it to improve its outputs.\n\n![overview](./artifacts/overview.gif)\n\n[[Website]](https://posgnu.github.io/rci-web/)\n[[Arxiv Paper]](https://arxiv.org/abs/2303.17491v1)\n[[PDF]](https://arxiv.org/pdf/2303.17491v1.pdf)\n\n\n## Dependencies\nThe RCI agent is implemented in Python 3.9 and requires the following dependencies:\n\n* gym\n* openai\n* selenium\n* Pillow\n* regex\n\n```sh\npip install -r requirements.txt\n```\nNote: [MiniWoB++](https://github.com/stanfordnlp/wge) is not officially supported on Windows. Please refer to [this issue](https://github.com/posgnu/rci-agent/issues/2).\n\n## Usage\n\n### Setup\nTo run the code, you must first install MiniWoB++ and configure your OpenAI API key. 
MiniWoB++ is integrated with the OpenAI Gym environment. Navigate to the `computergym` directory and execute the following command to install it:\n```sh\ncd computergym\npip install -e .\n```\nOnce that's done, write your OpenAI API key in the `example_config.json` file, then rename the file to `config.json`.\n\n### Run\nTo run the code, execute the following command:\n```sh\npython main.py --env [TASK NAME] --llm [LLM NAME] --num-episodes [NUM EPISODES] --erci [NUM Explicit RCI] --irci [NUM Implicit RCI] --sgrounding\n```\nHere are the arguments you need to specify:\n* `--env`: Name of the MiniWoB++ task you want to run. You can see the list of available tasks in `available_tasks.txt`\n* `--llm`: Name of the language model you want to use. The model names and their corresponding API names are listed below:\n    * chatgpt: \"gpt-3.5-turbo\"\n    * davinci: \"text-davinci-003\"\n    * ada: \"ada\"\n    * babbage: \"babbage\"\n    * curie: \"curie\"\n    * davinci1: \"davinci\"\n    * davinci2: \"text-davinci-002\"\n* `--num-episodes`: Number of episodes to run for the task\n* `--erci`: Number of explicit RCI loops for the action plan. Passing `-1` disables action plan sampling.\n* `--irci`: Number of implicit RCI loops for agent grounding.\n* `--sgrounding`: If set, the state grounding update is enabled.\n* `--headless`: If set, the MiniWoB++ environment runs in headless mode.\n\nConsider running the following command to verify that everything is functioning correctly:\n```sh\npython main.py --env choose-list --llm chatgpt --num-episodes 1 --irci 1 --sgrounding\n```\n\n## Evaluation\nOur approach has yielded strong results, with our agent achieving the second-highest score of all tested models. 
Our agent outperforms all baselines except CC-Net (SL + RL), which uses dictionary-based typing actions.\n\n![](/artifacts/baseline-1.png)\n\nWhat sets our RCI agent apart is that it accomplishes this using 120 times fewer samples than WebN-T5-3B and 11,000 times fewer samples than CC-Net. Obtaining expert demonstrations and defining reward functions for computer tasks can be daunting, but our research highlights the potential of LLMs to overcome these obstacles and succeed at general computer tasks.\n\n![](/artifacts/demos-1.png)\n\n## Check out our paper!\n\nOur paper is available on [arXiv](https://arxiv.org/abs/2303.17491v1). If you use this code in your research, we kindly ask that you cite our paper.\n\n```bibtex\n@article{kim2023language,\n      title={Language Models can Solve Computer Tasks}, \n      author={Geunwoo Kim and Pierre Baldi and Stephen McAleer},\n      journal={arXiv preprint arXiv:2303.17491},\n      year={2023},\n}\n```\n","funding_links":[],"categories":["Applications","Papers"],"sub_categories":["Models"],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fposgnu%2Frci-agent","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fposgnu%2Frci-agent","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fposgnu%2Frci-agent/lists"}