Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/mayt/BrowserGPT
Command your browser with GPT
- Host: GitHub
- URL: https://github.com/mayt/BrowserGPT
- Owner: mayt
- License: mit
- Created: 2023-03-19T12:52:20.000Z (over 1 year ago)
- Default Branch: master
- Last Pushed: 2024-07-24T08:50:48.000Z (4 months ago)
- Last Synced: 2024-08-01T19:32:20.698Z (3 months ago)
- Language: JavaScript
- Homepage:
- Size: 21.9 MB
- Stars: 363
- Watchers: 9
- Forks: 44
- Open Issues: 3
Metadata Files:
- Readme: README.md
- License: LICENSE.md
README
# BrowserGPT
This project allows you to control your browser using natural language. It integrates OpenAI's GPT-4 with the Playwright library, enabling seamless browser navigation. GPT-4 generates code snippets, which Playwright executes to carry out specified tasks.
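The generate-then-execute flow described above can be sketched roughly as follows. This is a minimal illustration, not BrowserGPT's actual implementation: `runGeneratedSnippet`, `fakePage`, and the sample snippet are hypothetical, and the sketch assumes the model returns the body of an async function that uses a Playwright-style `page` object. A stand-in page is used so the example runs without a browser or an API key.

```javascript
// Hypothetical helper (not from the BrowserGPT source): run a model-generated
// snippet against a Playwright-like `page` object.
const AsyncFunction = Object.getPrototypeOf(async function () {}).constructor;

async function runGeneratedSnippet(page, snippet) {
  try {
    // Build the equivalent of `async (page) => { <snippet> }` from the generated text.
    const fn = new AsyncFunction("page", snippet);
    await fn(page);
    return { ok: true };
  } catch (err) {
    // Generated code can fail to parse or execute; report it instead of crashing.
    return { ok: false, error: String(err) };
  }
}

// Stand-in for a Playwright page that only records the calls it receives.
const fakePage = {
  calls: [],
  async goto(url) { this.calls.push(["goto", url]); },
  async click(selector) { this.calls.push(["click", selector]); },
};

// Example of what a generated snippet might look like (assumed shape).
const snippet = `
  await page.goto("https://news.ycombinator.com");
  await page.click("text=new");
`;

runGeneratedSnippet(fakePage, snippet).then((result) => {
  console.log(result, fakePage.calls);
});
```

Executing model-generated code this way runs it with the full privileges of the Node.js process, so a sketch like this is only suitable for a local, trusted setup.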
## Demo
![BrowserGPT in action](./public/browsergpt.gif)
## Installation
### Install the required packages:
```sh
npm install
```
### Create a `.env` file in the project root directory and add the following line:
```
OPENAI_API_KEY=your_openai_api_key
```
Replace `your_openai_api_key` with your actual OpenAI API key.
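A missing or placeholder key is an easy mistake to make at this step. The check below is a minimal sketch of how a script could fail early with a clear message. `requireApiKey` is a hypothetical helper, and the sketch assumes the `.env` values have already been loaded into `process.env` (for example by a loader such as `dotenv`); BrowserGPT's own handling may differ.

```javascript
// Hypothetical startup check; not taken from the BrowserGPT source.
function requireApiKey(env = process.env) {
  const key = (env.OPENAI_API_KEY || "").trim();
  if (!key || key === "your_openai_api_key") {
    throw new Error(
      "OPENAI_API_KEY is missing or still set to the placeholder; check your .env file"
    );
  }
  return key;
}

try {
  requireApiKey();
  console.log("API key found");
} catch (err) {
  console.error(err.message);
}
```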
#### First run only:
You may need to install Playwright executables. Run the following to install them.
```sh
npx playwright install
```
### Run the script:
```sh
npm run start
```
### Options:
```
Usage: npm run start -- [options]

Options:
  -a, --autogpt         run with autogpt (default: false)
  -m, --model           openai model to use (default: "gpt-4-1106-preview")
  -o, --outputFilePath  path to store test code
  -u, --url             url to start on (default: "https://www.google.com")
  -v, --viewport        viewport size to use (default: "1280,720")
  -h, --help            display help for command
```
## Usage
1. The script opens a browser window.
2. In the terminal, you'll be prompted to enter a task.
3. Type your task using natural language (e.g., "Generate an interesting phrase and type it into Google") and press Enter.
4. GPT-4 can recognize buttons and text on the page and will navigate the browser to complete the specified task.
5. To stop the script, press `Ctrl + C` in the terminal.
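The size of the browser window that opens is controlled by the `--viewport` option, which takes a `width,height` string such as the default `"1280,720"`. The sketch below shows one way such a string could be turned into the `{ width, height }` object that Playwright's `viewport` setting expects. `parseViewport` is a hypothetical helper and may not match how BrowserGPT parses the option.

```javascript
// Hypothetical helper: turn a "width,height" string into a viewport object.
function parseViewport(value) {
  const parts = value.split(",").map((part) => Number.parseInt(part.trim(), 10));
  if (parts.length !== 2 || parts.some((n) => !Number.isInteger(n) || n <= 0)) {
    throw new Error(`Invalid viewport "${value}", expected "width,height"`);
  }
  const [width, height] = parts;
  return { width, height };
}

console.log(parseViewport("1280,720")); // { width: 1280, height: 720 }
```

Rejecting malformed values up front gives a clearer error than letting the browser launch with an undefined size.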
## Examples
Here are some example tasks you can input:
- `go to hn`
- `click on the abc article`
- `enter [email protected] into the email box. John and Doe in the first and last name boxes respectively`
- `generate a spicy comment on what xyz said and put it in the comment box`

With `autogpt` enabled, you can also input more complex tasks like:
- `go to hn and click on the first article`
- `use bing and find the abc article`

## Limitations
This script serves as a demonstration of GPT-4 and Playwright integration, and may not perform flawlessly for every task or website. Generated code snippets could fail to execute, or the model might not comprehend specific inputs. In these situations, consider providing a more detailed task description or rephrasing your input. Some websites might be too large to fit in the prompt for models with a smaller context window, such as base `gpt-4`, so the default is `gpt-4-1106-preview`, which has a 128k-token context window.
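To get a feel for when a page is too large for a given model, a rough size check can help. The sketch below uses the common heuristic of about 4 characters per token for English text, which is only an approximation and not the model's real tokenizer. `estimateTokens`, `fitsInContext`, and the reply reserve are hypothetical, and the context sizes (8,192 for base `gpt-4`, 128,000 for `gpt-4-1106-preview`) are assumptions based on the published model specifications.

```javascript
// Rough heuristic only: about 4 characters per token for English text.
const CHARS_PER_TOKEN = 4;

function estimateTokens(text) {
  return Math.ceil(text.length / CHARS_PER_TOKEN);
}

// Check whether the page text plus some room for the model's reply fits the window.
function fitsInContext(pageText, contextWindow, reservedForReply = 1000) {
  return estimateTokens(pageText) + reservedForReply <= contextWindow;
}

// Stand-in for a large page: 200,000 characters, roughly 50,000 tokens.
const pageText = "x".repeat(200_000);

console.log(fitsInContext(pageText, 8_192));   // false
console.log(fitsInContext(pageText, 128_000)); // true
```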
## License
This project is licensed under the MIT License. See the LICENSE file for details.