https://github.com/kadavilrahul/browser-use-shell

A shell script for using the Python-based browser-use tool to interact with browsers

Host: GitHub
URL: https://github.com/kadavilrahul/browser-use-shell
Owner: kadavilrahul
Created: 2025-01-27T11:55:30.000Z (5 months ago)
Default Branch: main
Last Pushed: 2025-02-05T16:57:49.000Z (5 months ago)
Last Synced: 2025-02-05T17:46:00.533Z (5 months ago)
Language: Python
Size: 25.4 KB
Stars: 1
Watchers: 1
Forks: 2
Open Issues: 0
Metadata Files:
- Readme: README.md

# Browser Automation with AI Model

- A minimal browser automation setup that integrates an AI model
- It takes user input and executes tasks in a Chromium browser
- Built using the Windsurf code editor and the Claude 3.5 Sonnet model
- This is a simplified and modified version of the original code: https://github.com/browser-use

#### Modifications and improvements
- The tool is ready to use after running two commands in the terminal
- Made user friendly by prompting for all inputs in the terminal
- Uses only one LLM for now to avoid API format errors
- Keeps the browser open after a task completes, until you exit
- New tasks can be run in the already open browser once the current one completes

#### System Requirements
- A system with a GUI (the GUI is needed only to access the browser)

#### Tested on OS:
- Ubuntu 24.04 LTS
- Garuda Linux (Rolling)
- Windows 10 Pro 64-bit

#### Note:
- Use any IDE with AI support to modify the code for your own use.
- If you are working on a remote headless Linux machine, you will need a GUI and a remote desktop connection to access the browser. You can install Chrome Remote Desktop on a remote Ubuntu machine using this repo: https://github.com/kadavilrahul/chrome_remote_desktop
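On a headless Linux machine it can help to confirm that a graphical session is reachable before starting the tool. The snippet below is a minimal sketch and is not part of this repository: the function name `has_display` is hypothetical, and it assumes an X11 or Wayland session that exports `DISPLAY` or `WAYLAND_DISPLAY`.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of browser-use-shell): succeeds when the
# current session exposes an X11 or Wayland display.
has_display() {
    [ -n "${DISPLAY:-}" ] || [ -n "${WAYLAND_DISPLAY:-}" ]
}

if has_display; then
    echo "Display found: a browser window can be opened."
else
    echo "No display found: connect through a remote desktop session first."
fi
```

Run it from inside the remote desktop session. A plain SSH shell usually has neither variable set.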

#### Configured AI Models
At least one API key is required:
- Google Gemini (free API): https://aistudio.google.com/apikey
- The configured Gemini model is "gemini-2.0-flash-exp"

## Setup and Installation
### Linux:
(Run these commands in a Linux terminal to get started)
- Go to the folder where you want the repository. Replace the path with your own folder
```bash
cd /path/to/your/folder
```
- Clone the repository, enter its folder, and run the setup script
```bash
git clone https://github.com/kadavilrahul/browser-use-shell.git && cd browser-use-shell && bash main.sh
```

- Rerun the tool after installation
```bash
source venv/bin/activate && python main.py
```
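Before cloning, you may want to check that the commands used above are available. This is a minimal sketch under stated assumptions, not part of the repository: `missing_commands` is a hypothetical helper, and the list `git python3` is an assumption about what the setup needs, since `main.sh` may install or check some of these itself.

```bash
#!/usr/bin/env bash
# Hypothetical pre-flight check (not part of browser-use-shell).
# Prints the name of each given command that is not found on PATH.
missing_commands() {
    local cmd
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || echo "$cmd"
    done
}

# Assumption: git and python3 are needed before cloning and running main.sh.
missing="$(missing_commands git python3)"
if [ -n "$missing" ]; then
    # $missing is left unquoted on purpose so the names print on one line.
    echo "Install these before continuing:" $missing
else
    echo "All prerequisites found."
fi
```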
### Windows:
(Run these commands in Windows Terminal (PowerShell) or the system terminal to get started)
- Go to the folder where you want the repository. Replace the path with your own folder
```powershell
cd Path\To\Your\Folder
```
- Clone the repository, enter its folder, create a virtual environment, install the dependencies, and start the tool
```powershell
git clone https://github.com/kadavilrahul/browser-use-shell.git; cd browser-use-shell; python -m venv venv; .\venv\Scripts\activate; pip install -r requirements.txt; python main.py
```
- Rerun the tool after installation
```powershell
.\venv\Scripts\Activate; python main.py
```

#### User Inputs During Setup
1. LLM API Keys (required)
- Enter your Gemini API key when prompted
- Get it from https://aistudio.google.com/apikey

2. Task Input
- Enter your automation task and follow on-screen instructions
- Example tasks:
```
Go to wordpress order section of xxxx.com, ID:xxxx Password:xxxx and search for latest orders
```
```
Login to GitHub with username:xxx password:xxx and check notifications
```
- Task progress is shown in the terminal
- Results are saved as a GIF in `agent_history.gif`

#### Manual Python Usage
Start automation:
```bash
source venv/bin/activate
pip install -r requirements.txt
python main.py
```
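The commands above expect a `venv` folder to already exist in the repository directory. The following is a small sketch, not part of the repository, that checks for it first. The helper name `venv_ready` is hypothetical, and the suggested `python3 -m venv venv` is the standard-library way to create the environment, which may differ from what `main.sh` does.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of browser-use-shell): succeeds when the
# given directory looks like a Linux virtual environment.
venv_ready() {
    local dir="${1:-venv}"
    [ -f "$dir/bin/activate" ]
}

if venv_ready venv; then
    echo "venv found: run 'source venv/bin/activate && python main.py'"
else
    echo "venv missing: create it with 'python3 -m venv venv', then install requirements.txt"
fi
```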
