https://github.com/lx-0/computer-use-nodejs-demo

🤖 LLM-powered computer control through local and Docker environments. Features VNC integration, automated interactions, and a chat interface for natural language system control.
Host: GitHub
URL: https://github.com/lx-0/computer-use-nodejs-demo
Owner: lx-0
Created: 2024-10-28T09:39:50.000Z (over 1 year ago)
Default Branch: main
Last Pushed: 2024-12-01T15:12:02.000Z (over 1 year ago)
Last Synced: 2024-12-01T16:23:13.225Z (over 1 year ago)
Topics: ai, computer-use, docker, function-calling, llm
Language: TypeScript
Homepage:
Size: 1.16 MB
Stars: 5
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md

# LLM-Controlled Computer

A Next.js application that uses a large language model to control a computer, either the local system directly or a virtual desktop running in a Docker container.

[![Next.js](https://img.shields.io/badge/Next.js-black?style=for-the-badge&logo=next.js&logoColor=white)](https://nextjs.org)

[![TypeScript](https://img.shields.io/badge/TypeScript-007ACC?style=for-the-badge&logo=typescript&logoColor=white)](https://www.typescriptlang.org)

[![Langchain](https://img.shields.io/badge/Langchain-000000?style=for-the-badge&logo=langchain&logoColor=white)](https://langchain.com)

[![Docker](https://img.shields.io/badge/Docker-2496ED?style=for-the-badge&logo=docker&logoColor=white)](https://www.docker.com)

[![shadcn/ui](https://img.shields.io/badge/shadcn--ui-000000?style=for-the-badge&logo=shadcnui&logoColor=white)](https://ui.shadcn.com)

[![Tailwind_CSS](https://img.shields.io/badge/Tailwind_CSS-38B2AC?style=for-the-badge&logo=tailwind-css&logoColor=white)](https://tailwindcss.com)

[![Electron](https://img.shields.io/badge/Electron-47848F?style=for-the-badge&logo=electron&logoColor=white)](https://www.electronjs.org)

[![Built with Cursor](https://img.shields.io/badge/Built_With-Cursor-5C4EE5?style=for-the-badge)](https://cursor.com)

![Screenshot](./public/images/screenshot.png)

> **🚧 Work in Progress**
>
> This project is under active development. Some features may be incomplete or subject to change.
>
> The overall goal is to create a tool that lets a user control their computer with any large language model in Node.js. Anthropic's [Computer Use Demo](https://github.com/anthropics/anthropic-quickstarts) is the main inspiration for this project.
>
> Roadmap:
>
> - ✅ Docker container management
> - ✅ VNC integration
> - ✅ Chat interface
> - 🔳 (Generic) LLM integration
>   - ✅ Base architecture
>   - ✅ Model selection
>   - ✅ Model tracking
>   - ✅ Message history
>   - ✅ Local model support
>   - ✅ Model download tracking
>   - 🔳 Context management
>   - 🔳 Function calling
>   - ⬜ Streaming support
> - ⬜ Computer use tooling
>   - ⬜ File management
>   - ⬜ Screenshot analysis
>   - ⬜ Mouse and keyboard control
>   - ⬜ Bash command execution
> - 🔳 Launch options
>   - ⬜ CLI
>   - ✅ Web server
>   - ⬜ Electron app
> - 🔳 Computer Use modes
>   - ✅ Virtual (Docker)
>   - ⬜ Local (direct control)
> - ⬜ Conversation history
> - ⬜ Multi Agent support
> - ⬜ Memory management
>
> Please check back later for updates or **feel free to contribute!**

## Features

### Core Capabilities

- Screenshot analysis

- Mouse and keyboard control

- Bash command execution

- File management

- Chat interface for LLM interaction

- VNC-based graphical interactions
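Capabilities like these are commonly exposed to a model as named tools that it can call. The following is a minimal sketch of such a dispatch layer, not the project's actual implementation: `ToolCall`, `ToolHandler`, and `ToolRegistry` are hypothetical names, and the handlers are assumed to be synchronous and to return plain strings.

```typescript
// Hypothetical shapes: the project's real tool definitions may differ.
interface ToolCall {
  name: string;
  args: Record<string, unknown>;
}

type ToolHandler = (args: Record<string, unknown>) => string;

class ToolRegistry {
  private readonly handlers = new Map<string, ToolHandler>();

  // Associates a tool name with the function that carries it out.
  register(name: string, handler: ToolHandler): void {
    this.handlers.set(name, handler);
  }

  // Returns the handler's result, or an error string the model can read
  // when it asks for a tool that was never registered.
  dispatch(call: ToolCall): string {
    const handler = this.handlers.get(call.name);
    if (handler === undefined) {
      return `error: unknown tool "${call.name}"`;
    }
    return handler(call.args);
  }
}
```

Returning an error string instead of throwing lets the model see the failure in its next turn and pick a different tool.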

### Operation Modes

- **Local Mode**: Direct system control

- **Docker Mode**: Control of a virtual desktop running inside a Docker container

- **Multiple Launch Options**:

  - Web browser (Next.js server)

  - Desktop application (Electron)

  - CLI for specific LLM tasks
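The combination of control mode and launch option can be modeled as a small set of types. This is an illustrative sketch only: `ControlMode`, `LaunchOption`, `describeSession`, and the `containerName` and `vncPort` fields are assumptions, not names taken from the project.

```typescript
// Hypothetical types: none of these names come from the project's source.
type ControlMode =
  | { kind: "local" }
  | { kind: "docker"; containerName: string; vncPort: number };

type LaunchOption = "web" | "electron" | "cli";

// Produces a short human-readable summary of how a session is wired up.
function describeSession(mode: ControlMode, launch: LaunchOption): string {
  switch (mode.kind) {
    case "local":
      return `${launch}: direct control of the host system`;
    case "docker":
      return `${launch}: container "${mode.containerName}" via VNC port ${mode.vncPort}`;
    default: {
      // Compile-time check that every mode is handled above.
      const unreachable: never = mode;
      return unreachable;
    }
  }
}
```

A discriminated union makes the Docker-only settings unavailable in local mode, so mixing the two up becomes a compile-time error.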

### Docker Integration

- Real-time container management

- Build progress streaming

- Container lifecycle control (start, stop, delete)

- Status monitoring and detailed logging

- NoVNC integration for web-based access

- Automated environment setup
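Container lifecycle control can be thought of as a small state machine that decides which actions are valid at any moment. The sketch below is a simplified model under stated assumptions: the status and action names are hypothetical, and it requires a container to be stopped before deletion, which is a design choice of this example and not a Docker restriction.

```typescript
// Hypothetical status and action names, chosen for illustration.
type ContainerStatus = "stopped" | "running" | "deleted";
type LifecycleAction = "start" | "stop" | "delete";

// For each status, the actions that are allowed and the status they lead to.
const transitions: Record<ContainerStatus, Partial<Record<LifecycleAction, ContainerStatus>>> = {
  stopped: { start: "running", delete: "deleted" },
  running: { stop: "stopped" },
  deleted: {},
};

// Returns the next status, or throws if the action is not valid right now.
function applyAction(status: ContainerStatus, action: LifecycleAction): ContainerStatus {
  const next = transitions[status][action];
  if (next === undefined) {
    throw new Error(`Cannot ${action} a container that is ${status}`);
  }
  return next;
}

// Lists the actions a UI could offer for the given status.
function allowedActions(status: ContainerStatus): LifecycleAction[] {
  return Object.keys(transitions[status]) as LifecycleAction[];
}
```

A UI could call `allowedActions` to enable only the buttons that make sense for the current status, and call `applyAction` after the matching Docker operation succeeds.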

### User Interface

- Responsive split-view layout

- Settings sidebar

- Real-time Docker status indicators

- Expandable log entries

- Copy-to-clipboard functionality

- Auto-scrolling chat interface

## Tech Stack

- **Frontend**: Next.js with TypeScript

- **UI Components**: Radix UI, Tailwind CSS

- **Container Management**: Dockerode

- **Remote Access**: VNC, SSH2

- **LLM Integration**: Langchain.js

- **Desktop Packaging**: Electron

- **Terminal**: node-pty, xterm.js

## Prerequisites

- Node.js (LTS version)

- Docker

- Python 3.11.6 (for certain features)

- Ollama (for local models) - See [Ollama Setup](#ollama-setup) section

## Installation

### 1. Clone the repository

```bash
git clone https://github.com/lx-0/computer-use-nodejs-demo.git
cd computer-use-nodejs-demo
```

### 2. Install dependencies

```bash
npm install
```

### 3. Set up environment variables

```bash
cp .env.example .env
```

Edit `.env` with your configuration.

## Development

Start the development server:

```bash
npm run dev
```

### Building

For production build:

```bash
npm run build
```

For Electron desktop app:

```bash
npm run build:electron
```

## Docker Usage

The application includes a custom Docker environment with:

- Ubuntu 22.04 base

- Python environment with pyenv

- Desktop environment with VNC access

- Firefox ESR with pre-configured extensions

- Various utility applications

## Ollama Setup

### Installation

#### macOS

```bash
# Using Homebrew
brew install ollama

# Start Ollama service
ollama serve
```

#### Linux

```bash
# Install Ollama
curl -fsSL https://ollama.com/install.sh | sh

# Start Ollama service
systemctl start ollama
```

#### Windows

1. Install WSL2 if not already installed:

```bash
wsl --install
```

2. Install Ollama in WSL2:

```bash
curl -fsSL https://ollama.com/install.sh | sh
```

3. Start Ollama service in WSL2:

```bash
ollama serve
```

### Configuration

Add the following to your `.env` file:

```env
# Ollama Configuration
NEXT_PUBLIC_OLLAMA_URL=http://localhost:11434
```
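On the application side, the configured URL has to be read and normalized before requests are built from it. The following is a minimal sketch of one way to do that; `resolveOllamaUrl` and `ollamaEndpoint` are hypothetical helpers, and falling back to `http://localhost:11434` when the variable is unset is an assumption of this example.

```typescript
// Assumed fallback when NEXT_PUBLIC_OLLAMA_URL is missing or empty.
const DEFAULT_OLLAMA_URL = "http://localhost:11434";

// Reads the base URL from an env-like object and strips trailing slashes.
function resolveOllamaUrl(env: Record<string, string | undefined>): string {
  const raw = (env["NEXT_PUBLIC_OLLAMA_URL"] ?? "").trim();
  const base = raw === "" ? DEFAULT_OLLAMA_URL : raw;
  return base.replace(/\/+$/, "");
}

// Joins the base URL and an API path with exactly one slash between them.
function ollamaEndpoint(env: Record<string, string | undefined>, path: string): string {
  return `${resolveOllamaUrl(env)}/${path.replace(/^\/+/, "")}`;
}
```

In a Next.js app, pass the value explicitly, for example `resolveOllamaUrl({ NEXT_PUBLIC_OLLAMA_URL: process.env.NEXT_PUBLIC_OLLAMA_URL })`, because `NEXT_PUBLIC_` variables are inlined into the browser bundle only when referenced by their full literal name.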

### Troubleshooting

1. Check if Ollama is running:

```bash
curl http://localhost:11434/api/version
```

2. If not running, start the service:

```bash
# macOS/Linux
ollama serve

# Windows (in WSL2)
wsl -d Ubuntu -u root ollama serve
```

3. Common issues:

   - Port 11434 is already in use

   - Insufficient disk space

   - GPU drivers not properly installed (for GPU acceleration)
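The first troubleshooting step can also be performed from application code. This is a hedged sketch, assuming Ollama's `GET /api/version` endpoint is used as the probe: `FetchLike`, `OllamaStatus`, and `checkOllama` are hypothetical names, and the fetch function is injected so the check can run without a live server.

```typescript
// Minimal subset of the fetch API that the check relies on.
type FetchLike = (url: string) => Promise<{ ok: boolean; status: number }>;

type OllamaStatus =
  | { running: true }
  | { running: false; reason: string };

// Probes the server and reports why it looks unavailable, if it does.
async function checkOllama(baseUrl: string, fetchFn: FetchLike): Promise<OllamaStatus> {
  try {
    const res = await fetchFn(`${baseUrl}/api/version`);
    return res.ok ? { running: true } : { running: false, reason: `HTTP ${res.status}` };
  } catch (err) {
    // Network-level failures (for example a refused connection) land here.
    const reason = err instanceof Error ? err.message : String(err);
    return { running: false, reason };
  }
}
```

In Node.js 18 or later, or in the browser, the global `fetch` can be passed directly: `checkOllama("http://localhost:11434", fetch)`.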

## Contributing

1. Ensure you follow the project's coding standards:

   - Use TypeScript with strict typing

   - Follow clean code principles

   - Write comprehensive tests

   - Add proper documentation

2. Submit pull requests with:

   - Clear description of changes

   - Test coverage

   - Documentation updates

## License

ISC
