https://github.com/datamarkin/agentui
- Host: GitHub
- URL: https://github.com/datamarkin/agentui
- Owner: datamarkin
- License: mit
- Created: 2025-10-13T23:13:47.000Z (8 months ago)
- Default Branch: main
- Last Pushed: 2025-12-12T15:03:36.000Z (6 months ago)
- Last Synced: 2025-12-22T00:37:30.238Z (6 months ago)
- Language: Python
- Size: 968 KB
- Stars: 1
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# AgentUI
Visual workflow builder for computer vision and AI. Create image processing pipelines by connecting tools in a drag-and-drop interface, then export and run them programmatically.
**Part of the [Datamarkin](https://datamarkin.com) ecosystem** - Built on [PixelFlow](https://pixelflow.datamarkin.com) and [Mozo](https://mozo.datamarkin.com) for production-ready computer vision.
## What It Is
AgentUI is a web-first tool builder that lets you:
- **Build visually**: Drag and drop tools on a canvas to create workflows
- **Connect tools**: Wire outputs to inputs with type-safe connections
- **Execute**: Run workflows in the browser and see results instantly
- **Export**: Save workflows as JSON for version control and programmatic execution
- **Integrate**: Use as a Python library in your own applications
Think of it as a visual programming environment for computer vision tasks.
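To make "type-safe connections" concrete, here is a minimal sketch of how a connection check could work. Only `PortType.IMAGE` appears elsewhere in this README; the other enum members and the `can_connect` helper are hypothetical and are not taken from AgentUI's source.

```python
from enum import Enum


class PortType(Enum):
    """Illustrative port types; AgentUI's real enum may differ."""
    IMAGE = "image"
    DETECTIONS = "detections"  # assumed member
    ANY = "any"                # assumed member


def can_connect(output_type: PortType, input_type: PortType) -> bool:
    """Allow a wire only when the types match or the input accepts anything."""
    return input_type is PortType.ANY or output_type is input_type


print(can_connect(PortType.IMAGE, PortType.IMAGE))       # True
print(can_connect(PortType.DETECTIONS, PortType.IMAGE))  # False
```

A check like this would run when a wire is dropped on a port, so an incompatible connection is rejected before the workflow ever executes.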
## Quick Start
```bash
# Install
pip install agentui
# Start the server
agentui start
# Open http://localhost:8000 in your browser
```
That's it. The UI is already bundled - no separate build step needed.
## What You Can Build
### ML-Powered Tools
- **Object Detection**: Detect objects using YOLOv8 or Detectron2 (80 COCO classes)
- **Instance Segmentation**: Get pixel-level masks for detected objects
- **Depth Estimation**: Generate depth maps from single images (Depth Anything)
### Image Processing
- **Transforms**: Rotate, flip, crop images with automatic detection coordinate updates
- **Enhancement**: CLAHE enhancement, auto-contrast, gamma correction, image normalization
- **Analysis**: Color analysis, quality metrics, dominant color extraction
- **Blending**: Combine multiple images with alpha blending
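
As an illustration of what "automatic detection coordinate updates" means for a horizontal flip, the sketch below mirrors a bounding box inside an image of known width. It is plain Python, not the PixelFlow API: the `flip_box_horizontal` helper and the `(x1, y1, x2, y2)` pixel box format are assumptions made for the example.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # assumed (x1, y1, x2, y2) in pixels


def flip_box_horizontal(box: Box, image_width: float) -> Box:
    """Mirror a box across the vertical centre line of the image."""
    x1, y1, x2, y2 = box
    # The old right edge becomes the new left edge, and vice versa.
    return (image_width - x2, y1, image_width - x1, y2)


box = (10.0, 20.0, 50.0, 80.0)
flipped = flip_box_horizontal(box, image_width=200.0)
print(flipped)  # (150.0, 20.0, 190.0, 80.0)
```

Flipping twice returns the original box, which is a convenient sanity check for any transform of this kind.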
### Annotation & Privacy
- **Draw Detections**: Bounding boxes, labels, masks, polygons
- **Privacy Protection**: Blur or pixelate regions automatically
- **Object Tracking**: Track objects across video frames
- **Zone Analysis**: Monitor object presence in defined areas
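
Zone analysis boils down to asking whether a detection lies inside a polygon. A common way to answer that, sketched here in plain Python, is a ray-casting test on the centre of the bounding box. The helper names and the `(x1, y1, x2, y2)` box format are assumptions for illustration; PixelFlow's own zone implementation may work differently.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def point_in_polygon(point: Point, polygon: List[Point]) -> bool:
    """Ray-casting test: count polygon edges crossed by a ray going right."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > y) != (y2 > y):
            # x coordinate where the edge crosses that line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def box_center(box: Tuple[float, float, float, float]) -> Point:
    """Centre of an assumed (x1, y1, x2, y2) box."""
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2, (y1 + y2) / 2)


zone = [(0, 0), (100, 0), (100, 100), (0, 100)]
print(point_in_polygon(box_center((40, 40, 60, 60)), zone))    # True
print(point_in_polygon(box_center((140, 40, 160, 60)), zone))  # False
```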
### Input/Output
- **Load**: Images from files or base64 data
- **Save**: Export processed images to disk
- **Web Display**: Convert images to base64 for browser display
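
The base64 round trip behind the Load and Web Display tools can be sketched with the standard library alone. The two helper functions below are hypothetical and only show the encoding itself, not how AgentUI's tools are implemented.

```python
import base64


def encode_image_bytes(data: bytes, mime_type: str = "image/png") -> str:
    """Return a data URI that a browser <img> tag can display directly."""
    encoded = base64.b64encode(data).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def decode_image_data(value: str) -> bytes:
    """Accept either a bare base64 string or a data URI and return raw bytes."""
    if value.startswith("data:"):
        value = value.split(",", 1)[1]
    return base64.b64decode(value)


png_signature = b"\x89PNG\r\n\x1a\n"  # first 8 bytes of any PNG file
uri = encode_image_bytes(png_signature)
print(uri)                                      # data:image/png;base64,iVBORw0KGgo=
print(decode_image_data(uri) == png_signature)  # True
```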
## Usage
### Web Interface
1. **Add tools**: Drag tools from the left palette onto the canvas
2. **Connect**: Click and drag from output ports to input ports
3. **Configure**: Select a tool to edit its parameters in the right panel
4. **Execute**: Click "Run Workflow" to process
5. **View Results**: See outputs in the results panel
6. **Export**: Save your workflow as JSON
### Programmatic Usage
```python
from agentui import Workflow
# Load a workflow created in the UI
workflow = Workflow.load('my_workflow.json')
# Run with an image
result = workflow.run(image='test.jpg')
# Access outputs
detections = result['detections'] # PixelFlow Detections object
print(f"Found {len(detections)} objects")
# Batch processing (automatic)
result = workflow.run(image=['img1.jpg', 'img2.jpg', 'img3.jpg'])
for i, dets in enumerate(result['detections']):
    print(f"Image {i}: {len(dets)} objects")
```
### Workflow Design Philosophy
**AgentUI is designed for visual workflow creation:**
- Create workflows using the drag-and-drop UI
- Export as JSON for version control
- Load and execute programmatically with the Python API
**Why not build workflows in code?** The visual interface is the fastest way to prototype CV pipelines. The Python API focuses on *execution* (loading and running workflows), not construction. This separation keeps the codebase simple and the workflow format UI-native.
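
To give a feel for what "execution" of an exported workflow involves, the sketch below orders the tools of a small graph so that each one runs after the tools feeding it. The JSON layout (`tools`, `connections`, `from`, `to`) and the tool type names are invented for this example; the real AgentUI export format may look quite different, and `Workflow.load` handles all of this for you.

```python
import json
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical export: the real AgentUI JSON schema may differ.
workflow_json = """
{
  "tools": [
    {"id": "load", "type": "LoadImage"},
    {"id": "detect", "type": "ObjectDetection"},
    {"id": "draw", "type": "DrawDetections"}
  ],
  "connections": [
    {"from": "load", "to": "detect"},
    {"from": "load", "to": "draw"},
    {"from": "detect", "to": "draw"}
  ]
}
"""


def execution_order(workflow: dict) -> list:
    """Return tool ids so that every tool runs after the tools feeding it."""
    sorter = TopologicalSorter()
    for tool in workflow["tools"]:
        sorter.add(tool["id"])  # register tools that have no connections
    for edge in workflow["connections"]:
        sorter.add(edge["to"], edge["from"])  # "to" depends on "from"
    return list(sorter.static_order())


order = execution_order(json.loads(workflow_json))
print(order)  # ['load', 'detect', 'draw']
```

Because the file is plain JSON, the same structure that drives execution also diffs cleanly under version control.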
## The Datamarkin Ecosystem
AgentUI integrates two powerful libraries:
- **[PixelFlow](https://pixelflow.datamarkin.com)**: Computer vision primitives (annotation, tracking, spatial analysis)
- **[Mozo](https://mozo.datamarkin.com)**: Universal model serving (object detection, segmentation, depth estimation)
These libraries are maintained by the same team and designed to work together seamlessly.
## Development
### UI Development
Only needed if you're modifying the UI:
```bash
cd ui
npm install
npm run dev # Development server with hot reload at http://localhost:5173
# When done
npm run build # Builds to ../agentui/static/
```
### Adding Custom Tools
Tools are Python classes that inherit from `Tool`:
```python
from typing import Dict

from agentui.core.tool import Tool, ToolOutput, Port, PortType


class MyCustomTool(Tool):
    @property
    def tool_type(self) -> str:
        return "MyCustomTool"

    @property
    def input_ports(self) -> Dict[str, Port]:
        return {"image": Port("image", PortType.IMAGE, "Input image")}

    @property
    def output_ports(self) -> Dict[str, Port]:
        return {"image": Port("image", PortType.IMAGE, "Output image")}

    def process(self) -> bool:
        image = self.inputs["image"].data
        # Do something with the image; this placeholder passes it through unchanged
        processed_image = image
        self.outputs["image"] = ToolOutput(processed_image, PortType.IMAGE)
        return True
```
Tools are automatically discovered by the registry. See `CLAUDE.md` for detailed development guidance.
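
One common way to get this kind of automatic discovery in Python is to have the base class record every subclass as it is defined. The sketch below shows that pattern with a stand-in `BaseTool` class and a module-level `TOOL_REGISTRY` dict, both hypothetical; it is not a description of how AgentUI's registry is actually implemented.

```python
from typing import Dict

# Maps a tool's class name to the class itself.
TOOL_REGISTRY: Dict[str, type] = {}


class BaseTool:
    """Hypothetical base class standing in for agentui's Tool."""

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        # Every subclass registers itself under its class name when defined.
        TOOL_REGISTRY[cls.__name__] = cls


class Grayscale(BaseTool):
    pass


class Invert(BaseTool):
    pass


print(sorted(TOOL_REGISTRY))      # ['Grayscale', 'Invert']
tool = TOOL_REGISTRY["Invert"]()  # look up a class by name and instantiate it
print(type(tool).__name__)        # Invert
```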
## Installation for Development
```bash
git clone https://github.com/datamarkin/agentui
cd agentui
# Python setup
python -m venv venv
source venv/bin/activate # Windows: venv\Scripts\activate
pip install -e .
# Start server
agentui start
# Optional: UI development (only if modifying Svelte code)
cd ui
npm install
npm run build
```
## Roadmap
Future additions will focus on:
- Additional ML models (OCR, classification, keypoint detection)
- Vision-language models (GPT-4V, Claude, Gemini, Qwen-VL)
- Cloud storage integrations (S3, GCS, Azure)
- Advanced tracking and analytics
- Real-time streaming workflows
## Documentation
- **[CLAUDE.md](CLAUDE.md)**: Complete developer guide and architecture documentation
## Requirements
- Python 3.9+
- Optional: Node.js 18+ (only for UI development)
## License
MIT License - see LICENSE file for details
## Contributing
Contributions welcome! Please check `CLAUDE.md` for development guidelines and architecture overview.
---
**Built by [Datamarkin](https://datamarkin.com)** - Making computer vision accessible through visual workflows.