https://github.com/The-Swarm-Corporation/AgentOS

AgentOS implements a comprehensive security architecture leveraging containerization, orchestration, and multi-layer isolation to ensure secure execution of autonomous agents.

agentos agents ai ml multi-agent operating-system swarms


# AgentOS

[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
[![Python 3.8+](https://img.shields.io/badge/python-3.8+-blue.svg)](https://www.python.org/downloads/)
[![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)

A minimal, production-ready implementation of Andrej Karpathy's Agent Operating System architecture, developed by Swarms.ai and partners.

![AgentOS Architecture](https://miro.medium.com/v2/resize:fit:748/1*quuHoEjoCzxvu5lVp_SMEQ@2x.jpeg)

## Overview

AgentOS is a lightweight, single-file implementation that provides a foundation for building autonomous AI agents. It implements the core concepts of Karpathy's Agent OS architecture while staying simple and extensible. It is developed by [Swarms.ai](https://swarms.ai) and its partners.

## Features

- **Unified Model Interface**: Seamless integration with multiple LLM providers through LiteLLM
  - Support for Anthropic Claude models (Opus, Sonnet, Haiku)
  - Integration with OpenAI GPT models
  - Access to optimized variants (GPT-4o, GPT-4o-mini)
- **Browser Automation**: Built-in browser agent capabilities for web interaction using browser-use
- **Multi-Modal Support**:
  - Text processing and generation
  - Video analysis through Google's Gemini models
  - Audio processing and speech synthesis
  - Image handling capabilities
- **Resource Management**:
  - Efficient handling of computational resources
  - Dynamic model selection based on task requirements
  - Automatic GPU/CPU optimization
- **HuggingFace Integration**:
  - Direct access to open-source models
  - Support for text generation and multiple NLP tasks
  - Automatic model quantization and optimization
- **Extensible Architecture**: Easy to add new capabilities and tools

## Core Components

- **Model Management**: Dynamic selection and utilization of language models
- **Browser Automation**: Autonomous web-based task execution
- **Resource Orchestration**: Efficient management of computational resources
- **Context Management**: Maintains system state and task dependencies
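
To make "dynamic selection" of language models concrete, here is a minimal sketch of a routing heuristic. It is not AgentOS's actual logic: the function name `select_model`, the `complex_reasoning` flag, and the 500-character threshold are all hypothetical, and the only model names used are the GPT-4o variants listed under Features.

```python
def select_model(
    task: str,
    complex_reasoning: bool = False,
    max_cheap_chars: int = 500,
) -> str:
    """Pick a model name for a task (hypothetical heuristic, not AgentOS's).

    Short, simple tasks go to the smaller model; long tasks or tasks
    flagged as needing complex reasoning go to the larger one.
    """
    if complex_reasoning or len(task) > max_cheap_chars:
        return "gpt-4o"
    return "gpt-4o-mini"


if __name__ == "__main__":
    print(select_model("Summarize this sentence."))
    print(select_model("Plan a multi-step data migration.", complex_reasoning=True))
```

A real selector would likely also weigh modality (text, video, audio), cost, and available hardware. The point here is only that selection can be a small, testable function kept separate from the agent loop.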

## Installation

```bash
pip3 install -U agentos-sdk
```

## Usage

```python
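from agentos_sdk import AgentOS
from dotenv import load_dotenv

# Load environment variables (such as provider API keys) from a local .env file.
load_dotenv()

agent = AgentOS(plan_on=False, max_loops=1)

# Describe the task in natural language; the agent chooses which tools to use.
agent.run(
    "Generate a video of a cat surfing on a wave at sunset, cinematic style. "
    "Save it as 'cat_surfing.mp4'. We should also add cat sounds and meowing sounds."
)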
```
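
Because the example relies on `load_dotenv()` to supply credentials, it can help to check for missing keys before starting a run. The sketch below is an assumption-laden illustration: `missing_keys` and `REQUIRED_KEYS` are hypothetical names, and the keys AgentOS actually needs depend on which providers and tools you enable. `OPENAI_API_KEY` and `ANTHROPIC_API_KEY` are the conventional variable names for those two providers.

```python
import os
from typing import Iterable, List, Mapping, Optional

# Assumption: adjust this tuple to the providers you actually use.
REQUIRED_KEYS = ("OPENAI_API_KEY", "ANTHROPIC_API_KEY")


def missing_keys(
    required: Iterable[str] = REQUIRED_KEYS,
    env: Optional[Mapping[str, str]] = None,
) -> List[str]:
    """Return the names in `required` that are unset or blank in `env`."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name, "").strip()]


if __name__ == "__main__":
    absent = missing_keys()
    if absent:
        print("Missing environment variables:", ", ".join(absent))
```

Calling `missing_keys()` right after `load_dotenv()` surfaces a configuration problem before any model call is attempted.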

## Available Tools

AgentOS comes with a powerful set of built-in tools that enable various capabilities. Here's a comprehensive list of all available tools:

| Tool Name | Description | Use Case Examples |
|-----------|-------------|-------------------|
| Browser Agent | Autonomous web browser automation tool that can navigate websites, extract information, and perform web-based tasks | Web scraping, form filling, data extraction, website testing |
| Hugging Face Model | Interface for using various Hugging Face models for text generation and other NLP tasks | Text generation, language translation, text classification, custom model inference |
| LiteLLM Model | Unified interface for multiple LLM providers including OpenAI, Anthropic, and others | Text generation, chat completion, content creation, advanced reasoning |
| Safe Calculator | Secure mathematical expression evaluator with built-in safety checks | Mathematical calculations, formula evaluation, secure computation, numeric processing |
| Terminal Developer Agent | Advanced agent for performing terminal operations and development tasks | File operations, code execution, system commands, development tasks |
| Generate Speech | Text-to-speech conversion tool supporting multiple voices and models | Audio content creation, voice synthesis, accessibility features, audio narration |
| Generate Video | AI-powered video generation tool using Google's Veo 3.0 model | Video content creation, visual storytelling, animation generation, creative content |
| Create Files | Tool for creating new files in the workspace with specified content | Document creation, code file generation, report writing, configuration files |
| Update Files | Tool for updating existing files by overwriting their content | Content modification, file updates, document revisions, configuration changes |
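
As an illustration of what a "safe" expression evaluator can look like, the sketch below parses the input with Python's `ast` module and walks only whitelisted node types instead of calling `eval`. This is one common technique, not necessarily how AgentOS's Safe Calculator is implemented. The names `safe_eval` and `_eval_node` and the exponent cap of 100 are assumptions made for the example.

```python
import ast
import operator

# Whitelisted operators; anything else is rejected.
_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Mod: operator.mod,
    ast.Pow: operator.pow,
}
_UNARY_OPS = {ast.UAdd: operator.pos, ast.USub: operator.neg}
_MAX_EXPONENT = 100  # crude guard against huge results (assumed limit)


def _eval_node(node):
    """Recursively evaluate a whitelisted arithmetic AST node."""
    if isinstance(node, ast.Constant):
        value = node.value
        # bool is a subclass of int, so exclude it explicitly.
        if isinstance(value, (int, float)) and not isinstance(value, bool):
            return value
    elif isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
        left, right = _eval_node(node.left), _eval_node(node.right)
        if isinstance(node.op, ast.Pow) and abs(right) > _MAX_EXPONENT:
            raise ValueError("Exponent too large")
        return _BINARY_OPS[type(node.op)](left, right)
    elif isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
        return _UNARY_OPS[type(node.op)](_eval_node(node.operand))
    raise ValueError(f"Unsupported expression element: {type(node).__name__}")


def safe_eval(expression: str):
    """Evaluate an arithmetic expression without using eval()."""
    tree = ast.parse(expression, mode="eval")
    return _eval_node(tree.body)


if __name__ == "__main__":
    print(safe_eval("2 + 3 * 4"))
```

Names, function calls, attribute access, and every other node type fall through to the final `ValueError`, so an input such as `__import__('os').system('ls')` is rejected rather than executed.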

## Community

Join our community of agent engineers and researchers for technical support, cutting-edge updates, and exclusive access to world-class agent engineering insights!

| Platform | Description | Link |
|----------|-------------|------|
| 📚 Documentation | Official documentation and guides | [docs.swarms.world](https://docs.swarms.world) |
| 📝 Blog | Latest updates and technical articles | [Medium](https://medium.com/@kyeg) |
| 💬 Discord | Live chat and community support | [Join Discord](https://discord.gg/jM3Z6M9uMq) |
| 🐦 Twitter | Latest news and announcements | [@kyegomez](https://twitter.com/swarms_corp) |
| 👥 LinkedIn | Professional network and updates | [The Swarm Corporation](https://www.linkedin.com/company/the-swarm-corporation) |
| 📺 YouTube | Tutorials and demos | [Swarms Channel](https://www.youtube.com/channel/UC9yXyitkbU_WSy7bd_41SqQ) |
| 🎫 Events | Join our community events | [Sign up here](https://lu.ma/5p2jnc2v) |
| 🚀 Onboarding Session | Get onboarded with Kye Gomez, creator and lead maintainer of Swarms | [Book Session](https://cal.com/swarms/swarms-onboarding-session) |

## Contributing

We welcome contributions from the community. Please see our contributing guidelines for more information.

## License

This project is licensed under the MIT License.

## Todo

- [ ] Add a deep research agent or sub-agent
- [ ] Implement video and audio processing
- [ ] Create a better system prompt and add multi-shot examples showing when to use certain tools