https://github.com/berrybytes/01agent
Kubernetes Alert Remediation System: an intelligent Kubernetes alert remediation platform powered by LLM agents and LangGraph. It features a modern React web interface (L0) and a specialized remediation agent (L1) that analyzes monitoring alerts, retrieves live cluster context via MCP, and generates executable remediation scripts.
a2a-protocol agentic-ai-development agentic-workflow deepagents-langgraph k8s kubernetes lanchain langraph mcp-server sre-agent-tools
- Host: GitHub
- URL: https://github.com/berrybytes/01agent
- Owner: BerryBytes
- License: mit
- Created: 2026-03-06T06:05:13.000Z (3 months ago)
- Default Branch: main
- Last Pushed: 2026-04-07T11:51:23.000Z (about 2 months ago)
- Last Synced: 2026-04-07T13:29:12.761Z (about 2 months ago)
- Topics: a2a-protocol, agentic-ai-development, agentic-workflow, deepagents-langgraph, k8s, kubernetes, lanchain, langraph, mcp-server, sre-agent-tools
- Language: Python
- Homepage:
- Size: 663 KB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 1
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- Contributing: CONTRIBUTING.md
- License: LICENSE.md
- Code of conduct: CODE_OF_CONDUCT.md
# 01Agents: Kubernetes Alert Remediation System
01Agents is a Kubernetes alert remediation system with a modern web interface and an intelligent remediation agent. The platform analyzes monitoring alerts, fetches live cluster context, and generates executable remediation scripts through LLM‑powered workflows.
## 🏗️ Architecture Overview
The system follows a two‑tier hierarchical architecture:
```
Monitoring Alerts → L0 Client (UI) → L1 Remediation Agent → Remediation Scripts
```
### **L0 Client: React Router v7 Web Interface**
- **Role**: Modern web interface for alert management and visualization.
- **Port**: Default `3000`.
### **L1: Kubernetes Alert Remediation Agent**
- **Role**: LangGraph‑powered remediation specialist using specialized subagents.
- **Port**: Default `10001`.
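The flow in the overview can be pictured as a short chain of stages. The sketch below is purely illustrative: the `Alert` fields, the function names, and the returned snippet are hypothetical stand-ins, since this README does not show the real alert payload or the L1 agent's internals.

```python
from dataclasses import dataclass


@dataclass
class Alert:
    """Hypothetical alert shape; the real payload format is not shown here."""
    name: str
    namespace: str
    summary: str


def fetch_cluster_context(alert: Alert) -> dict:
    # Stand-in for the live lookups L1 performs through the MCP server.
    return {"namespace": alert.namespace, "events": []}


def generate_script(alert: Alert, context: dict) -> str:
    # Stand-in for the LLM step; returns a fixed shell line for illustration.
    return f"kubectl get pods -n {context['namespace']}  # triage for {alert.name}"


def remediate(alert: Alert) -> str:
    """Run the stages in the order the overview describes."""
    context = fetch_cluster_context(alert)
    return generate_script(alert, context)
```

The point is only the ordering: an alert arrives from the L0 client, L1 gathers cluster context, and a script comes back.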
---
## 🚀 Quick Start
### Clone the Repository
```bash
git clone git@github.com:BerryBytes/01agent.git
cd 01agent
```
### Choose Your Deployment Approach
- **[Approach 1: Using Pre-built Images](#approach-1-using-pre-built-images)** (Recommended for most users)
- **[Approach 2: Building Custom Images](#approach-2-building-custom-images)** (For developers modifying source code)
---
## Approach 1: Using Pre-built Images
Deploy using ready-to-use images from the `01community` registry. Perfect for quick setup and production use.
### Prerequisites
- **Kubernetes cluster** (Kind, Minikube, or Cloud)
- **Helm 3+**
- **LLM API Key** (OpenRouter, DeepSeek, Google, or Anthropic)
### Step 1: Install MCP Server for Kubernetes
The MCP (Model Context Protocol) Server provides Kubernetes tools for the agents.
#### Clone the MCP Server repository
```bash
git clone https://github.com/Flux159/mcp-server-kubernetes.git
```
#### Update the schema (Required)
```bash
python3 -c "
import json
with open('mcp-server-kubernetes/helm-chart/values.schema.json') as f:
    schema = json.load(f)
schema['properties']['observability'] = {
    'type': 'object',
    'additionalProperties': True
}
with open('mcp-server-kubernetes/helm-chart/values.schema.json', 'w') as f:
    json.dump(schema, f, indent=2)
print('Schema updated successfully!')
"
```
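If the one-liner is awkward to rerun, the same patch can be kept as a small script. This is a minimal sketch, not part of the repository: the function name `allow_observability` is hypothetical, and it assumes the schema file is plain JSON. Unlike the inline version, it reports whether anything changed, so running it twice is harmless.

```python
import json
from pathlib import Path


def allow_observability(schema_path: str) -> bool:
    """Add a permissive `observability` property to a Helm values schema.

    Returns True if the file was changed, False if the entry was already there.
    """
    path = Path(schema_path)
    schema = json.loads(path.read_text())
    wanted = {"type": "object", "additionalProperties": True}
    # Create the top-level "properties" object if the schema lacks one.
    properties = schema.setdefault("properties", {})
    if properties.get("observability") == wanted:
        return False
    properties["observability"] = wanted
    path.write_text(json.dumps(schema, indent=2) + "\n")
    return True
```

Calling `allow_observability("mcp-server-kubernetes/helm-chart/values.schema.json")` from the directory that contains the cloned MCP server applies the change.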
#### Install the MCP Server
```bash
helm install mcp-server ./mcp-server-kubernetes/helm-chart \
  --set kubeconfig.provider=serviceaccount \
  --set transport.mode=http \
  --set transport.service.type=ClusterIP \
  --set security.allowOnlyNonDestructive=false \
  --create-namespace \
  --namespace mcp-system
```
### Step 2: Install PostgreSQL Operator
The agents use PostgreSQL for long-term memory. We use the CrunchyData Operator to manage it.
#### Install OLM (Operator Lifecycle Manager)
```bash
curl -sL https://github.com/operator-framework/operator-lifecycle-manager/releases/download/v0.40.0/install.sh | bash -s v0.40.0
```
> [!IMPORTANT]
> OLM installation can take a few minutes. Please ensure all OLM pods in the `olm` namespace are in the **Running** state before proceeding.
#### Install the PostgreSQL Operator
```bash
kubectl create -f https://operatorhub.io/install/postgresql.yaml
```
### Step 3: Configure Agent Settings
Edit `helm-chart/values.yaml` to configure your LLM provider and API key.
Set `MODEL_PROVIDER` and `MODEL_NAME`, and provide the matching API key in the `secret` section.
```yaml
agents:
  - name: l0
    enabled: true
    image: 01community/agent-l0:v1
  - name: l1
    enabled: true
    image: 01community/agent-l1:v1
    env:
      MODEL_PROVIDER: deepseek # options: gemini, openai, openrouter, anthropic, deepseek
      MODEL_NAME: deepseek-chat # examples: gemini-2.0-flash, gpt-4o, claude-3-5-sonnet
      MCP_SERVER_URL: http://mcp-server-mcp-server-kubernetes.mcp-system.svc.cluster.local:3001/mcp
      ENABLE_K8S_TOOLS: "true"
      STM_ENABLE_POSTGRES: "true"
    usePostgresql: true
    secret:
      DEEPSEEK_API_KEY: "your-api-key-here"
      # GOOGLE_API_KEY: "your-api-key"
      # OPENAI_API_KEY: "your-api-key"
      # OPENROUTER_API_KEY: "your-api-key"
      # ANTHROPIC_API_KEY: "your-api-key"
```
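A mismatch between `MODEL_PROVIDER` and the key under `secret` is an easy mistake to make here. The sketch below shows one way to catch it before deploying. The provider-to-key pairing is an assumption inferred from the names in the snippet above (in particular `gemini` mapped to `GOOGLE_API_KEY`); the chart itself may resolve keys differently. The function name is hypothetical.

```python
from typing import Optional

# Assumed pairing, inferred from the option list and the commented-out keys.
KEY_FOR_PROVIDER = {
    "deepseek": "DEEPSEEK_API_KEY",
    "gemini": "GOOGLE_API_KEY",
    "openai": "OPENAI_API_KEY",
    "openrouter": "OPENROUTER_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
}


def missing_api_key(env: dict, secret: dict) -> Optional[str]:
    """Return the secret name that still needs a value, or None if it is set.

    Raises KeyError for a provider outside the list above.
    """
    key = KEY_FOR_PROVIDER[env["MODEL_PROVIDER"]]
    value = secret.get(key, "")
    # Treat the README placeholders as "not set".
    if not value or value.startswith("your-api-key"):
        return key
    return None
```

The two arguments correspond to the `env` and `secret` mappings of the `l1` agent after loading `helm-chart/values.yaml` with any YAML parser.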
### Step 4: Deploy 01Agents
```bash
helm upgrade --install 01agent ./helm-chart -n 01cloud --create-namespace
```
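Once the release is installed, pod status can be checked by hand with `kubectl get pods -n 01cloud`, or scripted. The helper below is a minimal sketch with hypothetical function names: it parses the JSON that `kubectl get pods -o json` prints and lists every pod whose phase is not `Running`. It assumes `kubectl` is on the `PATH` and pointed at the right cluster.

```python
import json
import subprocess


def pods_not_running(pod_list: dict) -> list:
    """Return the names of pods whose phase is anything other than Running."""
    return [
        item["metadata"]["name"]
        for item in pod_list.get("items", [])
        if item.get("status", {}).get("phase") != "Running"
    ]


def check_namespace(namespace: str) -> list:
    """Ask kubectl for the pods in a namespace and report the stragglers."""
    out = subprocess.run(
        ["kubectl", "get", "pods", "-n", namespace, "-o", "json"],
        check=True,
        capture_output=True,
        text=True,
    ).stdout
    return pods_not_running(json.loads(out))
```

An empty list from `check_namespace("01cloud")` means every pod reports `Running`; the same call with `"olm"` covers the OLM check from Step 2. Note that pods that have finished successfully report `Succeeded` and would be listed too.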
> [!IMPORTANT]
> Ensure all pods in the `01cloud` namespace have initialized successfully and are running before continuing.
### Step 5: Access the Application
Port-forward the UI service:
```bash
kubectl port-forward svc/agent-l0 -n 01cloud 3000:3000
```
Open your browser and navigate to `http://localhost:3000`.
### Step 6: Monitoring & Observability (Optional)
For advanced monitoring with Grafana, Loki, Tempo, and OpenTelemetry, see [OTEL-setup.md](./OTEL-setup.md). These features are **disabled by default** and should only be enabled if you have the observability stack configured.
---
## Approach 2: Building Custom Images
Build and deploy your own modified images from source code. Ideal for developers customizing the system.
### Prerequisites
- **Kubernetes cluster** (Kind, Minikube, or Cloud)
- **Helm 3+**
- **Docker** (for building images)
- **Node.js 25+** & **npm** (for L0 Client modifications)
- **Python 3.11+** (for L1 Agent modifications)
- **LLM API Key** (OpenRouter, DeepSeek, Google, or Anthropic)
### Step 1: Install MCP Server for Kubernetes
The MCP (Model Context Protocol) Server provides Kubernetes tools for the agents.
#### Clone the MCP Server repository
```bash
git clone https://github.com/Flux159/mcp-server-kubernetes.git
```
#### Update the schema (Required)
```bash
python3 -c "
import json
with open('mcp-server-kubernetes/helm-chart/values.schema.json') as f:
    schema = json.load(f)
schema['properties']['observability'] = {
    'type': 'object',
    'additionalProperties': True
}
with open('mcp-server-kubernetes/helm-chart/values.schema.json', 'w') as f:
    json.dump(schema, f, indent=2)
print('Schema updated successfully!')
"
```
#### Install the MCP Server
```bash
helm install mcp-server ./mcp-server-kubernetes/helm-chart \
  --set kubeconfig.provider=serviceaccount \
  --set transport.mode=http \
  --set transport.service.type=ClusterIP \
  --set security.allowOnlyNonDestructive=false \
  --create-namespace \
  --namespace mcp-system
```
### Step 2: Install PostgreSQL Operator
The agents use PostgreSQL for long-term memory. We use the CrunchyData Operator to manage it.
#### Install OLM (Operator Lifecycle Manager)
```bash
curl -sL https://github.com/operator-framework/operator-lifecycle-manager/releases/download/v0.40.0/install.sh | bash -s v0.40.0
```
> [!IMPORTANT]
> OLM installation can take a few minutes. Please ensure all OLM pods in the `olm` namespace are in the **Running** state before proceeding.
#### Install the PostgreSQL Operator
```bash
kubectl create -f https://operatorhub.io/install/postgresql.yaml
```
### Step 3: Build Custom Images
Navigate to the repository root and build your images:
#### Build L0 Frontend
```bash
docker build -t your-registry/agent-l0:latest ./k8s-agent/level-0-agent
```
#### Build L1 Backend Agent
```bash
docker build -t your-registry/agent-l1:latest ./k8s-agent/level-1-agent
```
### Step 4: Push or Load Images
**For remote clusters** - Push to your registry:
```bash
docker push your-registry/agent-l0:latest
docker push your-registry/agent-l1:latest
```
**For local Kind clusters** - Load images directly (replace `01cloud-cluster` with the name of your Kind cluster):
```bash
kind load docker-image your-registry/agent-l0:latest --name 01cloud-cluster
kind load docker-image your-registry/agent-l1:latest --name 01cloud-cluster
```
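The build and push-or-load steps differ only in the last command, so they can be generated from one place. The sketch below is an illustration rather than project tooling: `image_commands` is a hypothetical helper that returns the argument lists for the commands shown above, choosing `kind load` when a cluster name is given and `docker push` otherwise.

```python
from typing import Optional

# Build contexts taken from the build commands in Step 3.
AGENT_DIRS = {
    "agent-l0": "./k8s-agent/level-0-agent",
    "agent-l1": "./k8s-agent/level-1-agent",
}


def image_commands(registry: str, tag: str, kind_cluster: Optional[str] = None) -> list:
    """Return the docker/kind commands for both agents as argument lists."""
    commands = []
    for name, context in AGENT_DIRS.items():
        image = f"{registry}/{name}:{tag}"
        commands.append(["docker", "build", "-t", image, context])
        if kind_cluster:
            # Local Kind cluster: copy the image straight onto the nodes.
            commands.append(["kind", "load", "docker-image", image, "--name", kind_cluster])
        else:
            # Remote cluster: publish to the registry instead.
            commands.append(["docker", "push", image])
    return commands
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` from the repository root.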
### Step 5: Configure Agent Settings
Edit `helm-chart/values.yaml` to use your custom images and configure your LLM provider.
Set `MODEL_PROVIDER`, `MODEL_NAME`, and `image`, and provide the matching API key in the `secret` section.
```yaml
agents:
  - name: l0
    enabled: true
    image: your-registry/agent-l0:latest # Your custom image
  - name: l1
    enabled: true
    image: your-registry/agent-l1:latest # Your custom image
    env:
      MODEL_PROVIDER: deepseek # options: gemini, openai, openrouter, anthropic, deepseek
      MODEL_NAME: deepseek-chat # examples: gemini-2.0-flash, gpt-4o, claude-3-5-sonnet
      MCP_SERVER_URL: http://mcp-server-mcp-server-kubernetes.mcp-system.svc.cluster.local:3001/mcp
      ENABLE_K8S_TOOLS: "true"
      STM_ENABLE_POSTGRES: "true"
    usePostgresql: true
    secret:
      DEEPSEEK_API_KEY: "your-api-key-here"
      # GOOGLE_API_KEY: "your-api-key"
      # OPENAI_API_KEY: "your-api-key"
      # OPENROUTER_API_KEY: "your-api-key"
      # ANTHROPIC_API_KEY: "your-api-key"
```
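`MCP_SERVER_URL` has to change if the MCP server was installed under a different release name or namespace. The sketch below shows how the value above appears to be assembled. The `<service>.<namespace>.svc.cluster.local` form is standard Kubernetes service DNS; the `<release>-<chart>` service name is an inference from the URL in this README, not something confirmed against the chart. The function name is hypothetical.

```python
def mcp_server_url(
    release: str = "mcp-server",
    chart: str = "mcp-server-kubernetes",
    namespace: str = "mcp-system",
    port: int = 3001,
) -> str:
    """Build the in-cluster URL of the MCP server's HTTP endpoint."""
    # Assumed naming: the chart exposes a Service called <release>-<chart>.
    service = f"{release}-{chart}"
    return f"http://{service}.{namespace}.svc.cluster.local:{port}/mcp"
```

With the defaults, which mirror the `helm install` command in Step 1, the function returns the URL used in `values.yaml`. Running `kubectl get svc -n mcp-system` confirms the actual service name.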
### Step 6: Deploy 01Agents
```bash
helm upgrade --install 01agent ./helm-chart -n 01cloud --create-namespace
```
> [!IMPORTANT]
> Ensure all pods in the `01cloud` namespace have initialized successfully and are running before continuing.
### Step 7: Access the Application
Port-forward the UI service:
```bash
kubectl port-forward svc/agent-l0 -n 01cloud 3000:3000
```
Open your browser and navigate to `http://localhost:3000`.
### Step 8: Monitoring & Observability (Optional)
For advanced monitoring with Grafana, Loki, Tempo, and OpenTelemetry, see [OTEL-setup.md](./OTEL-setup.md).
---
## 🔌 Local Development Tools
Access these services via port-forwarding for development and debugging:
```bash
# PostgreSQL Database
kubectl port-forward svc/agents-primary -n 01cloud 5432:5432
# MCP Server
kubectl port-forward svc/mcp-server-mcp-server-kubernetes -n mcp-system 3001:3001
```
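A port-forward that has silently dropped is a common source of confusing connection errors. The following sketch checks whether the forwarded ports accept TCP connections on the local machine. The helper names are hypothetical; the ports are the ones used in the commands above.

```python
import socket


def port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


# Local ports from the port-forward commands above.
FORWARDS = {"PostgreSQL": 5432, "MCP Server": 3001}


def forward_status() -> dict:
    """Map each forwarded service to whether its local port is reachable."""
    return {name: port_open("127.0.0.1", port) for name, port in FORWARDS.items()}
```

This only confirms that something is listening locally; it does not verify that the database or the MCP server behind the tunnel is healthy.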
---
## 📁 Repository Structure
- `helm-chart/`: Core Helm chart for deployment
- `k8s-agent/`: Source code for L0 and L1 agents
- `OTEL-setup.md`: Optional guide for monitoring (Prometheus, Grafana, Loki)
---
## 🤝 Contributing
We welcome contributions from the community! Whether you are reporting a bug, suggesting a feature, or submitting a pull request, your help is appreciated.
- **Found a bug?** [Open an issue](https://github.com/BerryBytes/01agent/issues/new?template=bug_report.md)
- **Have a feature idea?** [Suggest it here](https://github.com/BerryBytes/01agent/issues/new?template=feature_request.md)
- **Want to contribute code?** Check out our [Contributing Guidelines](./CONTRIBUTING.md)
This project is maintained by [Bishal Singh (@bsalsingh)](https://github.com/bsalsingh) and the 01Cloud community.
---
## 📜 Code of Conduct
To ensure a welcoming and inclusive community, please review and follow our [Code of Conduct](./CODE_OF_CONDUCT.md).
---
## ⚖️ License
This project is licensed under the [MIT License](./LICENSE.md).
---
## 📚 Component Documentation
- [L0 Client (Frontend)](k8s-agent/level-0-agent/README.md)
- [L1 Agent (Backend)](k8s-agent/level-1-agent/README.md)