https://github.com/persys-dev/persys-cloud

Community Driven Cloud Automation :)
https://github.com/persys-dev/persys-cloud

automation cloud cluster golang kubernetes pipelines platform sre

Last synced: 3 months ago
JSON representation

Community Driven Cloud Automation :)

Host: GitHub
URL: https://github.com/persys-dev/persys-cloud
Owner: persys-dev
Created: 2023-05-12T13:39:30.000Z (about 3 years ago)
Default Branch: main
Last Pushed: 2026-02-18T15:44:03.000Z (4 months ago)
Last Synced: 2026-02-18T18:39:57.266Z (4 months ago)
Topics: automation, cloud, cluster, golang, kubernetes, pipelines, platform, sre
Language: Go
Homepage: https://persys.cloud
Size: 1.61 MB
Stars: 4
Watchers: 1
Forks: 2
Open Issues: 0
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md

Awesome Lists containing this project

README

# Persys Compute

![License](https://img.shields.io/badge/license-MIT-blue.svg)
![Language](https://img.shields.io/badge/language-Go-00ADD8)
![Architecture](https://img.shields.io/badge/architecture-control--plane-orange)
![Transport](https://img.shields.io/badge/transport-gRPC%20%2B%20mTLS-green)

Persys Compute is a scheduler-driven distributed compute control plane
designed to orchestrate containers, Docker Compose applications, and
virtual machines across heterogeneous infrastructure.

It is built with a clear architectural philosophy:

- Centralized scheduler (authoritative desired state)
- Dumb but reliable agents (local execution only)
- Strongly typed gRPC contracts
- mTLS everywhere
- etcd-backed persistent cluster state
- Explicit reconciliation loops
- Resource-aware placement
- Lease-based node liveness model

This repository currently lives under `persys-cloud`, but the product
name is **Persys Compute**.

------------------------------------------------------------------------

## Vision

Persys Compute is not just a container orchestrator.

It is a programmable compute control plane designed to:

- Run Docker containers
- Run Docker Compose stacks (from Git or inline spec)
- Provision and manage virtual machines
- Enforce CPU, memory, disk limits
- Reconcile drift automatically
- Scale from a single bare-metal server to multi-node clusters
- Support hybrid and federated deployments

The long-term goal is hyperscaler-style infrastructure for private and
hybrid environments --- without unnecessary abstraction or
overengineering.

------------------------------------------------------------------------

## Core Architecture

Persys Compute follows a strict control-plane model.

### Northbound (User → Control Plane)

User → Gateway (HTTP + mTLS) → Scheduler

- REST APIs for platform access
- Authentication & service identity
- Workload submission
- Administrative operations

### Southbound (Control Plane → Nodes)

Scheduler ⇄ Agent (gRPC + mTLS)

- Node registration
- Heartbeat & lease model
- Workload lifecycle execution
- Status reporting

Agents never schedule themselves. Scheduler owns all placement
decisions.

------------------------------------------------------------------------

## Control Plane Components

### persys-scheduler

Authoritative cluster brain.

Responsibilities:

- Node registration & lease management
- Workload scheduling
- Resource-aware placement
- Desired-state reconciliation
- Retry & backoff policies
- etcd state persistence
- Event emission
- Health & metrics endpoints

### persys-gateway

Platform entrypoint.

Responsibilities:

- REST APIs
- mTLS authentication
- Request routing to scheduler
- Webhook ingestion (future use)
- Platform-level authorization

### vault

Certificate authority bootstrap.

- Service identity issuance
- mTLS trust chain
- Agent certificate provisioning

### persys-federation

Future hybrid/multi-cloud integration layer.

- Connect to AWS/GCP/other providers
- Offload workloads
- Aggregate compute resources

### persys-agent (runtime node agent)

Responsibilities:

- Register with scheduler
- Maintain heartbeat
- Enforce resource limits
- Execute containers / compose / VMs
- Report workload status
- Perform local garbage collection
- Expose Prometheus metrics

Agent is intentionally simple and execution-focused.

------------------------------------------------------------------------

## Scheduling Model

1. Agent boots
2. Agent registers with scheduler via gRPC
3. Scheduler stores node in etcd and issues lease
4. Agent sends periodic heartbeat
5. Scheduler tracks node liveness
6. User submits workload
7. Scheduler selects node based on:
- CPU availability
- Memory availability
- Disk pools
- Labels
- Capability matching
8. Scheduler sends ApplyWorkload
9. Agent executes and reports status
10. Reconciler enforces desired state

------------------------------------------------------------------------

## Supported Workload Types

### Containers

- Image-based
- Resource limits
- Env variables
- Volumes
- Ports
- Restart policies
- Privileged mode (optional)

### Docker Compose

- Git-based deployments
- Inline YAML support
- Environment injection
- Secret injection (future: Vault integration)
- Mixed public/private images

### Virtual Machines

- vCPU & memory specification
- Disk provisioning via storage pools
- Cloud-init injection
- Network configuration
- Login credential provisioning
- Future: IP reporting back to scheduler

------------------------------------------------------------------------

### Cluster State Model (etcd)

Scheduler persists:

- /nodes/``{=html}
- /nodes/``{=html}/lease
- /workloads/``{=html}
- /assignments/``{=html}
- /events/``{=html}
- /retries/``{=html}

This enables:

- Auditability
- Recovery after scheduler restart
- Drift detection
- Retry tracking
- Failure history

------------------------------------------------------------------------

## Resource Enforcement

Agent enforces:

- CPU limits
- Memory limits
- Disk allocation
- System threshold rejection (default 80% utilization)
- Orphan resource cleanup
- Zombie workload detection

Workloads are rejected early if capacity is insufficient.

------------------------------------------------------------------------

## Reliability Model

Persys Compute uses:

- Lease-based node liveness
- Heartbeat TTL enforcement
- Idempotent workload operations
- Explicit failure reasons (enum-based)
- Garbage collection loops
- Structured error propagation
- Metrics-first observability

No ghost workloads. No silent failures. No hidden retries.

------------------------------------------------------------------------

## Observability

Each component exposes:

- /metrics (Prometheus)
- /health
- Structured logs

Metrics include:

- Workload counts
- Failure reasons
- Apply duration
- Resource utilization
- Runtime health status
- GC statistics

------------------------------------------------------------------------

## Local Development

### 1. Start full stack

```bash
cd infra/docker
docker compose up -d --build
```

### 2. Build CLI

```bash
cd persysctl
go build -o ./bin/persysctl
```

### 3. Quick smoke commands

```bash
# Cluster view from gateway
./bin/persysctl --transport http cluster list

# List nodes/workloads routed through gateway
./bin/persysctl --transport http node list
./bin/persysctl --transport http workload list
```

------------------------------------------------------------------------

## Project Philosophy

Persys Compute is built around:

- Explicit contracts (protobuf-first design)
- Control-plane correctness
- Minimal but powerful abstraction
- Hyperscaler-inspired architecture
- Avoiding unnecessary AI buzz
- Strong separation of concerns
- Scalability from day one

This is infrastructure designed to scale --- without rewriting the
system when adding more nodes.

------------------------------------------------------------------------

## Roadmap Highlights

- Storage pool full implementation
- VM network introspection & IP reporting
- Retry engine with exponential backoff
- Federation workload offloading
- Secrets integration (Vault)
- Stream-based control channel
- Multi-cluster support

------------------------------------------------------------------------

## License

MIT

ecosyste.ms

Data

Tools

Indexes

Applications

Experiments

Awesome

https://github.com/persys-dev/persys-cloud

Awesome Lists containing this project

README