https://github.com/amitbasuri/taskqueue-runner-go
Distributed background task processing for Go, backed by PostgreSQL and Kubernetes—retries, worker pool, locking, and real-time dashboard.
https://github.com/amitbasuri/taskqueue-runner-go
golang kubernetes postgresql task-queue
Last synced: 5 months ago
JSON representation
Distributed background task processing for Go, backed by PostgreSQL and Kubernetes—retries, worker pool, locking, and real-time dashboard.
- Host: GitHub
- URL: https://github.com/amitbasuri/taskqueue-runner-go
- Owner: amitbasuri
- License: mit
- Created: 2025-12-24T12:13:52.000Z (6 months ago)
- Default Branch: main
- Last Pushed: 2025-12-24T22:14:51.000Z (6 months ago)
- Last Synced: 2025-12-26T03:52:43.398Z (6 months ago)
- Topics: golang, kubernetes, postgresql, task-queue
- Language: Go
- Homepage:
- Size: 90.8 KB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
- Contributing: CONTRIBUTING.md
- License: LICENSE
- Code of conduct: CODE_OF_CONDUCT.md
- Security: SECURITY.md
Awesome Lists containing this project
README
# TaskQueue-Go
[](https://golang.org)
[](LICENSE)
[](https://github.com/amitbasuri/taskqueue-runner-go/actions)
[](https://goreportcard.com/report/github.com/amitbasuri/taskqueue-runner-go)
> A production-ready distributed background task processing system built with Go, PostgreSQL, and Kubernetes.
**Key Features:**
- ✅ Priority-based task scheduling with FIFO ordering
- ✅ Automatic retries with exponential backoff
- ✅ Concurrent task processing with worker pool (5 goroutines/instance)
- ✅ Real-time monitoring dashboard with Server-Sent Events
- ✅ Distributed lock-based task claiming (prevents duplicate execution)
- ✅ Full Kubernetes deployment support with Helm
- ✅ Comprehensive integration tests (11/11 passing)
- ✅ Row-level locking with PostgreSQL FOR UPDATE SKIP LOCKED
---
## 📋 Table of Contents
- [About](#about)
- [Architecture](#architecture)
- [Design Choices](#design-choices)
- [Quick Start](#quick-start)
- [Testing](#testing)
- [API Reference](#api-reference)
- [Configuration](#configuration)
- [Contributing](#contributing)
- [License](#license)
---
## 📖 About
TaskQueue-Go is a production-ready distributed task queue system designed for reliability, scalability, and ease of use. Built with Go and PostgreSQL, it provides robust background job processing with automatic retries, real-time monitoring, and horizontal scalability.
**Perfect for:**
- Background job processing (emails, reports, data processing)
- Scheduled tasks and cron-like operations
- Distributed workflows requiring retry logic
- Systems needing audit trails and task history
- Microservices requiring asynchronous task processing
**Why TaskQueue-Go?**
- 🚀 **Production-Ready**: Comprehensive error handling, monitoring, and logging
- 🔒 **Reliable**: PostgreSQL-backed with ACID guarantees and row-level locking
- 📈 **Scalable**: Horizontal scaling with multiple worker instances
- 🎯 **Developer-Friendly**: Clear API, excellent documentation, easy deployment
- 🧪 **Well-Tested**: 11 comprehensive integration tests covering all scenarios
---
## 🏗️ Architecture
### System Overview
```
┌──────────────┐ ┌──────────────────┐ ┌─────────────────┐
│ │ │ │ │ │
│ API Server │────────▶│ PostgreSQL │◀────────│ Workers │
│ (Producer) │ POST │ (Durable Queue) │ CLAIM │ (Consumers) │
│ │ Tasks │ │ Tasks │ │
└──────────────┘ └──────────────────┘ └─────────────────┘
```
### Components
| Component | Purpose | Technology |
|-----------|---------|------------|
| **API Server** | REST API for task management | Go, Chi router |
| **PostgreSQL** | Durable task queue and state storage | PostgreSQL 16 |
| **Workers** | Execute tasks with retry logic | Go, worker pool |
| **Dashboard** | Real-time monitoring UI | HTML/JS with SSE |
### How It Works
1. **Client** submits task via REST API (`POST /api/tasks`)
2. **API Server** validates and stores task in PostgreSQL with status `queued`
3. **Worker** polls database using `SELECT FOR UPDATE SKIP LOCKED`
4. **Worker** claims task (sets `locked_until` timestamp)
5. **Worker** executes task handler
6. **Worker** updates status to `succeeded` or schedules retry if failed
7. **Dashboard** displays real-time updates via Server-Sent Events
---
## 🎯 Design Choices
### . Dispatcher Pattern
**Problem:** Multiple workers polling database creates "thundering herd"
**Solution:** Single dispatcher goroutine + worker pool
```
Worker Process:
Dispatcher (1 goroutine) ──polls DB──> Claims 1 task
│
└──> Buffered Channel (size 10)
│
└──> Worker Pool (5 goroutines)
├─> Worker 1
├─> Worker 2
├─> Worker 3
├─> Worker 4
└─> Worker 5
```
**Benefits:**
- **1 DB query** instead of 50 per poll cycle
- **Buffered channel** provides backpressure
- **No worker starvation** - dispatcher ensures fair distribution
### 3. Exponential Backoff with Jitter
**Formula:**
```
backoff = min(2^retry_count, 2^20) seconds ± 25% jitter
minimum = 1 second
```
**Example retry schedule:**
- Attempt 1: ~1-1.5s delay
- Attempt 2: ~2-3s delay
- Attempt 3: ~4-6s delay
- Attempt 4: Max retries reached, task marked `failed`
**Why jitter?**
- Prevents retry storms (many tasks retrying simultaneously)
- Spreads load over time instead of synchronized spikes
### 4. Lock Expiration
**Problem:** Worker crashes while holding lock → task stuck forever
**Solution:** 30-second lock timeout
```sql
WHERE (status = 'queued')
OR (status = 'running' AND locked_until < NOW())
ORDER BY
CASE WHEN status = 'running' THEN 0 ELSE 1 END, -- Expired locks first
priority DESC,
created_at ASC
```
**Why prioritize expired locks?**
- Prevents task starvation
- Failed workers don't block the queue
- Respects original priority after recovery
### 5. SELECT FOR UPDATE SKIP LOCKED
**Problem:** Multiple workers trying to claim same task
**Solution:** PostgreSQL row-level locking with `SKIP LOCKED`
```sql
SELECT * FROM tasks
WHERE status = 'queued'
ORDER BY priority DESC
LIMIT 1
FOR UPDATE SKIP LOCKED
```
**How it works:**
- Worker A locks row → Worker B skips it, tries next row
- Zero contention between workers
- No deadlocks or retries needed
---
## 🚀 Quick Start
### Prerequisites
- Docker & Docker Compose
- Go 1.21+ (for local development)
- Node.js 18+ (for integration tests)
- kubectl, kind & Helm (for Kubernetes testing)
### Run with Docker Compose
```bash
# Start all services
docker-compose up -d
# Wait for startup
sleep 15
# Verify health
curl http://localhost:8080/health
# Open dashboard
open http://localhost:8080
```
### Run with Kubernetes (kind)
```bash
# Deploy to kind cluster with Bitnami PostgreSQL Helm chart
make run-k8s
# Automatically:
# - Creates kind cluster with port mappings
# - Deploys PostgreSQL via Bitnami Helm chart
# - Deploys server (2 replicas) and worker (3 replicas)
# - Configures NodePort service for external access
# Access the application
open http://localhost:8080
# View deployment status
kubectl get pods -n task-queue
# Check logs
kubectl logs -f deployment/task-queue-server -n task-queue
kubectl logs -f deployment/task-queue-worker -n task-queue
```
For detailed Kubernetes documentation, see [`k8s/README.md`](k8s/README.md).
---
## 🧪 Testing
### Run All Integration Tests
```bash
# Test Docker Compose deployment
make test-integration-docker
# Test Kubernetes deployment
make test-integration-k8s
```
### Test Results
All 11 integration tests pass successfully:
| Test | Description |
|------|-------------|
| Basic lifecycle | Task creation → execution → completion |
| Priority ordering | High-priority tasks processed first |
| Task failures | Retry logic and backoff |
| Failure types | Errors vs timeouts |
| Timeout retries | Slow tasks retry correctly |
| Retry history | Complete audit trail |
| Concurrent processing | 20 tasks with 5 workers |
| Statistics API | Metrics accuracy |
| Invalid task type | Error handling |
| Missing task | 404 responses |
**Test Coverage:**
- ✅ Task lifecycle (create, queue, run, complete)
- ✅ Priority scheduling
- ✅ Retry logic with exponential backoff
- ✅ Timeout handling
- ✅ Concurrent processing (no race conditions)
- ✅ Error handling
- ✅ Real-time statistics
### Manual Testing
```bash
# Create a task
curl -X POST http://localhost:8080/api/tasks \
-H "Content-Type: application/json" \
-d '{
"name": "Test Email",
"type": "send_email",
"priority": 5,
"payload": {
"to": "test@example.com",
"subject": "Test",
"body": "This is a test"
}
}'
# Get task status
curl http://localhost:8080/api/tasks/{task_id}
# Get task history
curl http://localhost:8080/api/tasks/{task_id}/history
# View statistics
curl http://localhost:8080/api/stats
```
---
## 📡 API Reference
### Create Task
**POST** `/api/tasks`
```json
{
"name": "Send Welcome Email",
"type": "send_email",
"priority": 5,
"payload": {
"to": "user@example.com",
"subject": "Welcome!",
"body": "Thanks for signing up."
}
}
```
**Response:**
```json
{
"id": "uuid",
"name": "Send Welcome Email",
"type": "send_email",
"status": "queued",
"priority": 5,
"retry_count": 0,
"max_retries": 3,
"created_at": "2025-12-06T10:00:00Z"
}
```
### Get Task
**GET** `/api/tasks/:id`
**Response:**
```json
{
"id": "uuid",
"status": "succeeded",
"retry_count": 1,
"started_at": "2025-12-06T10:00:05Z",
"completed_at": "2025-12-06T10:00:15Z"
}
```
### Get Task History
**GET** `/api/tasks/:id/history`
**Response:**
```json
[
{
"event_type": "task_queued",
"status": "queued",
"created_at": "2025-12-06T10:00:00Z"
},
{
"event_type": "task_started",
"status": "running",
"worker_id": "worker-123",
"created_at": "2025-12-06T10:00:05Z"
},
{
"event_type": "task_succeeded",
"status": "succeeded",
"created_at": "2025-12-06T10:00:15Z"
}
]
```
### Get Statistics
**GET** `/api/stats`
**Response:**
```json
{
"total_tasks": 1000,
"queued_tasks": 10,
"running_tasks": 5,
"succeeded_tasks": 950,
"failed_tasks": 35,
"avg_retry_count": 0.45,
"tasks_with_retries": 300
}
```
### Health Check
**GET** `/health`
**Response:**
```json
{
"status": "healthy",
"database": "connected"
}
```
---
## ⚙️ Configuration
### Environment Variables
| Variable | Default | Description |
|----------|---------|-------------|
| `DB_HOST` | `localhost` | PostgreSQL host |
| `DB_PORT` | `5432` | PostgreSQL port |
| `DB_USERNAME` | `admin` | Database user |
| `DB_PASSWORD` | `admin` | Database password |
| `DB_DATABASE` | `tasks` | Database name |
| `SERVER_PORT` | `8080` | API server port |
| `WORKER_CONCURRENCY` | `5` | Worker pool size |
| `WORKER_POLL_INTERVAL` | `1` | Poll interval (seconds) |
| `WORKER_TIMEOUT` | `30` | Task timeout (seconds) |
### Docker Compose
Edit `docker-compose.yml` to adjust configuration:
```yaml
services:
worker:
environment:
WORKER_CONCURRENCY: "10"
WORKER_POLL_INTERVAL: "2"
```
### Kubernetes
Edit `k8s/manifests/worker-deployment.yaml`:
```yaml
spec:
replicas: 5 # Number of worker pods
template:
spec:
containers:
- name: worker
env:
- name: WORKER_CONCURRENCY
value: "10"
```
Configuration files are organized in `k8s/`:
- `manifests/` - Kubernetes YAML files (deployments, services, configs)
- `scripts/` - Deployment automation scripts
---
## 🔧 Development
### Project Structure
```
.
├── cmd/
│ ├── server/ # API server entry point
│ └── worker/ # Worker entry point
│
├── internal/
│ ├── api/ # HTTP handlers and routes
│ ├── config/ # Configuration
│ ├── models/ # Domain models (Task, History)
│ ├── storage/ # Data access layer
│ │ └── postgres/ # PostgreSQL implementation
│ └── worker/ # Worker pool and task handlers
│ ├── worker.go # Dispatcher + worker pool
│ ├── registry.go # Handler registration
│ └── handlers/ # Task type implementations
│
├── db/
│ └── migrations/ # SQL schema migrations
│
├── k8s/ # Kubernetes deployment
│ ├── manifests/ # YAML files (deployments, services, configs)
│ ├── scripts/ # Deployment automation scripts
│ └── README.md # Kubernetes documentation
│
├── tests/ # Integration tests
├── web/ # Dashboard UI
│ ├── static/ # CSS, JavaScript
│ └── templates/ # HTML templates
│
├── docker-compose.yml
└── Makefile
```
Please review these documents before contributing:
- [CONTRIBUTING.md](CONTRIBUTING.md) — guidelines, development setup, and workflow
- [CODE_OF_CONDUCT.md](CODE_OF_CONDUCT.md) — community standards and enforcement
- [SECURITY.md](SECURITY.md) — how to report vulnerabilities
### Makefile Commands
make fmt # Format Go code
make lint # Run linters (golangci-lint)
```bash
# Quick Start
make help # Show all commands
make docker-up # Start Docker Compose
make run-k8s # Deploy to Kubernetes (one command!)
make test-integration # Run integration tests
# Build
make build # Build server and worker binaries
make docker-build # Build Docker images
# Testing
make test-integration-docker # Test Docker Compose
make test-integration-k8s # Test Kubernetes
make check-tools # Verify prerequisites
# Kubernetes
make k8s-down # Stop Kubernetes
# Database
make migrate-up # Run migrations
make migrate-down # Rollback last migration
# Cleanup
make clean # Remove build artifacts
```
---
## 📊 Monitoring
### Dashboard
Access the real-time dashboard at: **http://localhost:8080/**
Features:
- Live task statistics (updated via Server-Sent Events)
- Success rate visualization
- Retry metrics
- Auto-refresh every 5 seconds
### Logs
**Docker Compose:**
```bash
docker-compose logs -f server
docker-compose logs -f worker
```
**Kubernetes:**
```bash
kubectl logs -f deployment/task-queue-worker -n task-queue
kubectl logs -f deployment/task-queue-server -n task-queue
# View all pods
kubectl get pods -n task-queue
# Describe a specific pod
kubectl describe pod -n task-queue
```
---
## 🏗️ Why This Design?
### Key Architectural Decisions
1. **PostgreSQL**
- Simplicity: One database for everything
- Durability: No separate persistence layer needed
- Rich queries: Task history, statistics, complex filtering
2. **Dispatcher Pattern**
- Prevents database connection storm
- Reduces DB load by 98% (1 query vs 50 queries/second)
- Buffered channel provides natural backpressure
3. **Exponential Backoff + Jitter**
- Prevents retry storms
- Gradually increases delay for persistent failures
- Jitter spreads load over time
4. **Lock Expiration**
- Auto-recovery from worker crashes
- No manual intervention needed
- Tasks never stuck permanently
5. **Context Cancellation**
- Graceful shutdown
- In-flight tasks complete properly
- Safe for Kubernetes rolling updates
6. **Kubernetes with kind & Bitnami PostgreSQL**
- Production-ready: Bitnami Helm chart with security updates
- Easy local testing: kind runs K8s in Docker containers
- Horizontal scaling: Server (2 replicas) + Worker (3 replicas)
- Self-healing: Automatic pod restart on failures
- Organized structure: Separate manifests/ and scripts/ directories
---
## 🤝 Contributing
Contributions are welcome! Here's how you can help:
1. **Fork** the repository
2. **Create** a feature branch (`git checkout -b feature/amazing-feature`)
3. **Commit** your changes (`git commit -m 'Add amazing feature'`)
4. **Push** to the branch (`git push origin feature/amazing-feature`)
5. **Open** a Pull Request
### Development Setup
```bash
make setup # Install dependencies
make test-integration # Run tests
```
## 📄 License
This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
## 👤 Author
**Amit Basuri**
- GitHub: [@amitbasuri](https://github.com/amitbasuri)
## 🌟 Show Your Support
Give a ⭐️ if this project helped you!
---
**Built with ❤️ using Go, PostgreSQL, and Kubernetes**