{"id":24716782,"url":"https://github.com/bayajidalam/r-queue","last_synced_at":"2025-03-22T09:21:15.873Z","repository":{"id":271916429,"uuid":"913482457","full_name":"BayajidAlam/r-queue","owner":"BayajidAlam","description":" A distributed job queue system using Redis, designed to manage tasks across multiple worker nodes in a cloud environment. It supports job prioritization, failure handling, retries, result storage, and monitoring.","archived":false,"fork":false,"pushed_at":"2025-01-16T21:08:51.000Z","size":301,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-01-27T09:14:17.824Z","etag":null,"topics":["ansible","aws","node-multi-threading","pulumi","redis-cluster","redis-queue"],"latest_commit_sha":null,"homepage":"","language":"TypeScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/BayajidAlam.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2025-01-07T19:13:23.000Z","updated_at":"2025-01-16T21:08:52.000Z","dependencies_parsed_at":"2025-01-10T18:49:37.949Z","dependency_job_id":null,"html_url":"https://github.com/BayajidAlam/r-queue","commit_stats":null,"previous_names":["bayajidalam/r-queue"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/BayajidAlam%2Fr-queue","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/BayajidAlam%2Fr-queue/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/BayajidAlam%2Fr-queue/releases","m
anifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/BayajidAlam%2Fr-queue/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/BayajidAlam","download_url":"https://codeload.github.com/BayajidAlam/r-queue/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":244932957,"owners_count":20534265,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ansible","aws","node-multi-threading","pulumi","redis-cluster","redis-queue"],"created_at":"2025-01-27T09:14:19.647Z","updated_at":"2025-03-22T09:21:15.835Z","avatar_url":"https://github.com/BayajidAlam.png","language":"TypeScript","readme":"![image](https://github.com/user-attachments/assets/017cba47-9cb6-4489-805c-8abbe5fcca9e)\n# R-queue - A Distributed Job Queue System with Redis\n\nA scalable, fault-tolerant distributed job queue system using Redis to manage tasks across worker nodes with job tracking, retries, prioritization, and a dashboard for monitoring and health checks.\n\n\n- [Architecture Overview](#architecture-overview)\n- [Features](#features)\n- [Getting Started](#getting-started)\n- [Folder Structure](#folder-structure)\n- [API Endpoints](#api-endpoints)\n- [Deployments](#deployments)\n- [Test The Application](#test-the-application)\n  \n### Objective\n\nThe objective is to design and implement a distributed job queue system using Redis that can:\n\n- Distribute computational tasks dynamically across multiple worker nodes.\n-  Track job statuses (`pending`, `processing`, `completed`, `failed`) and handle failures 
through retries or alternative mechanisms.\n-  Provide a user-friendly dashboard for real-time monitoring of job statuses and worker health.\n-  Support horizontal scaling of worker nodes and job prioritization.\n\n### Core Challenges\n\n-  Ensuring fault-tolerance and graceful recovery from worker or network failures.\n-  Efficiently managing a distributed queue to handle job priorities and dependencies.\n-  Implementing a robust retry mechanism for failed jobs and a dead-letter queue for irrecoverable tasks.\n-  Storing and retrieving job results in a scalable manner.\n-  Handling dynamic workload variations and enabling worker auto-scaling based on queue length.\n\n### Additional Features (Bonus Challenges)\n\n-  Implementing job dependencies where certain jobs can only start after others are completed.\n-  Tracking real-time job progress for better monitoring and debugging.\n\n## Architecture Overview\n\n![image](https://github.com/user-attachments/assets/06b1320d-c7cd-4cca-a851-6ff10c636c31)\n\n**1. Frontend:**  \nA React.js application with an intuitive interface for monitoring and managing the system, providing:  \n- Worker health and active worker status.  \n- Queue length and benchmarking of jobs.  \n- Total jobs (processing, completed, failed).  \n- Detailed view of all jobs (pending, processing, canceled, failed, completed) with type, status, progress, and priorities, including dynamic pagination and filtering by parameters.  \n- Input modal for simulating jobs.\n\n**2. Backend:**\n\n\n **3. 
Cloud Infrastructure**: \n  \n  \n- **Networking**: \n  - AWS VPC for managing network configurations.\n  - AWS EC2 for hosting the application instances.\n  - AWS Security Groups for managing access control to EC2 instances.\n  - AWS NAT Gateway for enabling internet access from private subnets.\n\n- **DevOps**: \n  - Pulumi as IaC to manage AWS resources and automate deployments.\n\n## Features\n\n- Priority-based job scheduling\n- Automatic worker scaling (1-10 workers)\n- Job retry with exponential backoff\n- Dead letter queue for failed jobs\n- Real-time job progress tracking\n- Worker health monitoring\n- Comprehensive metrics collection\n- Circuit breaker pattern implementation\n- Job dependency management\n\n\n## Getting Started\nFollow these steps to run the application locally.\n\n**1. Clone the Repository**\n\n```bash\n  git clone https://github.com/BayajidAlam/r-queue\n  cd r-queue\n\n```\n\n**2. Install Dependencies**\n\n```bash\n  cd client\n  yarn install\n\n```\n\n**3. Set Up Environment Variables**\n\nCreate a **.env** file in the **/client** directory and add this: \n```bash\nVITE_PUBLIC_API_URL=backend url\n\n```\n\n**4. Run the server**\n\n```bash\n  yarn dev\n\n```\n\n\n### To run the backend, follow these steps:\n**1. Install Dependencies**\n\n```bash\n  cd server\n  yarn install\n\n```\n**2. Set Up Environment Variables**\n\nCreate a **.env** file in the **/server** directory and add this:\n\n```bash\nREDIS_HOST=localhost\nPORT=5000\n```\n\n**3. Navigate to the docker-compose folder and start all containers:**\n\n```bash\ncd docker-compose\ndocker-compose up -d\n```\nYou will see output like this:\n![image](https://github.com/user-attachments/assets/318d4df9-5119-418d-8afd-efa04ad92e90)\n\n\n\n**4. 
Run the following command to create the cluster:**\n```bash\nredis-cli --cluster create \\\n  \u003cnode-1 IP\u003e:6379 \u003cnode-2 IP\u003e:6379 \u003cnode-3 IP\u003e:6379 \\\n  \u003cnode-4 IP\u003e:6379 \u003cnode-5 IP\u003e:6379 \u003cnode-6 IP\u003e:6379 \\\n  --cluster-replicas 1\n\n```\nYou will see something like this:\n![image](https://github.com/user-attachments/assets/279a9e59-ebe2-4231-b81d-c8ffe3fbda8d)\n\n\n**5. Verify the Cluster**\n```bash\nredis-cli -c cluster nodes\n\n```\n\n**6. Now run the server and test your application:**\n```bash\nyarn dev\n```\n\n\nYou will see something like this:\n![image](https://github.com/user-attachments/assets/8c8b0df6-147d-4d8a-b09f-2c5b111f236d)\n\n\n\n## Folder Structure\n\n- `/client`: **Frontend**\n  - `/public`: Static files and assets.\n  - `/src`: Application code.\n  - `.env`: Frontend environment variables.\n  - `package.json`\n- `/server`: **Backend**\n    - `/src`: Backend source code.\n        - `bulkJobSimulation.ts`: Script for creating jobs in bulk.\n    - `docker-compose`: Creates a Redis cluster locally in Docker.\n    - `.env`: Backend environment variables.\n    - `package.json`\n\n\n- `/IaC`: **Infrastructure**\n    - `/pulumi`:\n        - `index.ts`: Pulumi IaC file that manages the AWS resources (networking and compute) used to create the distributed Redis cluster.\n    - `ansible`: Ansible files that create and configure the frontend, backend, Redis setup, and Redis cluster.\n        \n\n\n\n\n## API Endpoints\nThe application exposes the following APIs:\n\n### Root URL (local environment)\n\n```\n  http://localhost:5000/api\n\n```\n### Check health (GET)\nAPI Endpoint:\n```\n    http://localhost:5000/api/health\n```\n\n#### The response looks like this\n```\n{\n    \"status\": \"unhealthy\",\n    \"details\": {\n        \"redisConnected\": true,\n        \"activeWorkers\": 0,\n        \"queueLength\": 0,\n        \"processingJobs\": 0,\n        \"metrics\": {\n            \"avgProcessingTime\": 0,\n        
    \"errorRate\": 0,\n            \"throughput\": 0\n        }\n    },\n    \"timestamp\": \"2025-01-10T12:20:37.856Z\",\n    \"version\": \"1.0\"\n}\n```\n\n\n\n### Add new job (POST)\nAPI Endpoint:\n```\n  http://localhost:5000/api/jobs\n```\n\n### Examples\n\nTo add a job, your request body should look like the following:\n\n#### Request body\n\n```\n{\n    \"type\": \"email\",\n    \"data\": {\n        \"Hello\": \"Hello\",\n        \"world\": \"world\"\n    },\n    \"priority\": 3,\n\n    \"dependencies\": [\n        \"a3342ec2-fcae-4e8d-8df8-8f59a2c7d58c\"\n    ]\n}\n```\n\n#### The response looks like this\n```\n{\n    \"acknowledged\": true,\n    \"insertedId\": \"675002aea8b348ab91f524d0\"\n}\n```\n\n## Prerequisites\n\nBefore deploying the application, ensure you have the following:\n\n- An **AWS account** with EC2 setup permissions.\n- **Docker** installed on your local machine for building containers.\n- **AWS CLI** installed and configured with your credentials.\n- **Node.js** (version 18 or above), **npm**, and **yarn** installed for both the frontend and backend applications.\n- **Pulumi** installed for managing AWS infrastructure as code.\n- **TypeScript** installed on your computer.\n\n\n## Deployments\n**1. Clone the Repository**\n\n```bash\n  git clone https://github.com/BayajidAlam/r-queue\n  cd r-queue/IaC/pulumi\n\n```\n\n**2. Configure AWS CLI**\n\nProvide your Access Key and Secret Key:\n![image](https://github.com/user-attachments/assets/d8c35819-7182-4629-a7f4-54010ba175d2)\n\n**3. Create Key Pair**\n\nCreate a new key pair for our instances using the following command:\n\n```bash\naws ec2 create-key-pair --key-name MyKeyPair --query 'KeyMaterial' --output text \u003e MyKeyPair.pem\n```\n\n**4. 
Deploy the infrastructure**\n\n```bash\npulumi up\n```\n\nYou will see output like this:\n![Screenshot from 2025-01-12 01-10-56](https://github.com/user-attachments/assets/adeaa6a4-93f7-4280-bdb3-9a9d35b59bbc)\n\nIn the AWS console, the VPC resource map will look like this:\n![image](https://github.com/user-attachments/assets/944bbc00-397e-492e-bfa7-1ba071a653bf)\n\nThe EC2 dashboard will look like this:\n![image](https://github.com/user-attachments/assets/acb9855a-e821-4328-973b-754ef9f138a9)\n\n**5. Run the Ansible Playbooks**\nFirst navigate to the ansible directory in pulumi and run the following commands:\n```bash\nansible-playbook -e @vars.yml playbooks/redis-setup.yml\nansible-playbook -e @vars.yml playbooks/redis-cluster.yml\nansible-playbook -e @vars.yml playbooks/backend-setup.yml\nansible-playbook -e @vars.yml playbooks/frontend-setup.yml\n```\n\nNow access the frontend at \u003cEC2 public IP\u003e:5173 and you will see something like this:\n![Screenshot from 2025-01-17 02-24-00](https://github.com/user-attachments/assets/8fcbd8ca-b8f7-4d11-815c-642762564313)\n\n\n### Test The Application\n\n**Create a job:** Open the **Add new job** modal and provide the necessary input:\n\n**Job type:** The type of job you want to simulate\n\n**Processing Time:** How long the job will take to process\n\n**Priority:** Priority of the job\n\n**Job Data (JSON):** The data passed with the job\n\n**Dependencies (comma-separated job IDs):** If the job depends on another job, add that job's ID here from the Recent Activity dashboard\n\n\n**Simulate Failure:** Check this to simulate a failure\n \n![Screenshot from 2025-01-17 02-28-07](https://github.com/user-attachments/assets/2ddddd93-0b9e-4246-b849-0d5e347dc650)\n\nNow click the **Add new job** button.\n\n![image](https://github.com/user-attachments/assets/4511e78e-1ee0-4639-9ca3-1c738c16e971)\n\nSummary:\n- One active worker\n- Processing = 1\n- One item is showing in Recent Job\n- After the job processing is done, completed = 1.\n\nUsing this job ID you can create a new job with 
dependencies. By selecting Simulate Failure, you can create a job that will fail at the end.\n\n\n**Test with bulk input:**\nFirst SSH into your backend EC2 instance, navigate to /opt/r-queue/server/src, and run the command:\n```bash\nsimulate 20 2 10\n```\nThe arguments follow this pattern:\n```bash\nsimulate totalJobs duration batchSize\n```\nYou will see this in the terminal:\n![Screenshot from 2025-01-17 02-46-33](https://github.com/user-attachments/assets/0f5c44c9-c55f-48ee-b568-4354a18b6987)\n\nThe dashboard will look like this:\n![Screenshot from 2025-01-17 02-46-43](https://github.com/user-attachments/assets/21ca6e15-cb09-4928-86f0-ba41d89bca80)\n\nWhen all processing is done, it will look like this:\n![image](https://github.com/user-attachments/assets/3efc0f00-0eaa-4412-80b9-62d75ecf5fe2)\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbayajidalam%2Fr-queue","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fbayajidalam%2Fr-queue","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbayajidalam%2Fr-queue/lists"}