{"id":13413422,"url":"https://github.com/hyperonym/ratus","last_synced_at":"2025-03-14T19:32:14.123Z","repository":{"id":57995709,"uuid":"528791750","full_name":"hyperonym/ratus","owner":"hyperonym","description":"Ratus is a RESTful asynchronous task queue server. It translated concepts of distributed task queues into a set of resources that conform to REST principles and provides a consistent HTTP API for various backends.","archived":false,"fork":false,"pushed_at":"2024-10-16T00:23:17.000Z","size":1005,"stargazers_count":109,"open_issues_count":4,"forks_count":7,"subscribers_count":4,"default_branch":"master","last_synced_at":"2024-10-17T12:04:37.416Z","etag":null,"topics":["background-jobs","distributed-systems","go","golang","mongodb","priority-queue","restful-api","swagger","task","task-queue","task-scheduler"],"latest_commit_sha":null,"homepage":"https://hyperonym.github.io/ratus/","language":"Go","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/hyperonym.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":".github/FUNDING.yml","license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null},"funding":{"github":["hyperonym"]}},"created_at":"2022-08-25T10:00:29.000Z","updated_at":"2024-10-16T00:23:19.000Z","dependencies_parsed_at":"2023-11-07T07:23:55.957Z","dependency_job_id":"32f612a6-2a7a-42c8-8e01-3d3536c68156","html_url":"https://github.com/hyperonym/ratus","commit_stats":{"total_commits":264,"total_committers":4,"mean_commits":66.0,"dds":0.07954545454545459,"last_synced_commit":"dc1063e3e3198d2fe16e7a558762d4f866417aa3"},"previous_names":[],"tags_count":18,"template":false,
"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hyperonym%2Fratus","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hyperonym%2Fratus/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hyperonym%2Fratus/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hyperonym%2Fratus/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/hyperonym","download_url":"https://codeload.github.com/hyperonym/ratus/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":243635427,"owners_count":20322938,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["background-jobs","distributed-systems","go","golang","mongodb","priority-queue","restful-api","swagger","task","task-queue","task-scheduler"],"created_at":"2024-07-30T20:01:39.993Z","updated_at":"2025-03-14T19:32:13.680Z","avatar_url":"https://github.com/hyperonym.png","language":"Go","readme":"# Ratus\n\n[![Go](https://github.com/hyperonym/ratus/actions/workflows/go.yml/badge.svg)](https://github.com/hyperonym/ratus/actions/workflows/go.yml)\n[![codecov](https://codecov.io/gh/hyperonym/ratus/branch/master/graph/badge.svg?token=6HJKAQ9XR1)](https://codecov.io/gh/hyperonym/ratus)\n[![Go Reference](https://pkg.go.dev/badge/github.com/hyperonym/ratus.svg)](https://pkg.go.dev/github.com/hyperonym/ratus)\n[![Swagger 
Validator](https://img.shields.io/swagger/valid/3.0?specUrl=https%3A%2F%2Fraw.githubusercontent.com%2Fhyperonym%2Fratus%2Fmaster%2Fdocs%2Fswagger.json)](https://hyperonym.github.io/ratus/)\n[![Go Report Card](https://goreportcard.com/badge/github.com/hyperonym/ratus)](https://goreportcard.com/report/github.com/hyperonym/ratus)\n[![Status](https://img.shields.io/badge/status-beta-blue)](https://github.com/hyperonym/ratus)\n\nRatus is a RESTful asynchronous task queue server. It translates concepts of distributed task queues into a set of resources that conform to REST principles and provides a consistent [HTTP API](https://hyperonym.github.io/ratus/) for various backends.\n\nThe key features of Ratus are:\n\n* Self-contained binary with fast in-memory storage.\n* Support for multiple embedded or external storage engines.\n* Guaranteed at-least-once execution of tasks.\n* Unified model for prioritized and time-based scheduling.\n* Task-level timeout control with automatic recovery.\n* Language-agnostic RESTful API with built-in Swagger UI.\n* Load balancing across a dynamic number of consumers.\n* Horizontal scaling through replication and partitioning.\n* Native support for Prometheus and Kubernetes.\n\n![Terminal screenshot](https://github.com/hyperonym/ratus/blob/master/docs/assets/terminal.png?raw=true)\n\n## Quick Start\n\n### Installation\n\nRatus offers a variety of installation options:\n\n* Docker images are available on [Docker Hub](https://hub.docker.com/r/hyperonym/ratus/tags) and [GitHub Packages](https://github.com/orgs/hyperonym/packages?repo_name=ratus).\n* Kubernetes and Docker Compose examples can be found in the [deployments](https://github.com/hyperonym/ratus/tree/master/deployments) directory.\n* Pre-built binaries for all major platforms are available on the [GitHub releases](https://github.com/hyperonym/ratus/releases) page.\n* Build from source with `go install github.com/hyperonym/ratus/cmd/ratus@latest`.\n\nRunning Ratus from the command line 
is as simple as typing:\n\n```bash\n$ ratus\n```\n\nThe above command will start an ephemeral Ratus instance using the default in-memory storage engine `memdb` and listen on the default HTTP port of **80**.\n\nTo use another port and enable on-disk snapshot for persistence, start Ratus with:\n\n```bash\n$ ratus --port 8000 --engine memdb --memdb-snapshot-path ratus.db\n```\n\nDepending on the [storage engine](https://github.com/hyperonym/ratus/blob/master/README.md#engines) you choose, you may also need to deploy the corresponding database or broker. Using the `mongodb` engine as an example, assuming the database is already running locally, then start Ratus with:\n\n```bash\n$ ratus --port 8000 --engine mongodb --mongodb-uri mongodb://127.0.0.1:27017\n```\n\n### Basic Usage\n\nConcepts introduced by Ratus will be **bolded** below, see [Concepts](https://github.com/hyperonym/ratus/blob/master/README.md#concepts) (*a.k.a cheat sheet*) to learn more.\n\n#### cURL\n\nA producer creates a new **task** and pushes it to the `example` **topic**:\n```bash\n$ curl -X POST -d '{\"payload\": \"hello world\"}' \"http://127.0.0.1:8000/v1/topics/example/tasks/1\"\n```\n\u003cdetails\u003e\n\u003csummary\u003eExample response\u003c/summary\u003e\n\n```json\n{\n\t\"created\": 1,\n\t\"updated\": 0\n}\n```\n\u003c/details\u003e\n\nA consumer can then make a **promise** to claim and execute the next task in the `example` topic:\n\n```bash\n$ curl -X POST \"http://127.0.0.1:8000/v1/topics/example/promises?timeout=30s\"\n```\n\u003cdetails\u003e\n\u003csummary\u003eExample response\u003c/summary\u003e\n\n```json\n{\n\t\"_id\": \"1\",\n\t\"topic\": \"example\",\n\t\"state\": 1,\n\t\"nonce\": \"e4SN6Si1nOnE53ou\",\n\t\"produced\": \"2022-07-29T20:00:00.0Z\",\n\t\"scheduled\": \"2022-07-29T20:00:00.0Z\",\n\t\"consumed\": \"2022-07-29T20:00:10.0Z\",\n\t\"deadline\": \"2022-07-29T20:00:40.0Z\",\n\t\"payload\": \"hello world\"\n}\n```\n\u003c/details\u003e\n\nAfter executing the task, 
remember to notify Ratus that the task is `completed` using a **commit**:\n\n```bash\n$ curl -X PATCH \"http://127.0.0.1:8000/v1/topics/example/tasks/1\"\n```\n\u003cdetails\u003e\n\u003csummary\u003eExample response\u003c/summary\u003e\n\n```json\n{\n\t\"_id\": \"1\",\n\t\"topic\": \"example\",\n\t\"state\": 2,\n\t\"nonce\": \"\",\n\t\"produced\": \"2022-07-29T20:00:00.0Z\",\n\t\"scheduled\": \"2022-07-29T20:00:00.0Z\",\n\t\"consumed\": \"2022-07-29T20:00:10.0Z\",\n\t\"deadline\": \"2022-07-29T20:00:40.0Z\",\n\t\"payload\": \"hello world\"\n}\n```\n\u003c/details\u003e\n\nIf a commit is not received before the promised deadline, the state of the task will be set back to `pending`, which in turn allows consumers to try to execute it again.\n\n#### Go Client\n\nRatus comes with a [Go client library](https://pkg.go.dev/github.com/hyperonym/ratus) that not only encapsulates all API calls, but also provides idiomatic poll-execute-commit workflows like [Client.Poll](https://pkg.go.dev/github.com/hyperonym/ratus#Client.Poll) and [Client.Subscribe](https://pkg.go.dev/github.com/hyperonym/ratus#Client.Subscribe). The [examples](https://github.com/hyperonym/ratus/tree/master/examples) directory contains ready-to-run examples for using the library:\n\n* The [hello world](https://github.com/hyperonym/ratus/blob/master/examples/hello-world/main.go) example demonstrates the basic usage of the client library. \n* The [crawl frontier](https://github.com/hyperonym/ratus/blob/master/examples/crawl-frontier/main.go) example implements a simple [URL frontier](https://en.wikipedia.org/wiki/Crawl_frontier) for distributed web crawlers. 
It uses advanced features like concurrent subscribers and time-based task scheduling.\n\n## Concepts\n\n### Data Model\n\n* **[Task](https://pkg.go.dev/github.com/hyperonym/ratus#Task)** references an idempotent unit of work that should be executed asynchronously.\n* **[Topic](https://pkg.go.dev/github.com/hyperonym/ratus#Topic)** refers to an ordered subset of tasks with the same topic name property.\n* **[Promise](https://pkg.go.dev/github.com/hyperonym/ratus#Promise)** represents a claim on the ownership of an active task.\n* **[Commit](https://pkg.go.dev/github.com/hyperonym/ratus#Commit)** contains a set of updates to be applied to a task.\n\n### Workflow\n\n* **Producer** client pushes **tasks** with their desired date-of-execution (scheduled times) to a **topic**.\n* **Consumer** client makes a **promise** to execute a **task** polled from a **topic** and acknowledges with a **commit** upon completion.\n\n### Topology\n\n* Both **producer** and **consumer** clients can have multiple instances running simultaneously.\n* **Consumer** instances can be added dynamically to increase throughput, and **tasks** will be naturally load balanced among consumers.\n* **Consumer** instances can be removed (or crash) at any time without risking the loss of the task being executed: a **task** that has not received a **commit** after the **promised** deadline will be picked up and executed again by other consumers.\n\n### Task States\n\n* **pending** (0): The task is ready to be executed or is waiting to be executed in the future.\n* **active** (1): The task is being processed by a consumer. Active tasks that have timed out will be automatically reset to the `pending` state. Consumer code should handle failure and set the state to `pending` to retry later if necessary.\n* **completed** (2): The task has completed its execution. 
If the storage engine implementation supports TTL, completed tasks will be automatically deleted after the retention period has expired.\n* **archived** (3): The task is stored as an archive. Archived tasks will never be deleted due to expiration.\n\n### Behavior\n\n* **Task IDs across all topics share the same namespace** ([ADR](https://github.com/hyperonym/ratus/blob/master/docs/ARCHITECTURAL_DECISION_RECORDS.md#task-ids-should-be-unique-across-all-topics)). Topics are simply subsets generated based on the `topic` properties of the tasks, so topics do not need to be created explicitly.\n* Ratus is a task scheduler when consumers can keep up with the task generation speed, or a priority queue when consumers cannot keep up with the task generation speed.\n* Tasks will not be executed until the scheduled time arrives. After the scheduled time, excessive tasks will be executed in the order of the scheduled time.\n\n## Engines\n\nRatus provides a consistent API for various backends, allowing users to choose a specific engine based on their needs without having to modify client-side code.\n\nTo use a specific engine, set the `--engine` flag or `ENGINE` environment variable to one of the following names:\n\n| Name | Persistence | Replication | Partitioning | Expiration |\n| --- | :---: | :---: | :---: | :---: |\n| `memdb` | ○/● | ○ | ○ | ● |\n| `mongodb` | ● | ● | ● | ● |\n\n### MemDB\n\n[![MemDB](https://github.com/hyperonym/ratus/actions/workflows/memdb.yml/badge.svg)](https://github.com/hyperonym/ratus/actions/workflows/memdb.yml)\n\nMemDB is the default storage engine for Ratus. It is implemented on top of [go-memdb](https://github.com/hashicorp/go-memdb), which is built on immutable radix trees. MemDB is suitable for development and **production environments where durability is not critical**.\n\n#### Persistence\n\nThe MemDB storage engine is ephemeral by default, but it also provides **snapshot-based persistence** options. 
When the `--memdb-snapshot-path` flag or `MEMDB_SNAPSHOT_PATH` environment variable is set to a non-empty file path, Ratus writes on-disk snapshots at an interval specified by `MEMDB_SNAPSHOT_INTERVAL`.\n\nMemDB does not write [Append-Only Files](https://redis.io/docs/manual/persistence/#aof-advantages) (AOF), so if Ratus stops without a graceful shutdown for any reason, you should be prepared to lose the most recent minutes of data. If durability is critical to your workflow, switch to an external storage engine like `mongodb`.\n\n#### Implementation Details\n\n* **List operations are relatively expensive** as they require scanning the entire database or index until the required number of results is collected. Fortunately, these operations are not used in most scenarios.\n* Snapshotting is performed along with the periodic background jobs when appropriate. **Writing snapshot files may delay the execution of background jobs** if the amount of data is large.\n* Since the resolution of the scheduled time in MemDB is at the millisecond level and is affected by the instance's own clock, **the order in which consumers receive tasks is not strictly guaranteed**.\n* TTL cannot be disabled for `completed` tasks. To preserve a task forever, set it to the `archived` state.\n\n### MongoDB\n\n[![MongoDB](https://github.com/hyperonym/ratus/actions/workflows/mongodb.yml/badge.svg)](https://github.com/hyperonym/ratus/actions/workflows/mongodb.yml)\n\nRatus works best with **MongoDB version ~4.4**. MongoDB 5.0+ is also supported but requires additional considerations; see [Implementation Details](https://github.com/hyperonym/ratus/blob/master/README.md#implementation-details-1) to learn more.\n\n\u003e 💭 **TL;DR** set `MONGODB_DISABLE_ATOMIC_POLL=true` when using Ratus with MongoDB 5.0+.\n\n#### Replication\n\nWhen using the MongoDB storage engine, the Ratus instance itself is stateless. 
For high availability, **start multiple instances of Ratus and connect them to the same MongoDB replica set**.\n\nAll Ratus instances should run behind load balancers configured with health checks. **Producer and consumer clients should connect to the load balancer**, not directly to the instances.\n\n#### Partitioning\n\nHorizontal scaling can be achieved by sharding the task collection. However, with the help of the TTL mechanism, **partitioning is not necessary in most cases**. The best performance and the strongest atomicity can only be obtained without sharding.\n\nIf the amount of data exceeds the capacity of a single node or replica set, choose from the following sharding options:\n\n* If there is a large number of topics, **use a hashed index on the `topic` field as the shard key**. This also enables the best polling performance on a sharded cluster.\n* If there is a huge number of tasks in a few topics, **use a hashed index on the `_id` field as the shard key**. This also results in a more balanced data distribution.\n\n#### Implementation Details\n\n* When using the MongoDB storage engine, **tasks across all topics are stored in the same collection**.\n* Task is the only concrete data model in the MongoDB storage engine, while topics and promises are just conceptual entities for enforcing the RESTful design principles.\n* Since the resolution of the scheduled time in MongoDB is at the millisecond level and is affected by the instance's own clock, **the order in which consumers receive tasks is not strictly guaranteed**.\n* TTL cannot be disabled for `completed` tasks. To preserve a task forever, set it to the `archived` state.\n* It is not recommended to upsert tasks on sharded collections using the `topic` field as the shard key. 
Due to MongoDB's own [limitations](https://www.mongodb.com/docs/v4.4/reference/method/db.collection.replaceOne/#shard-key-modification), atomic operations cannot be used in this case, and only a fallback scheme equivalent to delete before insert can be used, so atomicity and performance cannot be guaranteed. This problem can be circumvented by using simple inserts in conjunction with fine-tuned TTL settings.\n* By default, polling is implemented through `findAndModify`. In the event of a conflict, MongoDB's native [optimistic concurrency control](https://www.mongodb.com/docs/v4.4/faq/concurrency/#how-granular-are-locks-in-mongodb-) (OCC) will transparently retry the operation. But in MongoDB 5.0 and above, the retry will report a `WriteConflict` error in the database server's log (although the operation is still successful from the client's perspective). You can choose to ignore this error, or circumvent the problem by **setting `MONGODB_DISABLE_ATOMIC_POLL=true` when using MongoDB 5.0+**. 
This option makes Ratus stop using `findAndModify` for polling and rely instead on the application-level OCC layer to ensure atomicity.\n\n#### Index Models\n\nThe following indexes will be created on startup, unless `MONGODB_DISABLE_INDEX_CREATION` is set to `true`:\n\n| Key Patterns | Partial Filter Expression | TTL |\n| --- | --- | --- |\n| `{\"topic\": \"hashed\"}` | - | - |\n| `{\"topic\": 1, \"scheduled\": 1}` | `{\"state\": 0}` | - |\n| `{\"deadline\": 1}` | `{\"state\": 1}` | - |\n| `{\"topic\": 1}` | `{\"state\": 1}` | - |\n| `{\"consumed\": 1}` | `{\"state\": 2}` | `MONGODB_RETENTION_PERIOD` |\n\n## Observability\n\n### Metrics and Labels\n\nRatus exposes the following [Prometheus](https://prometheus.io) metrics on the `/metrics` endpoint:\n\n| Name | Type | Labels |\n| --- | --- | --- |\n| **ratus_request_duration_seconds** | histogram | `topic`, `method`, `endpoint`, `status_code` |\n| **ratus_chore_duration_seconds** | histogram | - |\n| **ratus_task_schedule_delay_seconds** | gauge | `topic`, `producer`, `consumer` |\n| **ratus_task_execution_duration_seconds** | gauge | `topic`, `producer`, `consumer` |\n| **ratus_task_produced_count_total** | counter | `topic`, `producer` |\n| **ratus_task_consumed_count_total** | counter | `topic`, `producer`, `consumer` |\n| **ratus_task_committed_count_total** | counter | `topic`, `producer`, `consumer` |\n\n### Liveness and Readiness\n\nRatus supports [liveness and readiness probes](https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/) via HTTP GET requests:\n\n* The `/livez` endpoint returns a status code of **200** if the instance is running.\n* The `/readyz` endpoint returns a status code of **200** if the instance is ready to accept traffic.\n\n## Caveats\n\n* 🚨 **Topic names and task IDs must not contain plus signs ('+') due to [gin-gonic/gin#2633](https://github.com/gin-gonic/gin/issues/2633).**\n* It is not recommended to use Ratus as the primary 
storage of tasks. Instead, consider storing the complete task record in a database, and **use a minimal descriptor as the payload for Ratus.**\n* Ratus is a simple and efficient alternative to task queues like [Celery](https://docs.celeryq.dev/). Consider using [RabbitMQ](https://www.rabbitmq.com/) or [Kafka](https://kafka.apache.org/) if you need high-throughput message passing without task management.\n\n## Frequently Asked Questions\n\nFor more details, see [Architectural Decision Records](https://github.com/hyperonym/ratus/blob/master/docs/ARCHITECTURAL_DECISION_RECORDS.md).\n\n### Why HTTP API?\n\n\u003e Asynchronous task queues are typically used for long-running background tasks, so the overhead of HTTP is not significant compared to the time spent by the tasks themselves. On the other hand, the HTTP-based RESTful API can be easily accessed by all languages without using dedicated client libraries.\n\n### How to poll from multiple topics?\n\n\u003e If the number of topics is limited and you don't care about the priority between them, you can choose to create multiple threads/goroutines to listen to them simultaneously. Alternatively, you can create a ***topic of topics*** to get the topic names in turn and then get the next task from the corresponding topic.\n\n## Roadmap\n\n- [x] Storage engine options\n\t- [x] MemDB\n\t\t- [x] Ephemeral\n\t\t- [x] Persistence with snapshots\n\t\t- [ ] Persistence with AOF\n\t- [x] MongoDB\n\t\t- [x] Standalone\n\t\t- [x] Replica set\n\t\t- [x] Sharded cluster\n\t- [ ] Redis\n\t\t- [ ] Standalone\n\t\t- [ ] Sentinel\n\t\t- [ ] Cluster\n\t- [ ] RDBMS\n\t\t- [ ] MySQL\n\t\t- [ ] PostgreSQL\n\t- [ ] Message broker\n\t\t- [ ] RabbitMQ\n\t\t- [ ] Amazon SQS\n- [ ] Multi-language documents\n\t- [x] English\n\t- [ ] Chinese\n\nSee the [open issues](https://github.com/hyperonym/ratus/issues) for a full list of proposed features.\n\n## Contributing\n\nThis project is open-source. 
If you have any ideas or questions, please feel free to reach out by creating an issue!\n\nContributions are greatly appreciated, please refer to [CONTRIBUTING.md](https://github.com/hyperonym/ratus/blob/master/CONTRIBUTING.md) for more information.\n\n## License\n\nRatus is available under the [Apache License 2.0](https://github.com/hyperonym/ratus/blob/master/LICENSE).\n\n---\n\n© 2022-2024 [Hyperonym](https://hyperonym.org)\n","funding_links":["https://github.com/sponsors/hyperonym"],"categories":["Messaging","消息"],"sub_categories":["Search and Analytic Databases","检索及分析资料库"],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fhyperonym%2Fratus","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fhyperonym%2Fratus","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fhyperonym%2Fratus/lists"}