{"id":13455009,"url":"https://github.com/nodeca/idoit","last_synced_at":"2025-07-17T19:35:31.991Z","repository":{"id":54518101,"uuid":"70803356","full_name":"nodeca/idoit","owner":"nodeca","description":"Redis-backed task queue engine with advanced task control and eventual consistency","archived":false,"fork":false,"pushed_at":"2023-06-19T03:04:01.000Z","size":216,"stargazers_count":77,"open_issues_count":2,"forks_count":9,"subscribers_count":4,"default_branch":"master","last_synced_at":"2025-04-12T02:38:56.813Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"JavaScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/nodeca.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":".github/FUNDING.yml","license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null},"funding":{"open_collective":"puzrin","patreon":"puzrin"}},"created_at":"2016-10-13T12:18:46.000Z","updated_at":"2024-11-14T07:20:18.000Z","dependencies_parsed_at":"2022-08-13T18:30:36.851Z","dependency_job_id":null,"html_url":"https://github.com/nodeca/idoit","commit_stats":{"total_commits":91,"total_committers":4,"mean_commits":22.75,"dds":0.3626373626373627,"last_synced_commit":"94e3bc5768c1a5963cdcc34a94102017411b47d9"},"previous_names":[],"tags_count":12,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nodeca%2Fidoit","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nodeca%2Fidoit/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nodeca%2Fidoit/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nodeca%2Fidoit/manifests","owner_url":"htt
ps://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/nodeca","download_url":"https://codeload.github.com/nodeca/idoit/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248507244,"owners_count":21115562,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-07-31T08:01:00.299Z","updated_at":"2025-04-12T02:39:04.348Z","avatar_url":"https://github.com/nodeca.png","language":"JavaScript","readme":"idoit\n=====\n\n[![CI](https://github.com/nodeca/idoit/workflows/CI/badge.svg)](https://github.com/nodeca/idoit/actions)\n[![NPM version](https://img.shields.io/npm/v/idoit.svg?style=flat)](https://www.npmjs.org/package/idoit)\n[![Coverage Status](https://coveralls.io/repos/github/nodeca/idoit/badge.svg?branch=master)](https://coveralls.io/github/nodeca/idoit?branch=master)\n\n\n\u003e Redis-backed task queue engine with advanced task control and eventual consistency.\n\n- Task grouping, chaining, iterators for huge ranges.\n- Postponed \u0026 scheduled task runs.\n- Load distribution + worker pools.\n- Easy to embed.\n\n\nFeatures in detail\n------------------\n\n`idoit` provides advanced control to implement sophisticated processing flows:\n\n**Grouping**. A special `group` task executes child tasks and waits until all\ncomplete. Useful for map/reduce logic.\n\n**Chaining**. A special `chain` task executes children one-by-one. Also useful\nfor map/reduce, or for splitting very complicated tasks into simpler steps.\n\n**Mapping iterator**. A special feature for huge payloads, producing chunks on\ndemand.\n
Benefits:\n\n- No lag in the mapping phase; chunk processing starts immediately.\n- Easy to optimize DB queries to build chunks of equal size\n  (skip + limit queries are very slow on huge data).\n\n**Progress**. When you use group/chain/map scenarios, it's easy to monitor\ntotal progress via the top parent. Long standalone tasks can also notify users\nabout progress changes.\n\n**Worker pools**. You can split tasks across different processes, for example\nif you don't want heavy tasks to block light ones.\n\n**Scheduler**. Built-in cron allows executing tasks on a given schedule.\n\n\n## Data consistency\n\n- All data in redis is eventually consistent.\n- Tasks cannot be lost, but CAN run twice in edge cases (if the process crashes\n  just as the task function is about to finish).\n- Progress can count \"faster\" if `task.progressAdd()` is used and the process\n  crashes before the task completes. That's not critical, since such info is\n  only useful for progress bar updates in the interface. In most cases you will\n  not see the difference.\n\n\nInstall\n-------\n\n`node.js` 6+ and `redis` 3.0+ are required.\n\n```sh\nnpm install idoit --save\n```\n\n\nAPI\n---\n\n### new Queue({ redisURL, concurrency = 100, pool = 'default', ns = 'idoit:' })\n\n - **redisURL** (String) - redis connection url.\n - **concurrency** (Number) - max tasks to consume in parallel\n   by a single worker, 100 by default.\n - **pool** (String) - worker pool name, \"default\" if not set. Used only if\n   this queue instance consumes tasks (after `.start()`). You\n   can route tasks to specific pools of workers to avoid unwanted locks.\n   You can set `pool` to an Array, `[ 'pool1', 'pool2' ]`, to consume tasks\n   from several pools (for development/testing purposes).\n - **ns** (String) - data namespace, currently used as a redis key prefix,\n   \"idoit:\" by default.\n\nIt's a good practice to have separate worker pools for heavy blocking tasks and\nnon-blocking ones. For example, nobody should block sending urgent emails.\n
So, create several worker processes, pin them to different pools and set proper\ntask concurrency. Non-blocking tasks can be consumed in parallel, and the\ndefault `concurrency` = 100 is usually fine. Blocking tasks should be consumed\none-by-one; set `concurrency` = 1 for those workers.\n\n__Note.__ It may happen that you remove some task types from your app. In that\ncase, orphaned data will be wiped after 3 days.\n\n\n### .registerTask(options), .registerTask(name [, cron], process)\n\nOptions:\n\n - **name** (String) - the task's name.\n - **baseClass** (Function) - optional, base task constructor, \"Task\" by default.\n - **init** (Function) - optional, used for async task initialization, should return a `Promise`\n   - **this** (Object) - current task (the task total is available as `this.total`).\n - **taskID** (Function) - optional, should return a new task id. Needed only for\n   creating \"exclusive\" tasks; returns a random value by default. Called as:\n   `function (taskData)`. Sugar: if you pass a plain string, it will be wrapped in\n   a function that always returns this string.\n - **process** (Function) - main task function, called as: `task.process(...args)`. Should return a `Promise`\n   - **this** (Object) - current task.\n - **retry** (Number) - optional, number of retries on error, default 2.\n - **retryDelay** (Number) - optional, delay in ms between retries, default 60000 ms.\n - **timeout** (Number) - optional, execution timeout, default 120000 ms.\n - **total** (Number) - optional, max progress value, default 1. If you don't\n   modify the behaviour, progress starts at 0 and becomes 1 when the task ends.\n - **postponeDelay** (Number) - optional, if postpone is called without a delay,\n   the delay is assumed to be equal to this value (in milliseconds).\n - **cron** (String) - optional, cron string (\"15 */6 * * *\"), default null.\n - **track** (Number) - default 3600000 ms (1 hour).\n
Time to remember scheduled\n   tasks from cron, to avoid reruns if several servers in a cluster have wrong\n   clocks. Don't set it too high for very frequent tasks, because it can occupy\n   a lot of memory.\n\n### .getTask(id)\n\nGet a task by id. Returns a Promise resolved with the task, or with `null` if the task does not exist.\n\n__Task fields you can use:__\n\n- **total** - total task progress\n- **progress** - current task progress\n- **result** - the task result\n- **error** - the task error\n\n\n### .cancel(id)\n\nCancel a task. Returns a Promise resolved with the task.\n\n__Note.__ You can only cancel tasks without a parent.\n\n\n### .start()\n\nStart the worker and begin consuming task data. Returns a `Promise`, resolved\nwhen the queue is ready (calls `.ready()` inside).\n\nIf `pool` was specified in the constructor, only tasks routed to this pool will\nbe consumed.\n\n\n### .shutdown()\n\nStop accepting new tasks from the queue. Returns a `Promise`, resolved when all\nactive tasks in this worker complete.\n\n\n### .ready()\n\nReturns a `Promise`, resolved when the queue is ready to operate (after the\n'ready' event, see below).\n\n\n### .options(opts)\n\nUpdate constructor options, except redisURL.\n\n\n### .on('eventName', handler)\n\n`idoit` is an `EventEmitter` instance that fires some events:\n\n- `ready` when the redis connection is up and commands can be executed\n  (tasks can be registered without a connection)\n- `error` when an error has occurred.\n- `task:progress`, `task:progress:\u003ctask_id\u003e` - when a task updates progress.\n  Event data is: { id, uid, total, progress }\n- `task:end`, `task:end:\u003ctask_id\u003e` - when a task ends. Event data is: { id, uid }\n\n\n### .\\\u003ctaskName\\\u003e(...params)\n\nCreate a new Task with optional params.\n\n\n### task.options({ ... })\n\nOverride task properties. For example, you may wish to assign specific\ngroup/chain tasks to another pool.\n\n\n### task.run()\n\nRun the task immediately.\n
Returns a Promise resolved with the task id.\n\n\n### task.postpone([delay])\n\nPostpone task execution by `delay` milliseconds (or by `task.postponeDelay`).\n\nReturns a Promise resolved with the task id.\n\n\n### task.restart([add_retry] [, delay])\n\nRestart the currently running task.\n\n- **add_retry** (Boolean) - optional, whether to increase the retry count\n  (default: false)\n  - if `true`, the retry count is increased, and the task doesn't get restarted\n    in case it's exceeded\n  - if `false`, the retry count stays the same, so a task can restart itself\n    indefinitely\n- **delay** (Number) - delay before restart in milliseconds\n  (default: `task.retryDelay`).\n\nNote: `idoit` already has built-in restart logic on task errors. You probably\nshould not use this method directly. It's exposed for very specific cases.\n\n\n### task.progressAdd(value)\n\nIncrement the current task progress.\n\nReturns a Promise resolved with the task id.\n\n\n### task.setDeadline(timeLeft)\n\nUpdate the current task deadline.\n\nReturns a Promise resolved with the task id.\n\n\nSpecial tasks\n-------------\n\n### group\n\nCreate a new task that executes children in parallel.\n\n```javascript\nqueue.group([\n  queue.children1(),\n  queue.children2(),\n  queue.children3()\n]).run()\n```\n\nThe group result is an unordered array of the children's results.\n\n### chain\n\nCreate a new task that executes children in series. If any of the children\nfails, the chain fails too.\n\n```javascript\nqueue.registerTask('multiply', (a, b) =\u003e a * b);\nqueue.registerTask('subtract', (a, b) =\u003e a - b);\n\nqueue.chain([\n  queue.multiply(2, 3), // 2 * 3 = 6\n  queue.subtract(10),   // 10 - 6 = 4\n  queue.multiply(3)     // 3 * 4 = 12\n]).run()\n```\n\nThe result of the previous task is passed as the last argument of the next\ntask. The result of the chain is the result of the last task in the chain.\n\n### iterator\n\nA special way to run a huge mapping lazily (on demand).\n
See the comments below.\n\n```javascript\n// register the iterator task\nqueue.registerTask({\n  name: 'lazy_mapper',\n  baseClass: Queue.Iterator,\n  // This method is called on task begin and on every child end. It can be\n  // a generator function or a function that returns a `Promise`.\n  * iterate(state) {\n    // ...\n\n    // Three types of output states are possible: ended, do nothing \u0026 new data.\n    //\n    // 1. `null` - the end is reached, the iterator should not be called anymore.\n    // 2. `{}`   - idle, there are enough subtasks in the queue; try to call the\n    //             iterator later (when the next child finishes).\n    // 3. {\n    //      state    - new iterator state to remember (for example, the offset\n    //                 for a db query), any serializable data\n    //      tasks    - array of new subtasks to push into the queue\n    //    }\n    //\n    // IMPORTANT! The iterator can be called in parallel from different workers.\n    // We use the input `state` to resolve collisions on redis update. So, if\n    // you create new subtasks:\n    //\n    // 1. the new `state` MUST be different (from all previous states)\n    // 2. the `tasks` array MUST NOT be empty.\n    //\n    // Otherwise you should signal 'end' or 'idle'.\n    //\n    // An invalid combination will cause 'end' + an error event.\n    //\n    return {\n      state: newState,\n      tasks: chunksArray\n    };\n  }\n});\n\n// run the iterator\nqueue.lazy_mapper().run();\n```\n\nWhy was this crazy magic invented?\n\nImagine that you need to rebuild 10 million forum posts. You wish to split the\nwork into equal small chunks, but the posts have no sequential integer\nenumeration, only mongo IDs. What can you do?\n\n- Direct `skip` + `limit` requests are very expensive on big collections in any\n  database.\n- You can not split by date intervals, because post density varies a lot from\n  the first post to the last.\n- You can add an indexed field with a random number to every post, then split\n  by intervals.\n
That will work, but it will cause random disk access - not cool.\n\nThe solution is to use an iterative mapper, which can remember the \"previous\nposition\". In this case, you will do `range` + `limit` requests instead of\n`skip` + `limit`. That works well with databases. Additional bonuses are:\n\n- You do not need to keep all subtasks in the queue. For example, you can\n  create 100 chunks and add the next 100 when the previous ones are about to\n  finish.\n- The mapping phase becomes distributed, and you can start monitoring total\n  progress immediately.\n\n\nDev notes\n---------\n\nQuick-run redis via docker:\n\n```sh\n# start\ndocker run -d -p 6379:6379 --name redis1 redis\n# stop\ndocker stop redis1\ndocker rm redis1\n```\n\n\nWhy one more queue?\n-------------------\n\nOf course, we are familiar with [kue](https://github.com/Automattic/kue),\n[celery](http://www.celeryproject.org/) and [akka](http://akka.io/).\nOur goal was to strike a balance between simplicity and power. So, we don't\nknow if `idoit` works well in a cluster with thousands of instances. But it\nshould be ok at smaller volumes, and it's really easy to use.\n\n[kue](https://github.com/Automattic/kue) was not ok for our needs, because:\n\n- its \"priorities\" concept is not flexible and does not protect well from\n  locks by heavy tasks\n- no task grouping/chaining and so on\n- no strong guarantees of data consistency\n\nIn `idoit` we cared about:\n\n- task group/chain operations \u0026 passing data between tasks (similar to celery)\n- worker pools to isolate task execution by type\n- ease of use \u0026 install (only redis needed, can run in an existing process)\n- eventual consistency of stored data\n- essential sugar like a built-in scheduler\n- an iterative mapper for huge payloads (a unique feature, very useful for many\n  maintenance tasks)\n- task progress tracking\n- avoiding global locks\n\nRedis can still be a point of failure, but that's an acceptable price for\nsimplicity.\n
Of course you can get better availability via distributed message\nbuses like RMQ. But in many cases it's more important to keep things simple.\nWith `idoit` you can reuse existing technologies without additional expenses.\n","funding_links":["https://opencollective.com/puzrin","https://patreon.com/puzrin"],"categories":["Packages","Repository","包","目录","JavaScript","Job queues"],"sub_categories":["Job queues","Job Queues","任务队列","工作队列"],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fnodeca%2Fidoit","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fnodeca%2Fidoit","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fnodeca%2Fidoit/lists"}