{"id":16732541,"url":"https://github.com/antonmi/job_reactor","last_synced_at":"2025-04-10T11:27:07.148Z","repository":{"id":3281724,"uuid":"4321942","full_name":"antonmi/job_reactor","owner":"antonmi","description":"Simple, powerful and high scalable job queueing and background workers system based on EventMachine","archived":false,"fork":false,"pushed_at":"2020-02-29T08:29:10.000Z","size":242,"stargazers_count":6,"open_issues_count":1,"forks_count":0,"subscribers_count":2,"default_branch":"master","last_synced_at":"2025-03-24T10:12:26.253Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Ruby","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/antonmi.png","metadata":{"files":{"readme":"README.markdown","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2012-05-14T09:09:55.000Z","updated_at":"2016-11-12T01:10:33.000Z","dependencies_parsed_at":"2022-08-20T23:10:43.279Z","dependency_job_id":null,"html_url":"https://github.com/antonmi/job_reactor","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/antonmi%2Fjob_reactor","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/antonmi%2Fjob_reactor/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/antonmi%2Fjob_reactor/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/antonmi%2Fjob_reactor/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/antonmi","download_url":"https://codeload.github.com/antonmi/job_reactor/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://gi
thub.com","kind":"github","repositories_count":248208666,"owners_count":21065203,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-10-12T23:45:29.870Z","updated_at":"2025-04-10T11:27:07.125Z","avatar_url":"https://github.com/antonmi.png","language":"Ruby","funding_links":[],"categories":[],"sub_categories":[],"readme":"JobReactor \u003cimg src='https://secure.travis-ci.org/antonmi/job_reactor.png'\u003e\n==========\n\nJobReactor is a library for creating, scheduling and processing background jobs.\nIt is an asynchronous, distributed client-server system based on [EventMachine][0].\n\nJobReactor is a good fit for I/O-intensive web applications powered by evented web servers such as [Thin][12].\nSince all requests are processed in one thread, you should avoid blocking the reactor loop.\nSo you should (and must) delegate intensive calculations to another process.\n\n__Use JobReactor to avoid blocking calculations.__\n\nJobReactor keeps you in the evented paradigm by allowing you to register callbacks for tasks, which will be triggered when the work is done.\nSee a simple example of using it with AsyncSinatra [here][13].\n\nTo use JobReactor with [Sinatra][11] or [Ruby on Rails][9] you should start the distributor in an initializer using the `JR.run` method (it launches EventMachine in a separate thread).\nThen add rake task(s) which will run the node(s).\n\nIf you use a server based on EventMachine ([Thin][12]), use the `JR.wait_em_and_run` method, which will initialize JobReactor once EventMachine has started.\n\nSo, read the 'features' section and try JobReactor. 
You can do a lot with it.\n\nNote\n====\nJobReactor is based on [EventMachine][0]. Jobs are launched in the EM reactor loop in one thread.\nThere are advantages and disadvantages. The main benefit is fast scheduling, saving and loading.\nThe weak point is the processing of heavy background jobs, when each job takes minutes or hours.\nThey will block the reactor and break normal processing.\n\nIf you can't divide 'THE BIG JOB' into 'small pieces' you shouldn't use JobReactor. See alternatives such as [DelayedJob][4] or [Resque][1].\n\n__JobReactor is the right solution if you have thousands, millions, and, we hope, billions of relatively small jobs.__\n\nQuick start\n===========\n```gem install job_reactor```\n\n__You should install [Redis][5] if you want to persist your jobs.__\n\n```$ sudo apt-get install redis-server```\n\nIn your main application:\n`application.rb`\n``` ruby\nrequire 'job_reactor'\n\nJR.run do\n  JR.start_distributor('localhost', 5000)  #see lib/job_reactor/job_reactor.rb\nend\n\nsleep(1) until(JR.ready?)\n\n# The application\nloop do\n  sleep(3) #Your application is working\n  JR.enqueue 'my_job', {arg1: 'Hello'}\nend\n```\nDefine 'my_job' in a separate directory (files with job definitions **must** be in a separate directory):\n`reactor_jobs/my_jobs.rb`\n``` ruby\ninclude JobReactor\n\njob 'my_job' do |args|\n  puts args[:arg1]\nend\n```\nAnd the last file - 'the worker code':\n`worker.rb`\n``` ruby\nrequire 'job_reactor'\n\nJR.config[:job_directory] = 'reactor_jobs' #this is the default config, so you can omit this line\n\nJR.run! do\n  JR.start_node({\n  :storage =\u003e 'memory_storage',\n  :name =\u003e 'worker_1',\n  :server =\u003e ['localhost', 5001],\n  :distributors =\u003e [['localhost', 5000]]\n  })                                         #see lib/job_reactor/job_reactor.rb\nend\n```\nRun 'application.rb' in one terminal window and 'worker.rb' in another.\nThe node connects to the distributor, receives the job and runs it.\nCool! 
But that was the simplest example. See the 'examples' directory.\n\nFeatures\n=============\n1. Client-server architecture\n-----------------------------\nYou can run as many distributors and working nodes as you need. You are free to choose the strategy.\nIf you have many background tasks from each part of your application you can use, for example, 3 distributors (one in each process) and 10 working nodes.\nIf you don't have many jobs you can leave only one node which will be connected to 3 distributors.\n2. High scalability\n-------------------\nNodes and distributors are connected via TCP. So, you can run them on any machine you can connect to.\nNodes may use different storage or the same one. You can store vitally important jobs in a database and\nsimple insignificant jobs in memory.\nAnd more: you can run a node and a distributor inside one EM reactor, so your nodes may create jobs for other nodes and communicate with each other.\n3. Full job control\n-------------------\nYou can add 'callbacks' and 'errbacks' to the job which will be called on the node.\nYou can also add a 'success feedback' and an 'error feedback' which will be called in your main application.\nWhen the job is done on a remote node, your application will receive the result inside the corresponding 'feedback'.\nIf an error occurs in the job you can see it in the 'errbacks' and then in the 'error feedback' and do what you want.\n4. Reflection and modifying\n---------------------------\nInside the job you can get information about when it started, when it failed, which node executes the job, etc.\nYou can also add arguments to the job on-the-fly which will be used in the subsequent callbacks and errbacks.\nThese arguments can then be sent back to the distributor.\n5. 
Reliability\n--------------\nYou can run additional nodes and stop any nodes on-the-fly.\nThe distributor is smart enough to send jobs to another node if one is stopped or has crashed.\nIf no nodes are connected to the distributor it will keep jobs in memory and send them when nodes start.\nIf a node is stopped or has crashed it will retry stored jobs after start.\n6. EventMachine available\n-------------------------\nRemember, your jobs will be run inside the EventMachine reactor! You can easily use the power of the async nature of EventMachine.\nUse asynchronous [em-http-request][6], [em-websocket][7], etc.\n7. Thread safe\n--------------\nThe EventMachine reactor loop runs in one thread. So the code in jobs executed in the given node is thread-safe.\nThe only exception is a 'defer' job, when you tell the node to run the job in an EM.defer block (so the job will be executed in a separate thread).\n8. Deferred and periodic jobs\n-----------------------------\nYou can use deferred jobs which will run 'after' some time or 'run_at' a given time.\nYou can create periodic jobs which will run every given time period and cancel them on condition.\n9. No polling\n-------------\nThere is no storage polling. When a node receives a job (no matter whether instant, periodic or deferred) an EventMachine timer is created\nwhich will start the job at the right time.\n10. Job retrying\n--------------\nIf a job fails it will be retried. You can choose a global retrying strategy or manage separate jobs.\n11. Predefined nodes\n-------------------\nYou can specify a node for jobs, so they will be executed in that node's environment. And you can specify which node is forbidden for the job.\nIf no nodes are specified the distributor will try to send the job to the first free node.\n12. Node based priorities\n-----------------------\nThere are no priorities like in Delayed::Job or Stalker. 
But there are flexible node-based priorities.\nYou can specify the node which should execute the job and the node which is forbidden for the given job. You can reserve several nodes for high priority jobs.\n\nHow it works\n============\n1. You run JobReactor::Distributor in your application initializer\n-----------------------------------------------------\n``` ruby\nJR.run do\n  JR.start_distributor('localhost', 5000)\nend\n```\nThis code runs the EventMachine reactor loop in a new thread and calls the given block.\nJR.start_distributor starts an EventMachine TCP server on the given host and port.\nAnd now JobReactor is ready to work.\n\n2. You run JobReactor::Node in the different process or different machine\n------------------------------------------------------------------------\n\n``` ruby\nJR.run! do\n  JR.start_node({\n    storage: 'redis_storage',\n    name: 'redis_node1',\n    server: ['localhost', 5001],\n    distributors: [['localhost', 5000]] \n})\nend\n```\n\nThis code runs the EventMachine reactor loop (in the main thread: this is the difference between `run` and `run!`)\nand starts the Node inside the reactor.\nWhen the node starts it:\n* parses the 'reactor jobs' files (recursively parses all files in the JR.config[:job_directory] directory, default is the 'reactor_jobs' directory) and creates a hash of job callbacks and errbacks (see [JobReactor jobs]);\n* tries to 'retry' stored jobs (if you use 'redis_storage' and `JR.config[:retry_jobs_at_start]` is true) \n* starts its own TCP server;\n* connects to the Distributor server and sends the information needed to establish the connection;\nWhen the distributor receives the credentials it connects to the Node server. And now there is a full-duplex connection between Distributor and Node.\n\n3. 
You enqueue the job in your application\n------------------------------------------\n\n```ruby\nJR.enqueue('my_job',{arg1: 1, arg2: 2}, {after: 20}, success, error)\n```\n\nThe first argument is the name of the job, the second is the arguments hash for the job.\nThe third is the options hash. If you don't specify any option the job will be an instant job and will be sent to any free node. You can use the following options:\n* `defer: true or false` - node will run the job in an 'EM.defer' block. Be careful, the default thread pool size is 20 for EM. You can increase it by setting EM.threadpool_size = 'your value', but it is not recommended;\n* `after: seconds` - node will try to run the job after `seconds` seconds;\n* `run_at: time` - node will try to run the job at the given time;\n* `period: seconds` - node will run the job periodically, every `seconds` seconds;\nYou can add `node: 'node_name'` and `not_node: 'node_name'` to the options. These specify the node on which the job should or shouldn't be run. For example:\n\n```ruby\nJR.enqueue('my_job', {arg1: 1}, {period: 100, node: 'my_favourite_node', not_node: 'do_not_use_this_node'})\n```\n\nThe rule to use the specified node is not strict if `JR.config[:always_use_specified_node]` is false (default).\nThis means that the distributor will try to send the job to the given node first. But if the node is `locked` (maybe you have just sent another job to it and it is very busy) the distributor will look for another node.\n\nThe last two arguments are optional. The first is the 'success feedback' and the last is the 'error feedback'. We use the term 'feedback' to distinguish it from 'callbacks' and 'errbacks'. A 'feedback' is executed on the main application side while 'callbacks' are executed on the node side. 'feedbacks' are procs which will be called when the node sends a message that the job is completed (successfully or not). 
The arguments for the 'feedback' are the arguments of the initial job plus all added on the node side.\n\nExample:\n\n```ruby\n#in your 'job_file'\njob 'my_job' do |args|\n  #do smth\n  args.merge!(result: 'Yay!')\nend\n\n#in your application\n#success feedback\nsuccess = proc {|args| puts args}\n#enqueue job\nJR.enqueue('my_job', {arg1: 1}, {}, success)\n```\n\nThe 'success' proc args will be {arg1: 1, result: 'Yay!'}.\nThe same goes for the 'error feedback'. __Note__ that the error feedback will be launched only after all attempts have failed on the node side.\nSee config: `JR.config[:max_attempt] = 10` and `JR.config[:retry_multiplier]`\n\n4. You disconnect node (stop it manually or node fails itself)\n--------------------------------------------------------------\n* distributor will send jobs to any other nodes if present\n* distributor will store enqueued jobs in memory if there is no connected node (or specified node)\n* when the node starts again, the distributor will send jobs to the node\n\n5. You stop the main application.\n---------------------------------\n* Nodes will continue to work, but you won't be able to receive the results from the node when you start the application again because all feedbacks are stored in memory.\n\nCallbacks and feedbacks\n============================\n'callbacks', 'errbacks', 'success feedback', and 'error feedback' help you divide the __job__ into small, relatively independent parts.\n\nTo define a `'job'` you use the `JobReactor.job` method (see 'Quick start' section). The only arguments are 'job_name' and the block which is the job itself.\n\nYou can define any number of callbacks and errbacks for the given job. Just use the `JobReactor.job_callback` and `JobReactor.job_errback` methods. There are three arguments for callbacks and errbacks. 
The name of the job, the name of callback/errback (optional) and the block.\n\n```ruby\ninclude JobReactor\n\njob 'test_job' do |args|\n  puts \"job with args #{args}\" \nend\n\njob_callback 'test_job', 'first_callback' do |args|\n  puts \"first callback with args #{args}\"\nend\n\njob_callback 'test_job', 'second_callback' do |args|\n  puts \"second callback with args #{args}\"\nend\n\njob_errback 'test_job', 'first_errback' do |args|\n  puts \"first errback with error #{args[:error]}\"\nend\n\njob_errback 'test_job', 'second_errback' do |args|\n  puts 'another errback'\nend\n```\n\nCallbacks and errbacks act as ordinary EventMachine::Deferrable callbacks and errbacks. The `'job'` is the first callback, the first `'job_callback'` becomes the second callback and so on. See `lib/job_reactor/job_reactor/job_parser.rb` for more information. When the Node starts a job it calls the `succeed` method on the 'job object' with the given argument (args). This runs all callbacks sequentially. If an error occurs in any callback the Node calls the `fail` method on the 'deferrable' object with the same args (plus merged `:error =\u003e 'Error message'`).\n\n__Note__, you define jobs, callbacks and errbacks in top-level scope, so `self` is the `main` object.\n\nYou can `merge!` additional key-value pairs into 'args' in the job to exchange information between the job and its callbacks.\n\n```ruby\ninclude JobReactor\n\njob 'test_job' do |args|\n  args.merge!(result: 'Hello')\nend\n\njob_callback 'test_job', 'first_callback' do |args|\n  puts args[:result]\n  args.merge!(another_result: 'world')\nend\n\njob_callback 'test_job', 'second_callback' do |args|\n  puts \"#{args[:result]} #{args[:another_result]}\"\nend\n```\n__Note__, if an error occurs you can't see the additional arguments in job errbacks.\n\nAnother trick is the `JR.config[:merge_job_itself_to_args]` option, which is `false` by default. If you set this option to `true` you can see the `:job_itself` key in `args`. 
The value contains a lot of useful information about the job ('name', 'attempt', 'status', 'make_after', 'node', etc).\n\nFeedbacks are defined as Proc objects and attached to the 'job' when it is enqueued on the application side.\n\n```ruby\nsuccess = Proc.new { |args| puts 'Success' }\nerror = Proc.new { |args| puts 'Error' }\nJR.enqueue('my_job', {arg1: 1, arg2: 2}, {after: 100}, success, error)\n```\n\nThese procs will be called when the Node informs about success or error. The 'args' for the corresponding proc will be the same 'args' which are in the job (and its callbacks) on the node side. So you can, for example, return any result by merging it into 'args' in the job (or its callbacks).\n\n__Note__, feedbacks are kept in memory in your application, so they disappear when you restart the application.\n\nJob Storage\n==========\nNow you can store your jobs in [Redis][5] storage (`'redis_storage'`) ([em-hiredis][8]) or in memory (`'memory_storage'`).\nOnly the first, of course, 'really' persists the jobs. You can use the latter if you don't want to install Redis, don't need to retry jobs and need more speed (by the way, the difference in performance is not so great - Redis is very fast).\nYou can easily integrate your own storage. Just make it EventMachine compatible.\n\nThe default URL for the Redis server is:\n\n```ruby\nJR.config[:hiredis_url] = \"redis://127.0.0.1:6379/0\"\n```\n\nJobReactor works asynchronously with Redis using the [em-hiredis][8] library to increase the speed.\nSeveral nodes can use one Redis storage.\n\nThe information about jobs is saved several times during processing. 
This information includes:\n* id - the unique job id;\n* name - job name which 'defines' the job;\n* args - serialized arguments for the job;\n* run_at - the time when the job was launched;\n* failed_at - the time when the job failed;\n* last_error - the error that occurred;\n* period - period (for periodic jobs);\n* defer - 'true' or 'false', flag to run the job in an EM.defer block;\n* status - job status ('new', 'in progress', 'queued', 'complete', 'error', 'failed', 'cancelled');\n* attempt - the number of the attempt;\n* make_after - when to start the job again (in seconds after last save);\n* distributor - host and port of the distributor server which sent the job (used for 'feedbacks');\n* on_success - the unique id of the success feedback on the distributor side;\n* on_error - the unique id of the error feedback on the distributor side;\n\nBy default JobReactor deletes all completed and cancelled jobs, but you can configure it.\nThe default options are:\n\n```ruby\nJR.config[:remove_done_jobs] = true\nJR.config[:remove_cancelled_jobs] = true\nJR.config[:remove_failed_jobs] = false\nJR.config[:retry_jobs_at_start] = true\n```\n\nWe provide a simple `JR::RedisMonitor` module to check the Redis storage from the irb console (or from your app).\nIt uses the synchronous [redis](https://github.com/redis/redis-rb) gem.\nConnect to Redis by:\n\n```ruby\nJR.config[:redis_host] = 'localhost'\nJR.config[:redis_port] = 6379\n```\n\n\n\nSee methods:\n\n```ruby\nJR::RedisMonitor.jobs_for(node_name)\nJR::RedisMonitor.load(job_id)\nJR::RedisMonitor.destroy(job_id)\nJR::RedisMonitor.destroy_all_jobs_for(node_name)\n```\n\n\nLicense\n=======\nThe MIT License - Copyright (c) 2012-2013 Anton Mishchuk\n\n[0]: http://rubyeventmachine.com\n[1]: https://github.com/defunkt/resque\n[2]: http://kr.github.com/beanstalkd/\n[3]: https://github.com/han/stalker\n[4]: https://github.com/tobi/delayed_job\n[5]: http://redis.io\n[6]: https://github.com/igrigorik/em-http-request\n[7]: https://github.com/igrigorik/em-websocket\n[8]: 
https://github.com/mloughran/em-hiredis\n[9]: http://rubyonrails.org/\n[10]: http://code.macournoyer.com/thin/\n[11]: http://www.sinatrarb.com/\n[12]: http://code.macournoyer.com/thin/\n[13]: https://github.com/antonmi/sinatra_with_job_reactor\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fantonmi%2Fjob_reactor","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fantonmi%2Fjob_reactor","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fantonmi%2Fjob_reactor/lists"}