{"id":13484444,"url":"https://github.com/chaps-io/gush","last_synced_at":"2026-04-06T00:03:13.836Z","repository":{"id":15144727,"uuid":"17872111","full_name":"chaps-io/gush","owner":"chaps-io","description":"Fast and distributed workflow runner using ActiveJob and Redis","archived":false,"fork":false,"pushed_at":"2025-03-19T08:19:41.000Z","size":458,"stargazers_count":1079,"open_issues_count":15,"forks_count":110,"subscribers_count":21,"default_branch":"master","last_synced_at":"2025-04-29T13:12:47.392Z","etag":null,"topics":["activejob","graph","parallel","parallelization","queues","redis","ruby","sidekiq","workers","workflow","workflows"],"latest_commit_sha":null,"homepage":"","language":"Ruby","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/chaps-io.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"LICENSE.txt","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2014-03-18T15:59:18.000Z","updated_at":"2025-04-12T16:48:10.000Z","dependencies_parsed_at":"2024-05-01T13:19:50.313Z","dependency_job_id":"74c24bdb-e46e-417b-855c-381aede3eb80","html_url":"https://github.com/chaps-io/gush","commit_stats":{"total_commits":319,"total_committers":25,"mean_commits":12.76,"dds":0.3667711598746082,"last_synced_commit":"f6045547af91a726ac3361f258e2177f415fc174"},"previous_names":["pokonski/gush"],"tags_count":12,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/chaps-io%2Fgush","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/chaps-io%2Fgush/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitH
ub/repositories/chaps-io%2Fgush/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/chaps-io%2Fgush/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/chaps-io","download_url":"https://codeload.github.com/chaps-io/gush/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":254036807,"owners_count":22003651,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["activejob","graph","parallel","parallelization","queues","redis","ruby","sidekiq","workers","workflow","workflows"],"created_at":"2024-07-31T17:01:24.526Z","updated_at":"2026-04-06T00:03:13.830Z","avatar_url":"https://github.com/chaps-io.png","language":"Ruby","readme":"# Gush\n\n![Gem Version](https://img.shields.io/gem/v/gush)\n![GitHub Actions Workflow Status](https://img.shields.io/github/actions/workflow/status/chaps-io/gush/ruby.yml)\n\n\nGush is a parallel workflow runner using only Redis as storage and [ActiveJob](http://guides.rubyonrails.org/v4.2/active_job_basics.html#introduction) for scheduling and executing jobs.\n\n## Theory\n\nGush relies on directed acyclic graphs to store dependencies, see [Parallelizing Operations With Dependencies](https://msdn.microsoft.com/en-us/magazine/dd569760.aspx) by Stephen Toub to learn more about this method.\n\n## **WARNING - version notice**\n\nThis README is about the latest `master` code, which might differ from what is released on RubyGems. See tags to browse previous READMEs.\n\n## Installation\n\n### 1. 
Add `gush` to Gemfile\n\n```ruby\ngem 'gush', '~\u003e 5.0'\n```\n\n### 2. Create `Gushfile`\n\nWhen using Gush and its CLI commands, you need a `Gushfile` in the root directory.\n`Gushfile` should require all your workflows and jobs.\n\n#### Ruby on Rails\n\nFor RoR it is enough to require the full environment:\n\n```ruby\nrequire_relative './config/environment.rb'\n```\n\nand make sure your jobs and workflows are correctly loaded by adding their directories to autoload_paths, inside `config/application.rb`:\n\n```ruby\nconfig.autoload_paths += [\"#{Rails.root}/app/jobs\", \"#{Rails.root}/app/workflows\"]\n```\n\n#### Ruby\n\nSimply require any jobs and workflows manually in `Gushfile`:\n\n```ruby\nrequire_relative 'lib/workflows/example_workflow.rb'\nrequire_relative 'lib/jobs/some_job.rb'\nrequire_relative 'lib/jobs/some_other_job.rb'\n```\n\n\n## Example\n\nThe DSL for defining jobs consists of a single `run` method.\nHere is a complete example of a workflow you can create:\n\n```ruby\n# app/workflows/sample_workflow.rb\nclass SampleWorkflow \u003c Gush::Workflow\n  def configure(url_to_fetch_from)\n    run FetchJob1, params: { url: url_to_fetch_from }\n    run FetchJob2, params: { some_flag: true, url: 'http://url.com' }\n\n    run PersistJob1, after: FetchJob1\n    run PersistJob2, after: FetchJob2\n\n    run Normalize,\n        after: [PersistJob1, PersistJob2],\n        before: Index\n\n    run Index\n  end\nend\n```\n\nand this is how the resulting graph will look:\n\n```mermaid\ngraph TD\n    A{Start} --\u003e B[FetchJob1]\n    A --\u003e C[FetchJob2]\n    B --\u003e D[PersistJob1]\n    C --\u003e E[PersistJob2]\n    D --\u003e F[Normalize]\n    E --\u003e F\n    F --\u003e G[Index]\n    G --\u003e H{Finish}\n```\n\n\n## Defining workflows\n\nLet's start with the simplest workflow possible, consisting of a single job:\n\n```ruby\nclass SimpleWorkflow \u003c Gush::Workflow\n  def configure\n    run DownloadJob\n  end\nend\n```\n\nOf course, having a 
workflow with only a single job does not make sense, so it's time to define dependencies:\n\n```ruby\nclass SimpleWorkflow \u003c Gush::Workflow\n  def configure\n    run DownloadJob\n    run SaveJob, after: DownloadJob\n  end\nend\n```\n\nWe just told Gush to execute `SaveJob` right after `DownloadJob` finishes **successfully**.\n\nBut what if your job must have multiple dependencies? That's easy: just provide an array to the `after` attribute:\n\n```ruby\nclass SimpleWorkflow \u003c Gush::Workflow\n  def configure\n    run FirstDownloadJob\n    run SecondDownloadJob\n\n    run SaveJob, after: [FirstDownloadJob, SecondDownloadJob]\n  end\nend\n```\n\nNow `SaveJob` will only execute after both its parents finish without errors.\n\nWith this simple syntax you can build any workflow you can imagine!\n\n#### Alternative way\n\nThe `run` method also accepts a `before:` attribute to define the opposite association. So we can write the same workflow as above like this:\n\n```ruby\nclass SimpleWorkflow \u003c Gush::Workflow\n  def configure\n    run FirstDownloadJob, before: SaveJob\n    run SecondDownloadJob, before: SaveJob\n\n    run SaveJob\n  end\nend\n```\n\nYou can use whichever way you find more readable, or even both at once :)\n\n### Passing arguments to workflows\n\nWorkflows can accept any primitive arguments in their constructor, which will then be available in your\n`configure` method.\n\nLet's assume we are writing a book publishing workflow which needs to know where the PDF of the book is and under what ISBN it will be released:\n\n```ruby\nclass PublishBookWorkflow \u003c Gush::Workflow\n  def configure(url, isbn, publish: false)\n    run FetchBook, params: { url: url }\n    if publish\n      run PublishBook, params: { book_isbn: isbn }, after: FetchBook\n    end\n  end\nend\n```\n\nand then create your workflow with those arguments:\n\n```ruby\nPublishBookWorkflow.create(\"http://url.com/book.pdf\", \"978-0470081204\", publish: true)\n```\n\nand 
that's basically it for defining workflows; see below for how to define jobs.\n\n## Defining jobs\n\nThe simplest job is a class inheriting from `Gush::Job` and responding to the `perform` method, much like any other ActiveJob class.\n\n```ruby\nclass FetchBook \u003c Gush::Job\n  def perform\n    # do some fetching from remote APIs\n  end\nend\n```\n\nBut what about those params we passed in the previous step?\n\n## Passing parameters into jobs\n\nTo do that, simply provide a `params:` attribute with a hash of parameters you'd like to have available inside the `perform` method of the job.\n\nSo, inside the workflow:\n\n```ruby\n(...)\nrun FetchBook, params: {url: \"http://url.com/book.pdf\"}\n(...)\n```\n\nand within the job we can access them like this:\n\n```ruby\nclass FetchBook \u003c Gush::Job\n  def perform\n    # you can access the `params` method here, for example:\n\n    params #=\u003e {url: \"http://url.com/book.pdf\"}\n  end\nend\n```\n\n## Executing workflows\n\nNow that we have defined our workflow and its jobs, we can use it:\n\n### 1. Start background worker process\n\n**Important**: The command to start background workers depends on the backend you chose for ActiveJob.\nFor example, in the case of Sidekiq this would be:\n\n```\nbundle exec sidekiq -q gush\n```\n\n**[Click here to see the backends section in the official ActiveJob documentation about configuring backends](http://guides.rubyonrails.org/active_job_basics.html#backends)**\n\n**Hint**: Gush uses the `gush` queue name by default. Keep that in mind, because some backends (like Sidekiq) will only run jobs from explicitly stated queues.\n\n\n### 2. Create the workflow instance\n\n```ruby\nflow = PublishBookWorkflow.create(\"http://url.com/book.pdf\", \"978-0470081204\")\n```\n\n### 3. Start the workflow\n\n```ruby\nflow.start!\n```\n\nNow Gush will start processing jobs in the background using ActiveJob and your chosen backend.\n\n### 4. 
Monitor its progress:\n\n```ruby\nflow.reload\nflow.status\n#=\u003e :running|:finished|:failed\n```\n\n`reload` is needed to see the latest status, since workflows are updated asynchronously.\n\n## Loading workflows\n\n### Finding a workflow by id\n\n```\nflow = Workflow.find(id)\n```\n\n### Paging through workflows\n\nTo get workflows with pagination, use start and stop (inclusive) index values:\n\n```\nflows = Workflow.page(0, 99)\n```\n\nOr in reverse order:\n\n```\nflows = Workflow.page(0, 99, order: :desc)\n```\n\n## Advanced features\n\n### Global parameters for jobs\n\nWorkflows can accept a hash of `globals` that are automatically forwarded as parameters to all jobs.\n\nThis is useful for sharing common parameters across workflow and job classes, such as tracking the creator id for all instances:\n\n```ruby\nclass SimpleWorkflow \u003c Gush::Workflow\n  def configure(url_to_fetch_from)\n    run DownloadJob, params: { url: url_to_fetch_from }\n  end\nend\n\nflow = SimpleWorkflow.create('http://foo.com', globals: { creator_id: 123 })\nflow.globals\n=\u003e {:creator_id=\u003e123}\nflow.jobs.first.params\n=\u003e {:creator_id=\u003e123, :url=\u003e\"http://foo.com\"}\n```\n\n**Note:** job params with the same key as globals will take precedence over the globals.\n\n\n### Pipelining\n\nGush offers a useful tool for passing the results of a job to its dependent jobs, so they can act on them.\n\n**Example:**\n\nLet's assume you have two jobs, `DownloadVideo` and `EncodeVideo`.\nThe latter needs to know where the first one saved the file to be able to open it.\n\n\n```ruby\nclass DownloadVideo \u003c Gush::Job\n  def perform\n    downloader = VideoDownloader.fetch(\"http://youtube.com/?v=someytvideo\")\n\n    output(downloader.file_path)\n  end\nend\n```\n\nThe `output` method is used to pass data from the job to all dependent jobs.\n\nNow, once `DownloadVideo` has finished and its dependent job `EncodeVideo` has started, we can access that payload inside it:\n\n```ruby\nclass 
EncodeVideo \u003c Gush::Job\n  def perform\n    video_path = payloads.first[:output]\n  end\nend\n```\n\n`payloads` is an array containing outputs from all ancestor jobs. So for our `EncodeVideo` job from above, the array will look like:\n\n\n```ruby\n[\n  {\n    id: \"DownloadVideo-41bfb730-b49f-42ac-a808-156327989294\", # unique id of the ancestor job\n    class: \"DownloadVideo\",\n    output: \"https://s3.amazonaws.com/somebucket/downloaded-file.mp4\" # the payload returned by the DownloadVideo job using the `output()` method\n  }\n]\n```\n\n**Note:** Keep in mind that payloads can only contain data which **can be serialized as JSON**, because that's how Gush stores them internally.\n\n### Dynamic workflows\n\nThere might be a case when you have to construct the workflow dynamically depending on the input.\n\nAs an example, let's write a workflow which accepts an array of users and has to send an email to each one. Additionally, after it sends the e-mail to every user, it has to notify the admin that it has finished.\n\n\n```ruby\n\nclass NotifyWorkflow \u003c Gush::Workflow\n  def configure(user_ids)\n    notification_jobs = user_ids.map do |user_id|\n      run NotificationJob, params: {user_id: user_id}\n    end\n\n    run AdminNotificationJob, after: notification_jobs\n  end\nend\n```\n\nWe can achieve that because the `run` method returns the id of the created job, which we can use for chaining dependencies.\n\nNow, when we create the workflow like this:\n\n```ruby\nflow = NotifyWorkflow.create([54, 21, 24, 154, 65]) # 5 user ids as an argument\n```\n\nit will generate a workflow with 5 `NotificationJob`s and one `AdminNotificationJob` which will depend on all of them:\n\n\n```mermaid\ngraph TD\n    A{Start} --\u003e B[NotificationJob]\n    A --\u003e C[NotificationJob]\n    A --\u003e D[NotificationJob]\n    A --\u003e E[NotificationJob]\n    A --\u003e F[NotificationJob]\n    B --\u003e G[AdminNotificationJob]\n    C --\u003e G\n    D --\u003e G\n    E --\u003e G\n    
F --\u003e G\n    G --\u003e H{Finish}\n```\n\n### Dynamic queue for jobs\n\nThere might be a case where you want different jobs in the workflow to use different queues. Based on the example above, we want to configure `AdminNotificationJob` to use the `admin` queue and `NotificationJob` to use the `user` queue.\n\n```ruby\n\nclass NotifyWorkflow \u003c Gush::Workflow\n  def configure(user_ids)\n    notification_jobs = user_ids.map do |user_id|\n      run NotificationJob, params: {user_id: user_id}, queue: 'user'\n    end\n\n    run AdminNotificationJob, after: notification_jobs, queue: 'admin'\n  end\nend\n```\n\n### Dynamic wait time for jobs\n\nThere might be a case where you want a job to be executed after a delay. Based on the example above, we want to configure `AdminNotificationJob` to be executed after 5 seconds.\n\n```ruby\n\nclass NotifyWorkflow \u003c Gush::Workflow\n  def configure(user_ids)\n    notification_jobs = user_ids.map do |user_id|\n      run NotificationJob, params: {user_id: user_id}, queue: 'user'\n    end\n\n    run AdminNotificationJob, after: notification_jobs, queue: 'admin', wait: 5.seconds\n  end\nend\n```\n\n### Customization of ActiveJob enqueueing\n\nThere might be a case when you want to customize enqueueing a job with more than just the above two options (`queue` and `wait`).\n\nTo pass additional options to `ActiveJob.set`, override `Job#worker_options`, e.g.:\n\n```ruby\n\nclass ScheduledJob \u003c Gush::Job\n\n  def worker_options\n    super.merge(wait_until: Time.at(params[:start_at]))\n  end\n\nend\n```\n\nOr to entirely customize the ActiveJob integration, override `Job#enqueue_worker!`, e.g.:\n\n```ruby\n\nclass SynchronousJob \u003c Gush::Job\n\n  def enqueue_worker!(options = {})\n    Gush::Worker.perform_now(workflow_id, name)\n  end\n\nend\n```\n\n\n## Command line interface (CLI)\n\n### Checking status\n\n- of a specific workflow:\n\n  ```\n  bundle exec gush show \u003cworkflow_id\u003e\n  ```\n\n- of a page of 
workflows:\n\n  ```\n  bundle exec gush list\n  ```\n\n- of the most recent 100 workflows:\n\n  ```\n  bundle exec gush list -99 -1\n  ```\n\n### Visualizing workflows as an image\n\nThis requires that you have ImageMagick installed on your computer:\n\n\n```\nbundle exec gush viz \u003cNameOfTheWorkflow\u003e\n```\n\n### Customizing locking options\n\nTo prevent `RedisMutex::LockError` errors when running a large number of jobs, you can customize the two fields `locking_duration` and `polling_interval` as below:\n\n```ruby\n# config/initializers/gush.rb\nGush.configure do |config|\n  config.redis_url = \"redis://localhost:6379\"\n  config.concurrency = 5\n  config.locking_duration = 2 # how long you want to wait for the lock to be released, in seconds\n  config.polling_interval = 0.3 # how long the polling interval should be, in seconds\nend\n```\n\n### Cleaning up afterwards\n\nRunning `NotifyWorkflow.create` inserts multiple keys into Redis every time it is run. This data might be useful for analysis, but at a certain point it can be purged. By default, Gush and Redis keep keys forever. To configure expiration, you need to do two things.\n\n1. Create an initializer that specifies `config.ttl` in seconds. It is best NOT to set the TTL too short (e.g. minutes); about a week is a sensible length.\n\n```ruby\n# config/initializers/gush.rb\nGush.configure do |config|\n  config.redis_url = \"redis://localhost:6379\"\n  config.concurrency = 5\n  config.ttl = 3600*24*7\nend\n```\n\n2. Call `Client#expire_workflows` periodically, which will clear all expired stored workflow and job data and indexes. 
This method can be called at any rate, but ideally should be called at least once for every 1000 workflows created.\n\nIf you need more control over individual workflow expiration, you can call `flow.expire!(ttl)` with a TTL different from the Gush configuration, or with -1 to never expire the workflow.\n\n### Avoid overlapping workflows\n\nSince we do not know how long our workflow execution will take, we might want to avoid starting the next scheduled workflow iteration while the current one with the same class is still running. Long term, this could be moved into the core library, perhaps as `Workflow.find_by_class(klass)`:\n\n```ruby\n# config/initializers/gush.rb\nGUSH_CLIENT = Gush::Client.new\n# call this method before NotifyWorkflow.create\ndef find_by_class(klass)\n  GUSH_CLIENT.all_workflows.each do |flow|\n    return true if flow.to_hash[:name] == klass \u0026\u0026 flow.running?\n  end\n  false\nend\n```\n\n## Gush 3.0 Migration\n\nGush 3.0 adds indexing for fast workflow pagination and changes the mechanism for expiring workflow data from Redis.\n\n### Migration\n\nRun `bundle exec gush migrate` after upgrading. This will update internal data structures.\n\n### Expiration API\n\nPeriodically run `Gush::Client.new.expire_workflows` to expire data. Workflows will be automatically enrolled in this expiration, so there is no longer a need to call `workflow.expire!`.\n\n\n## Contributors\n\n- [Mateusz Lenik](https://github.com/mlen)\n- [Michał Krzyżanowski](https://github.com/krzyzak)\n- [Maciej Nowak](https://github.com/keqi)\n- [Maciej Kołek](https://github.com/ferusinfo)\n\n## Contributing\n\n1. Fork it ( http://github.com/chaps-io/gush/fork )\n2. Create your feature branch (`git checkout -b my-new-feature`)\n3. Commit your changes (`git commit -am 'Add some feature'`)\n4. Push to the branch (`git push origin my-new-feature`)\n5. 
Create a new Pull Request\n","funding_links":[],"categories":["Ruby","Queues and Messaging"],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fchaps-io%2Fgush","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fchaps-io%2Fgush","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fchaps-io%2Fgush/lists"}