{"id":20838882,"url":"https://github.com/andrewradev/progressor","last_synced_at":"2025-05-08T21:29:33.627Z","repository":{"id":56888894,"uuid":"171631472","full_name":"AndrewRadev/progressor","owner":"AndrewRadev","description":"Measure iterations in a long-running task","archived":false,"fork":false,"pushed_at":"2024-08-04T09:54:08.000Z","size":30,"stargazers_count":6,"open_issues_count":0,"forks_count":0,"subscribers_count":2,"default_branch":"main","last_synced_at":"2025-05-08T21:29:33.226Z","etag":null,"topics":["benchmarking","estimation","measurement","progress","ruby"],"latest_commit_sha":null,"homepage":"https://rubygems.org/gems/progressor","language":"Ruby","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/AndrewRadev.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2019-02-20T08:25:29.000Z","updated_at":"2025-04-23T12:52:45.000Z","dependencies_parsed_at":"2024-11-18T01:47:34.147Z","dependency_job_id":null,"html_url":"https://github.com/AndrewRadev/progressor","commit_stats":{"total_commits":29,"total_committers":1,"mean_commits":29.0,"dds":0.0,"last_synced_commit":"a7d30c2cae3a2501caa88ef253b41cce588db6a2"},"previous_names":[],"tags_count":5,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/AndrewRadev%2Fprogressor","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/AndrewRadev%2Fprogressor/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/AndrewRadev%2Fprogressor/releases","manifests_url
":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/AndrewRadev%2Fprogressor/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/AndrewRadev","download_url":"https://codeload.github.com/AndrewRadev/progressor/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":253153044,"owners_count":21862298,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["benchmarking","estimation","measurement","progress","ruby"],"created_at":"2024-11-18T01:11:51.369Z","updated_at":"2025-05-08T21:29:33.608Z","avatar_url":"https://github.com/AndrewRadev.png","language":"Ruby","funding_links":[],"categories":[],"sub_categories":[],"readme":"[![Build Status](https://travis-ci.org/AndrewRadev/progressor.svg?branch=master)](https://travis-ci.org/AndrewRadev/progressor)\n[![Gem Version](https://badge.fury.io/rb/progressor.svg)](https://badge.fury.io/rb/progressor)\n\nFull documentation for the latest released version can be found at: https://www.rubydoc.info/gems/progressor\n\n## Basic example\n\nHere's an example long-running task:\n\n``` ruby\nProduct.find_each do |product|\n  next if product.not_something_we_want_to_process?\n  product.calculate_interesting_stats\nend\n```\n\nIn order to understand how it's progressing, we might add some print statements:\n\n``` ruby\nProduct.find_each do |product|\n  if product.not_something_we_want_to_process?\n    puts \"Skipping product: #{product.id}\"\n    next\n  end\n\n  puts \"Working on product: #{product.id}\"\n  product.calculate_interesting_stats\nend\n```\n\nThis 
gives us some indication of progress, but no idea how much time is left. We could take a count and maintain a manual index, and then eyeball it based on how fast the numbers are adding up. Progressor automates that process:\n\n``` ruby\nprogressor = Progressor.new(total_count: Product.count)\n\nProduct.find_each do |product|\n  if product.not_something_we_want_to_process?\n    progressor.skip(1)\n    next\n  end\n\n  progressor.run do |progress|\n    puts \"[#{progress}] Product #{product.id}\"\n    product.calculate_interesting_stats\n  end\nend\n```\n\nEach invocation of `run` measures how long its block took and records it. The yielded `progress` parameter is an object that can be `to_s`-ed to provide progress information.\n\nThe output might look like this:\n\n```\n...\n[0038/1000, (004%), t/i: 0.5s, ETA: 8m:00s] Product 38\n[0039/1000, (004%), t/i: 0.5s, ETA: 7m:58s] Product 39\n[0040/1000, (004%), t/i: 0.5s, ETA: 7m:57s] Product 40\n...\n```\n\nYou can check the documentation for the [Progressor](https://www.rubydoc.info/gems/progressor/Progressor) class for details on the methods you can call to get the individual pieces of data shown in the report.\n\n## Limited and unlimited sequences\n\nInitializing a `Progressor` with a provided `total_count:` parameter gives you a limited sequence, which can give you not only a progress report, but an estimation of when it'll be done:\n\n```\n[\u003ccurrent loop\u003e/\u003ctotal count\u003e, (\u003cprogress\u003e%), t/i: \u003ctime per iteration\u003e, ETA: \u003ctime until it's done\u003e]\n```\n\nThe calculation is done by maintaining a list of measurements with a limited size, and a list of averages of those measurements. 
The average of averages is the \"time per iteration\" and it's multiplied by the remaining count to produce the estimation.\n\nI can't really say how reliable this is, but it seems to provide smoothly changing estimations that seem more or less correct to me, for similarly-sized chunks of work per iteration.\n\n**Not** providing a `total_count:` parameter leads to less available information:\n\n``` ruby\nprogressor = Progressor.new\n\n(1..100).each do |i|\n  progressor.run do |progress|\n    sleep rand\n    puts progress\n  end\nend\n```\n\nA sample of output might look like this:\n\n```\n...\n11, t: 5.32s, t/i: 442.39ms\n12, t: 5.58s, t/i: 446.11ms\n...\n```\n\nThe format is:\n\n```\n\u003ccurrent\u003e, t: \u003ctime from start\u003e, t/i: \u003ctime per iteration\u003e\n```\n\n## Simpler iteration\n\nFor `ActiveRecord` and other iterable collections, it's possible to skip some boilerplate. For example, you might start from this:\n\n``` ruby\nRecord.not_processed.find_each do |record|\n  record.process\nend\n```\n\nIn order to add measurements, you could instantiate a `Progressor` etc like in the above example, or you could do this:\n\n``` ruby\nProgressor::Iteration.find_each(Record.not_processed) do |record|\n  record.process\nend\n```\n\nAnd it'll add some default print statements. Check the API documentation or the code for details.\n\n## Configuration\n\nApart from `total_count`, which is optional and affects the kind of sequence that will be stored, you can provide `min_samples` and `max_samples`. You can also provide a custom formatter:\n\n``` ruby\nprogressor = Progressor.new({\n  total_count: 1000,\n  min_samples: 5,\n  max_samples: 10,\n  formatter: -\u003e (p) { p.eta }\n})\n```\n\nThe option `min_samples` determines how many loops the tool will wait until trying to produce an estimation. A higher number means no information in the beginning, but no wild fluctuations, either. 
It needs to be at least 1 and the default is 1.\n\nThe option `max_samples` sets how many measurements are retained. Those measurements are averaged, and then those averages are averaged in turn to get a time-per-iteration estimate. A smaller number gives more weight to recent events, while a larger one averages over a larger number of samples. The default is 100.\n\nThe `formatter` is a callback that receives a progress object as an argument and returns the string to output on every loop. Check `LimitedSequence` and `UnlimitedSequence` for the available methods and accessors you can use.\n\n## Related work\n\nA very similar tool is the gem [ke](https://github.com/mkdynamic/ke). It provides its estimation by maintaining the median quartile range of the stored measurements, removing outliers. It also automates the output of the progress report, only printing it every N loops. Depending on your needs and preferences, it might be better for your use case.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fandrewradev%2Fprogressor","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fandrewradev%2Fprogressor","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fandrewradev%2Fprogressor/lists"}