{"id":13879463,"url":"https://github.com/umbrellio/sequel-batches","last_synced_at":"2025-04-09T21:20:36.917Z","repository":{"id":30563019,"uuid":"125372492","full_name":"umbrellio/sequel-batches","owner":"umbrellio","description":"Sequel extension for querying large datasets in batches","archived":false,"fork":false,"pushed_at":"2024-12-10T15:27:59.000Z","size":47,"stargazers_count":18,"open_issues_count":0,"forks_count":2,"subscribers_count":5,"default_branch":"master","last_synced_at":"2025-04-02T19:07:51.842Z","etag":null,"topics":["batches","sequel"],"latest_commit_sha":null,"homepage":"","language":"Ruby","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/umbrellio.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE.txt","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2018-03-15T13:37:37.000Z","updated_at":"2024-12-10T15:26:53.000Z","dependencies_parsed_at":"2024-12-18T17:11:04.758Z","dependency_job_id":"43f2da37-3242-47a7-83c5-afa7ea175107","html_url":"https://github.com/umbrellio/sequel-batches","commit_stats":null,"previous_names":[],"tags_count":9,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/umbrellio%2Fsequel-batches","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/umbrellio%2Fsequel-batches/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/umbrellio%2Fsequel-batches/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/umbrellio%2Fsequel-batches/manifests","owner_url":"https://repos.ecosyste.ms/api/v1
/hosts/GitHub/owners/umbrellio","download_url":"https://codeload.github.com/umbrellio/sequel-batches/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248112360,"owners_count":21049645,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["batches","sequel"],"created_at":"2024-08-06T08:02:21.873Z","updated_at":"2025-04-09T21:20:36.893Z","avatar_url":"https://github.com/umbrellio.png","language":"Ruby","funding_links":[],"categories":["Ruby"],"sub_categories":[],"readme":"# Sequel::Batches    [![Gem Version](https://badge.fury.io/rb/sequel-batches.svg)](https://badge.fury.io/rb/sequel-batches) [![Build Status](https://travis-ci.org/umbrellio/sequel-batches.svg?branch=master)](https://travis-ci.org/umbrellio/sequel-batches) [![Coverage Status](https://coveralls.io/repos/github/umbrellio/sequel-batches/badge.svg?branch=master)](https://coveralls.io/github/umbrellio/sequel-batches?branch=master)\n\nThis dataset extension provides the `#in_batches` method. 
The method splits a dataset into parts and yields each part.\n\nNote: currently only PostgreSQL is supported.\n\n## Installation\n\nAdd this line to your application's Gemfile:\n\n```ruby\ngem 'sequel-batches'\n```\n\n## Usage\n\nTo use the feature, enable the extension:\n\n```ruby\nDB.extension(:batches)\n```\n\nAfter that, the `#in_batches` method becomes available on datasets:\n\n```ruby\nUser.where(role: \"admin\").in_batches(of: 4) do |ds|\n  ds.delete\nend\n```\n\nFinally, here's an example including all the available options:\n\n```ruby\noptions = {\n  of: 4,\n  pk: [:project_id, :external_user_id],\n  start: { project_id: 2, external_user_id: 3 },\n  finish: { project_id: 5, external_user_id: 70 },\n  order: :desc,\n}\n\nEvent.where(type: \"login\").in_batches(**options) do |dataset|\n  dataset.delete\nend\n```\n\n## Options\n\nYou can set the following options:\n\n### pk\n\nOverrides the primary key of your dataset. This option is required if your table doesn't have a real PK; otherwise you will get `Sequel::Extensions::Batches::MissingPKError`.\n\nNote that you have to provide columns that don't contain NULL values; otherwise this may not work as intended. You will receive `Sequel::Extensions::Batches::NullPKError` if batch processing encounters a NULL value along the way, but this is not guaranteed because, for performance reasons, not all rows are checked.\n\n### of\n\nSets the chunk size (1000 by default).\n\n### start\n\nA hash `{ [column]: \u003cstart_value\u003e }` that represents the frame start for batch processing. Note that you will get `Sequel::Extensions::Batches::InvalidPKError` if you provide a hash with the wrong keys (key order matters as well).\n\n### finish\n\nSame as `start` but represents the frame end.\n\n### order\n\nSpecifies the primary key order (can be `:asc` or `:desc`). 
Defaults to `:asc`.\n\n## Contributing\n\nBug reports and pull requests are welcome on GitHub at https://github.com/umbrellio/sequel-batches.\n\n## License\n\nThe gem is available as open source under the terms of the [MIT License](https://opensource.org/licenses/MIT).\n\n\u003ca href=\"https://github.com/umbrellio/\"\u003e\n\u003cimg style=\"float: left;\" src=\"https://umbrellio.github.io/Umbrellio/supported_by_umbrellio.svg\" alt=\"Supported by Umbrellio\" width=\"439\" height=\"72\"\u003e\n\u003c/a\u003e\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fumbrellio%2Fsequel-batches","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fumbrellio%2Fsequel-batches","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fumbrellio%2Fsequel-batches/lists"}