{"id":21035440,"url":"https://github.com/archiveteam/seesaw-kit","last_synced_at":"2025-06-17T23:06:57.797Z","repository":{"id":3557575,"uuid":"4618826","full_name":"ArchiveTeam/seesaw-kit","owner":"ArchiveTeam","description":"Making a reusable toolkit for writing seesaw scripts","archived":false,"fork":false,"pushed_at":"2023-05-22T07:04:25.000Z","size":785,"stargazers_count":70,"open_issues_count":68,"forks_count":31,"subscribers_count":9,"default_branch":"development","last_synced_at":"2025-06-12T09:37:26.558Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ArchiveTeam.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE.txt","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null}},"created_at":"2012-06-10T22:10:25.000Z","updated_at":"2025-02-03T16:04:58.000Z","dependencies_parsed_at":"2023-02-14T12:02:15.107Z","dependency_job_id":null,"html_url":"https://github.com/ArchiveTeam/seesaw-kit","commit_stats":{"total_commits":357,"total_committers":17,"mean_commits":21.0,"dds":"0.45658263305322133","last_synced_commit":"699b0d215768c2208b5b48844c9f0f75bd6a1cbc"},"previous_names":[],"tags_count":41,"template":false,"template_full_name":null,"purl":"pkg:github/ArchiveTeam/seesaw-kit","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ArchiveTeam%2Fseesaw-kit","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ArchiveTeam%2Fseesaw-kit/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ArchiveTeam%2Fseesaw-kit/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ArchiveTeam%2Fseesaw-kit/manife
sts","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/ArchiveTeam","download_url":"https://codeload.github.com/ArchiveTeam/seesaw-kit/tar.gz/refs/heads/development","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ArchiveTeam%2Fseesaw-kit/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":260453743,"owners_count":23011576,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-19T13:14:55.997Z","updated_at":"2025-06-17T23:06:57.742Z","avatar_url":"https://github.com/ArchiveTeam.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"Seesaw toolkit\n==============\n\nAn asynchronous toolkit for distributed web processing. Written in Python and named after its behavior, it supports concurrent downloads, uploads, etc.\n\nThis toolkit is well-known for [Archive Team projects](http://archiveteam.org). It also powers the [Archive Team warrior](http://archiveteam.org/index.php?title=Warrior).\n\n[![Build Status](https://secure.travis-ci.org/ArchiveTeam/seesaw-kit.png)](http://travis-ci.org/ArchiveTeam/seesaw-kit)\n[![Coverage Status](https://coveralls.io/repos/ArchiveTeam/seesaw-kit/badge.svg)](https://coveralls.io/r/ArchiveTeam/seesaw-kit)\n\nInstallation\n------------\n\nRequires Python 2 or 3.\n\nNeeds the Tornado library for event-driven I/O. 
The complete list of Python modules needed is listed in requirements.txt.\n\n\nHow to try it out\n-----------------\n\nTo run the example pipeline:\n\n    sudo pip install -r requirements.txt\n    ./run-pipeline --help\n    ./run-pipeline examples/example-pipeline.py someone\n\nPoint your browser to `http://127.0.0.1:8001/`.\n\nYou can also use `run-pipeline2` or `run-pipeline3` to select the Python version explicitly.\n\n\nOverview\n--------\n\nGeneral idea: a set of `Task`s that can be combined into a `Pipeline` that processes `Item`s:\n\n* An `Item` is a thing that needs to be downloaded (a user, for example). It has properties that are filled by the `Task`s.\n* A `Task` is a step in the download process: it takes an item, does something with it and passes it on. Example Tasks: getting an item name from the tracker, running a download script, rsyncing the result, notifying the tracker that it's done.\n* A `Pipeline` represents a sequence of `Task`s. To make a seesaw script for a new project you'd specify a new `Pipeline`.\n\nA `Task` can work on multiple `Item`s at a time (e.g., multiple Wget downloads). The concurrency can be limited by wrapping the task in a `LimitConcurrency` `Task`: this will queue the items and run them one-by-one (e.g., a single Rsync upload).\n\nThe `Pipeline` needs to be fed empty `Item` objects; by controlling the number of active `Item`s you can limit how many items are processed at once. (For example, add a new item each time an item leaves the pipeline.)\n\nWith the `ItemValue`, `ItemInterpolation` and `ConfigValue` classes it is possible to pass item-specific arguments to the `Task` objects. The value of these objects will be re-evaluated for each item. 
Examples: a path name that depends on the item name, a configurable bandwidth limit, the number of concurrent downloads.\n\nConsult [the wiki](https://github.com/ArchiveTeam/seesaw-kit/wiki) for more information.\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Farchiveteam%2Fseesaw-kit","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Farchiveteam%2Fseesaw-kit","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Farchiveteam%2Fseesaw-kit/lists"}