{"id":43231845,"url":"https://github.com/restaumatic/taskrunner","last_synced_at":"2026-02-01T10:08:55.839Z","repository":{"id":254708872,"uuid":"844562033","full_name":"restaumatic/taskrunner","owner":"restaumatic","description":null,"archived":false,"fork":false,"pushed_at":"2025-11-06T18:38:08.000Z","size":156,"stargazers_count":1,"open_issues_count":3,"forks_count":0,"subscribers_count":2,"default_branch":"main","last_synced_at":"2025-11-06T19:23:37.832Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Haskell","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"bsd-3-clause","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/restaumatic.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2024-08-19T14:07:45.000Z","updated_at":"2025-11-06T18:38:19.000Z","dependencies_parsed_at":"2024-09-05T17:05:13.242Z","dependency_job_id":"d192c53b-7f54-450b-a0fe-baad25c7fd20","html_url":"https://github.com/restaumatic/taskrunner","commit_stats":null,"previous_names":["restaumatic/taskrunner"],"tags_count":27,"template":false,"template_full_name":null,"purl":"pkg:github/restaumatic/taskrunner","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/restaumatic%2Ftaskrunner","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/restaumatic%2Ftaskrunner/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/restaumatic%2Ftaskrunner/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repo
sitories/restaumatic%2Ftaskrunner/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/restaumatic","download_url":"https://codeload.github.com/restaumatic/taskrunner/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/restaumatic%2Ftaskrunner/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":28975290,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-02-01T09:57:52.632Z","status":"ssl_error","status_checked_at":"2026-02-01T09:57:49.143Z","response_time":56,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.6:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2026-02-01T10:08:55.772Z","updated_at":"2026-02-01T10:08:55.826Z","avatar_url":"https://github.com/restaumatic.png","language":"Haskell","funding_links":[],"categories":[],"sub_categories":[],"readme":"# taskrunner\n\n## Features\n\n\n- Runs tasks (which are defined as shell scripts)\n- Ensures only one instance of a task runs at a time (globally in the whole system) - using file-based locks\n- Tasks can run in parallel\n- Output from parallel tasks is correctly annotated with original task name\n- Output lines are timestamped\n- Parallelism can be nested\n  - But task name annotations are flattened, i.e. 
are not added repeatedly at each level\n- Output from each task is also collected to a separate file, to make debugging that particular task easier\n- Tasks can have specified inputs\n  - Input can be a file, an environment variable, or result of an arbitrary command\n- If a task is invoked again for the same input (on a given machine), it is not re-run\n- Tasks can have specified outputs (files)\n- Task outputs can be cached in an object store\n- If a task is invoked with inputs for which it was already computed on CI, the result is fetched from remote cache and task is not re-run\n- If a task is called multiple times in the same execution, it is only executed once.\n- When running on CI:\n   - Task output (stdout and stderr) is uploaded to an object store after task is finished\n   - task status is reported as a GitHub _check_ for a commit\n     - _pending_ while it's running\n     - _success_ or _failed_ when finished\n   - GitHub check details link points to the uploaded task output\n\n- Unstable inputs (i.e. inputs that have changed during the job execution) are detected\n  - TODO: do something about input-output files like package-lock.json\n\n- Tasks can have command-line arguments\n  - But some options are meant for the task runner, such as `-f`\n\n- Can force reexecution of a specific task (excl. dependencies) (`-f`)\n  - Note: in previous implementation `-f` forced reexecution of all tasks. This seems less useful, will be under another option.\n\n- Task can be cancelled using SIGINT or SIGTERM, and state is maintained appropriately\n\n- Fast - if there's nothing to do, returns quickly (\u003c1s, ideally \u003c300ms)\n\n## \"Prime cache\" mode\n\nWhen migrating from another system behind a flag, it is sometimes desirable to build on the old system but still fill remote cache on the new one. For that occasion, there is a special \"prime cache mode\".\n\nIt modifies the behavior in the following way:\n- `snapshot` never downloads remote cache (incl. 
fuzzy) - to avoid overwriting stuff (which we assume is already built via another mechanism)\n- `snapshot` always skips the job\n- remote cache is uploaded, despite job being skipped\n\nTo use it, first build using another system, and then run `taskrunner` with `TASKRUNNER_PRIME_CACHE_MODE=1`.\n\n## Directory structure\n\n- `$TASKRUNNER_STATE_DIRECTORY` (default: `/tmp/taskrunner`)\n  - `locks` - global locks per job\n    - `${jobName}.lock` - job lock file, job takes the lock when running\n  - `hash` - hashes of inputs of already-done jobs\n    - `${jobName}.hash` - first line is hash, rest is hash input (for debugging)\n  - `builds` - build state directory for each build. Each toplevel invocation creates a subdirectory here.\n    - `${buildId}` - state dir of a specific build. `buildId` is derived from the invocation time.\n      - `logs` - logs produced by jobs in that build.\n        - `${jobName}.log` - log, without ANSI sequences stripped\n      - `results` - per-build cache of job results (we don't re-run jobs twice inside a build, even without `snapshot`)\n        - `${jobName}` - file with status code of the job\n\n## Configuration variables\n\n- `TASKRUNNER_DEBUG` - whether to output debug messages to toplevel output. Note that debug messages are always written to per-task logs, regardless of this setting.\n- `TASKRUNNER_LOG_INFO` - whether to output \"info\" messages to toplevel output. They are minimal messages, produced only when there's actually something to be done (including fetching from cache).\n- more...\n\n## Possible features\n\n- Support stdin? 
For now redirected from `/dev/null`\n- Quiet output like `gradle`, only report what is running and progress, not full output, and no output if nothing to do\n- Marker files - additional hash file in `.stack-work`, `node_modules` etc., so that if that dir is cleared, we redo the action\n  - Or: remember hash of some of the output files and check they're still there\n  - Only some because there can be benign changes\n- Can dump enough info to reproduce failures\n  - For example: hashes of inputs, caches etc.\n- Generate a trace (otlp for analysis, or render to a gantt chart)\n\n## Things we should do better than previous version\n\n- less confusing output for cache miss (no \"error\")\n- ??? Something, can't recall now\n- `--cmd` replaced with `--raw`, since we can't really execute in the context of the original script\n\n## Things to handle\n\n- Task leaks stdin/out/err handle - have a timeout on draining output\n- Parallel task failed and we're killed - report status correctly\n- `snapshot` - how to communicate with controller process?\n  - pipe and pass fd to child process?\n  - named pipe and pass name to child process via env?\n- Nested tasks - each should write to original stdout\n- Unmerged files when hashing\n- bad usage of `snapshot` - e.g. 
called twice\n- why `ls-tree -r` is needed - git option of quoting\n- save cache tar error\n\n## Misc TODO\n\n- Better output of error messages (to normal streams)\n- String/Text unification\n- Debugging - show hash input\n- More specialized tests for input handling\n- In parent task's log, add reference to nested log file\n- Debugging aid: when replacing saved hash, show diff between old and new hash input (or save old hash input to compare)\n- Bug: pendolino sometimes rebuilds randomly with scripts/UPDATE\n  - Probable cause: helper generation races with its input hashing\n    - nope, it generates in another directory\n- More tests for interaction between remote cache and local hash, especially:\n  - restoring remote cache should also store local hash, but not store remote cache again\n- test for root dir != cwd\n- test for commit status\n- Somehow test content-type in log upload?\n\n## Output principles (output generated by runner itself)\n\n- \"quiet\" operation - no output except when an error happens\n- standard operation (\"info\" mode):\n  - when job does nothing (already done locally), no output\n  - when resuming from cache, output one line for start (so that we know something's happening), one for done\n  - when running, output one line for start, one for done - only for snapshottable jobs\n- debug: log everything (maybe later categorize)\n\n## Performance goals\n\n- Previous impl no-op tests/scripts/UPDATE: ~1.6s\n- Current impl no-op tests/scripts/UPDATE: ~2.3s\n\n##  Snapshot Command Flags\n\nThe `snapshot` command supports the following flags:\n\n- `--outputs`: Specifies files to be cached in remote cache.\n- `--cache-success`: Use remote cache even when no outputs are specified. The task is not rerun if it succeeded previously with the same inputs. Useful e.g. 
for test suites.\n- `--raw`: Specifies raw input strings that are used to compute the task's hash.\n- `--fuzzy-cache`: Enables the use of a fuzzy cache, which attempts to restore from a cache of a similar task if the exact cache is not available.\n- `--cache-root`: Specifies the root directory for caching. Use when caching things outside of the repository, e.g. `~/.stack`.\n- `--cache-version`: Specifies a version string for the cache. `--fuzzy-cache` will not download cache from another version, allowing clean breaks when making big changes, e.g. upgrading a compiler.\n- `--commit-status`: Enables reporting of the task's status to a commit status system, such as GitHub checks.\n- `--long-running`: Indicates that the task is expected to run for a long time (e.g. a server). Currently doesn't have any effect, though. TODO: can we remove it?\n\n\n## Testing\n\nThis project uses [tasty-golden](https://github.com/UnkindPartition/tasty-golden) for snapshot-based testing.\n\n### Running Tests\n\n```bash\n# Run all tests (auto-detects S3 credentials)\nstack test\n\n# Run tests, skipping slow ones for faster development\nexport SKIP_SLOW_TESTS=1\nstack test\n\n# Run specific test by pattern\nstack test --test-arguments \"--pattern hello\"\n\n# List all available tests\nstack test --test-arguments \"--list-tests\"\n```\n\n### Test Structure\n\nTests are located in the `test/t/` directory with two files per test:\n- `.txt` file - Shell script to execute\n- `.out` file - Expected output (golden file)\n\n#### Test Directives\n\nSpecial comments in `.txt` files control test behavior:\n- `# check output` - Check stdout/stderr (default)\n- `# check github` - Check GitHub API calls\n- `# no toplevel` - Don't wrap in taskrunner\n- `# s3` - Requires S3 credentials (auto-skipped if missing)\n- `# github keys` - Provide GitHub credentials\n- `# quiet` - Run in quiet mode\n\n### S3 Test Auto-Detection\n\n15 tests require S3 credentials and are automatically skipped if credentials are 
missing.\n\nTo run S3 tests, set these environment variables:\n```bash\nexport TASKRUNNER_TEST_S3_ENDPOINT=your-s3-endpoint\nexport TASKRUNNER_TEST_S3_ACCESS_KEY=your-access-key\nexport TASKRUNNER_TEST_S3_SECRET_KEY=your-secret-key\nstack test\n```\n\n### Accepting Golden Test Changes\n\nWhen golden tests fail due to expected output changes:\n\n```bash\nstack test --test-arguments --accept\n```\n\nThis updates the `.out` files with new expected output. Review changes carefully before committing.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frestaumatic%2Ftaskrunner","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Frestaumatic%2Ftaskrunner","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frestaumatic%2Ftaskrunner/lists"}