{"id":16533957,"url":"https://github.com/martindisch/shepherd","last_synced_at":"2025-07-25T14:08:54.965Z","repository":{"id":44030286,"uuid":"222896074","full_name":"martindisch/shepherd","owner":"martindisch","description":"A distributed video encoder that splits files into chunks for multiple machines","archived":false,"fork":false,"pushed_at":"2022-09-25T11:20:20.000Z","size":145,"stargazers_count":28,"open_issues_count":1,"forks_count":5,"subscribers_count":2,"default_branch":"master","last_synced_at":"2025-07-23T09:35:24.857Z","etag":null,"topics":["chunks","distributed","ffmpeg","parallel","split","video-encoder"],"latest_commit_sha":null,"homepage":null,"language":"Rust","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/martindisch.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE-3RD-PARTY","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2019-11-20T09:09:49.000Z","updated_at":"2025-07-23T01:18:51.000Z","dependencies_parsed_at":"2022-08-29T14:52:23.401Z","dependency_job_id":null,"html_url":"https://github.com/martindisch/shepherd","commit_stats":null,"previous_names":[],"tags_count":6,"template":false,"template_full_name":null,"purl":"pkg:github/martindisch/shepherd","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/martindisch%2Fshepherd","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/martindisch%2Fshepherd/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/martindisch%2Fshepherd/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/martindisch%2Fshepherd/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/martindisch","download_url":"https://codeload.github.com/martindisch/shepherd/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/martindisch%2Fshepherd/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":267017144,"owners_count":24021968,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-07-25T02:00:09.625Z","response_time":70,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["chunks","distributed","ffmpeg","parallel","split","video-encoder"],"created_at":"2024-10-11T18:16:17.008Z","updated_at":"2025-07-25T14:08:54.948Z","avatar_url":"https://github.com/martindisch.png","language":"Rust","readme":"# shepherd\n\n[![Latest version](https://img.shields.io/crates/v/shepherd)](https://crates.io/crates/shepherd)\n[![Documentation](https://docs.rs/shepherd/badge.svg)](https://docs.rs/shepherd)\n[![License](https://img.shields.io/crates/l/shepherd)](https://github.com/martindisch/shepherd#license)\n\n\u003c!-- cargo-sync-readme start --\u003e\n\nA distributed video encoder that splits files into chunks to encode them on\nmultiple machines in parallel.\n\n## Installation\n\nUsing Cargo, you can do\n```console\n$ cargo install shepherd\n```\nor just clone the repository and compile the binary with\n```console\n$ git clone https://github.com/martindisch/shepherd\n$ cd shepherd\n$ cargo build --release\n```\nThere's also a\n[direct download](https://github.com/martindisch/shepherd/releases/latest/download/shepherd)\nfor the latest x86-64 ELF binary.\n\n## Usage\n\nThe prerequisites are one or more (you'll want more) computers—which we'll\nrefer to as hosts—with `ffmpeg` installed and configured such that you can\nSSH into them directly. This means you'll have to `ssh-copy-id` your public\nkey to them. I only tested it on Linux, but if you manage to set up\n`ffmpeg` and SSH, it might work on macOS or Windows directly or with little\nmodification.\n\nThe usage is pretty straightforward:\n```text\nUSAGE:\n    shepherd [FLAGS] [OPTIONS] \u003cIN\u003e \u003cOUT\u003e --clients \u003chostnames\u003e [FFMPEG OPTIONS]...\n\nFLAGS:\n    -h, --help       Prints help information\n    -k, --keep       Don't clean up temporary files\n    -V, --version    Prints version information\n\nOPTIONS:\n    -c, --clients \u003chostnames\u003e    Comma-separated list of encoding hosts\n    -l, --length \u003cseconds\u003e       The length of video chunks in seconds\n    -t, --tmp \u003cpath\u003e             The path to the local temporary directory\n\nARGS:\n    \u003cIN\u003e                   The original video file\n    \u003cOUT\u003e                  The output video file\n    \u003cFFMPEG OPTIONS\u003e...    Options/flags for ffmpeg encoding of chunks. The\n                           chunks are video only, so don't pass in anything\n                           concerning audio. Input/output file names are added\n                           by the application, so there is no need for that\n                           either. This is the last positional argument and\n                           needs to be preceeded by double hypens (--) as in:\n                           shepherd -c c1,c2 in.mp4 out.mp4 -- -c:v libx264\n                           -crf 26 -preset veryslow -profile:v high -level 4.2\n                           -pix_fmt yuv420p\n                           This is also the default that is used if no options\n                           are provided.\n```\n\nSo if we have three machines c1, c2 and c3, we could do\n```console\n$ shepherd -c c1,c2,c3 -l 30 source_file.mp4 output_file.mp4\n```\nto have it split the video in roughly 30 second chunks and encode them in\nparallel. By default it encodes in H.264 with a CRF value of 26 and the\n`veryslow` preset. If you want to supply your own `ffmpeg` options for more\ncontrol over the codec, you can do so by adding them to the end of the\ninvocation:\n```console\n$ shepherd -c c1,c2 input.mkv output.mp4 -- -c:v libx264 -crf 40\n```\n\n## How it works\n\n1. Creates a temporary directory in your home directory.\n2. Extracts the audio and encodes it. This is not parallelized, but the\n   time this takes is negligible compared to the video anyway.\n3. Splits the video into chunks. This can take relatively long, since\n   you're basically writing the full file to disk again. It would be nice\n   if we could read chunks of the file and directly transfer them to the\n   hosts, but that might be tricky with `ffmpeg`.\n4. Spawns a manager and an encoder thread for every host. The manager\n   creates a temporary directory in the home directory of the remote and\n   makes sure that the encoder always has something to encode. It will\n   transfer a chunk, give it to the encoder to work on and meanwhile\n   transfer another chunk, so the encoder can start directly with that once\n   it's done, without wasting any time. But it will keep at most one chunk\n   in reserve, to prevent the case where a slow machine takes too many\n   chunks and is the only one still encoding while the faster ones are\n   already done.\n5. When an encoder is done and there are no more chunks to work on, it will\n   quit and the manager transfers the encoded chunks back before\n   terminating itself.\n6. Once all encoded chunks have arrived, they're concatenated and the audio\n   stream added.\n7. All remote and the local temporary directory are removed.\n\nThanks to the work stealing method of distribution, having some hosts that\nare significantly slower than others does not delay the overall operation.\nIn the worst case, the slowest machine is the last to start encoding a\nchunk and remains the only working encoder for the duration it takes to\nencode this one chunk. This window can easily be reduced by using smaller\nchunks.\n\n## Performance\n\nAs with all things parallel, Amdahl's law rears its ugly head and you don't\njust get twice the speed with twice the processing power. With this\napproach, you pay for having to split the video into chunks before you\nbegin, transferring them to the encoders and the results back, and\nreassembling them. Although I should clarify that transferring the chunks\nto the encoders only causes a noticeable delay until every encoder has its\nfirst chunk, the subsequent ones can be sent while the encoders are working\nso they don't waste time waiting for that. And returning and assembling the\nencoded chunks doesn't carry too big of a penalty, since we're dealing with\nmuch more compressed data then.\n\nTo get a better understanding of the tradeoffs, I did some testing with a\ncouple of computers I had access to. They were my main, pretty capable\ndesktop, two older ones and a laptop. To figure out how capable each of\nthem is so we can compare the actual to the expected speedup, I let each of\nthem encode a relatively short clip of slightly less than 4 minutes taken\nfrom the real video I want to encode, using the same settings I'd use for\nthe real job. And if you're wondering why encoding takes so long, it's\nbecause I'm using the `veryslow` preset for maximum efficiency, even though\nit's definitely not worth the huge increase in encoding time. But it's a\nnice simulation for how it would look if we were using an even more\ndemanding codec like AV1.\n\n| machine   | duration (s) | power    |\n| --------- | ------------ | -------- |\n| desktop   | 1373         | 1.000    |\n| old1      | 2571         | 0.53     |\n| old2      | 3292         | 0.42     |\n| laptop    | 5572         | 0.25     |\n| **total** | -            | **2.20** |\n\nBy giving my desktop the \"power\" level 1, we can determine how powerful the\nothers are at this encoding task, based on how long it takes them in\ncomparison. By adding the three other, less capable machines to the mix, we\nslightly more than double the theoretical encoding capability of our\nsystem.\n\nI determined these power levels on a short clip, because encoding the full\nvideo would have taken very long on the less capable ones, especially the\nlaptop. But I still needed to encode the full thing on at least one of them\nto make the comparison to the distributed encoding. I did that on my\ndesktop since it's the fastest one, and to additionally verify that the\npower levels hold up for the full video, I bit the bullet and did the same\non the second most powerful machine.\n\n| machine | duration (s)  | power |\n| ------- | ------------- | ----- |\n| desktop | 9356          | 1.00  |\n| old1    | 17690         | 0.53  |\n\nNow we have the baseline we want to beat with parallel encoding, as well as\nconfirmation that the power levels are valid for the full video. Let's see\nhow much of the theoretical, but unreachable 2.2x speedup we can get.\n\nEncoding the video in parallel took 5283 seconds, so 56.5% of the time\nusing my fastest computer, or a 1.77x speedup. We committed about twice the\ncomputing power and we're not too far off that two times speedup. It's\nmaking use of the additionally available resources with an 80% efficiency\nin this case. I also tried to encode the short clip in parallel, which was\nvery fast, but had a somewhat disappointing speedup of only 1.32x. I\nsuspect that we get better results with longer videos, since encoding a\nchunk always takes longer than creating and transferring it (otherwise\ndistributing wouldn't make sense at all). The longer the video then, the\nlarger the ratio of encoding (which we can parallelize) in the total amount\nof time the process takes, and the more effective doing so becomes.\n\nI've also looked at how the work is distributed over the nodes, depending\non their processing power. At the end of a parallel encode, it's possible\nto determine how many chunks have been encoded by any given host.\n\n| host          | chunks | power |\n| ------------- | ------ | ----- |\n| desktop       | 73     | 1.00  |\n| old1          | 39     | 0.53  |\n| old2          | 31     | 0.42  |\n| laptop        | 19     | 0.26  |\n\nInferring the processing power from the number of chunks leads to almost\nexactly the same results as my initial determination, confirming it and\nproving that work is distributed efficiently.\n\nTo further see how the system scales, I've added two more machines,\nbringing the total processing power up to 3.29.\n\n| machine   | duration (s) | power    |\n| --------- | ------------ | -------- |\n| desktop   | 1373         | 1.00     |\n| c1        | 2129         | 0.64     |\n| old1      | 2571         | 0.53     |\n| c2        | 3022         | 0.45     |\n| old2      | 3292         | 0.42     |\n| laptop    | 5572         | 0.25     |\n| **total** | -            | **3.29** |\n\nEncoding the video on these 6 machines in parallel took 3865 seconds, so\n41.3% of the time using my fastest computer, or a 2.42x speedup. It's\nmaking use of the additionally available resources with a 74% efficiency\nhere. As expected, while we can accelerate by adding more resources, we're\nlooking at diminishing returns. Although the factor by which the efficiency\ndecreases is not as bad as it could be.\n\n## Limitations\n\nWhile you can use your own `ffmpeg` options to control how the video is\nencoded, there is currently no such option for the audio, which is 192 kb/s\nAAC by default.\n\n\u003c!-- cargo-sync-readme end --\u003e\n\n## License\nLicensed under either of\n\n * [Apache License, Version 2.0](LICENSE-APACHE)\n * [MIT license](LICENSE-MIT)\n\nat your option.\n\n### Contribution\n\nUnless you explicitly state otherwise, any contribution intentionally submitted\nfor inclusion in the work by you, as defined in the Apache-2.0 license, shall\nbe dual licensed as above, without any additional terms or conditions.\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmartindisch%2Fshepherd","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmartindisch%2Fshepherd","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmartindisch%2Fshepherd/lists"}