https://github.com/jkool702/forkrun

runs multiple inputs through a script/function in parallel using bash coprocs
# FORKRUN

`forkrun` is an *extremely* fast pure-bash function that leverages bash coprocs to efficiently run several commands simultaneously in parallel (i.e., it's a "loop parallelizer").

ANNOUNCEMENT: `forkrun` v2.0 is actively in development and is close to release! In addition to a handful of new features (like a flag that limits the total number of lines run, and support for SI/JEDEC prefixes on all flags that take numeric input), `forkrun` v2.0 includes widespread "under the hood" changes to much of its core code. Dynamic coproc spawning has been re-written (a few times). IPC (how the inputs passed on stdin get distributed to the worker coprocs) has been reworked, making it faster and more efficient and enabling workers to efficiently wait for data arriving slowly on stdin (no busy polling, ever) without needing `inotifywait`! The result is that `forkrun` v2.0 is faster and more efficient than ever before. On problems like "checksumming a bunch of tiny files on a ramdisk", where the efficiency of the parallelization framework matters, [speedtests](https://github.com/jkool702/forkrun/blob/forkrun_testing_nSpawn_5/hyperfine_benchmark/speedtest_vs_xargs_simple.bash) show `forkrun` v2.0 is ~18% faster than `xargs -P` while only consuming ~9% more CPU time! If you want to preview forkrun v2.0, the current "active development" branch is `forkrun_testing_nSpawn_5`.

`forkrun` is used in much the same way that `xargs` or `parallel` are, but is faster (see the `hyperfine_benchmark` subdirectory for benchmarks) while still being full-featured, and it only requires a fairly recent `bash` version (4.0+) to run¹. `forkrun`:
* offers more features than `xargs` and is mildly faster than its fastest invocation (`forkrun` without any flags is functionally equivalent to `xargs -P $(nproc) -d $'\n'`),
* is considerably faster than `parallel` (over an order of magnitude faster in some cases) while still supporting many of the particularly useful "core" `parallel` features,
* can easily and efficiently be adapted to parallelize complex tasks without penalty by using shell functions (unlike `xargs` and `parallel`, `forkrun` doesn't need to call a new instance of `/bin/bash -c` on every loop iteration when a shell function is run).

¹: bash 5.1+ is preferred and much better tested. A few basic filesystem operations (`rm`, `mkdir`) must also be available. `fallocate` and `inotifywait` are not required but, if present, will be used to lower runtime resource usage. `bash-completion` is required to enable automatic completion (on TAB press) when typing the forkrun cmdline.
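As a point of reference, the `xargs` invocation that flag-less `forkrun` is described as functionally equivalent to can be sketched with `xargs` itself (the inputs here are illustrative):

```shell
# Run a list of inputs through a command in parallel with GNU xargs,
# using the defaults flag-less forkrun is compared to above:
# one worker per CPU core (-P $(nproc)) and newline-delimited input.
inArgs=(file1 file2 file3)
out="$(printf '%s\n' "${inArgs[@]}" | xargs -P "$(nproc)" -d $'\n' echo got)"
echo "$out"
```

With `forkrun` sourced, the same pipeline would simply end in `| forkrun echo got`.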

**CURRENT VERSION**: forkrun v1.4.0

**PREVIOUS VERSION**: forkrun v1.3.0

# CHANGELOG

**forkrun v1.4**: 3 new features have been added:
1. `forkrun` can now dynamically determine how many coprocs to spawn based on runtime conditions (specifically: CPU usage and whether or not coprocs are waiting in a read queue to read data from stdin). To use this functionality, pass the `-j` flag a negative number (just passing `-j -` works too). See the help (run `forkrun --help` or `forkrun --help=all`) for additional info.
2. `forkrun` can now read its input from a file descriptor other than stdin using the `-u` flag (which is standard in bash's `read` and `mapfile` commands). MINOR API CHANGE: the existing `-u` flag, which prevents escaping the commands given on `forkrun`'s commandline, has been changed to use `-U` or `--UNESCAPE` (i.e., it is now uppercase instead of lowercase).
3. on x86_64 platforms, `forkrun` will use a custom bash loadable builtin to call `lseek`, greatly improving the efficiency of reading data from stdin as `forkrun` runs. `forkrun`'s "no-load" speed (i.e., stdin is all newlines and they are being passed to a `:` call) now exceeds 4 million lines per second on my system, highlighting how efficient `forkrun`'s parallelization framework is. Using `lseek` also removes `dd` from the "required dependency" list, even when using NULL-delimited input.
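The `-u` semantics in item 2 mirror bash's own `read -u` / `mapfile -u`: take input from a numbered file descriptor instead of stdin. A minimal sketch of that underlying bash mechanism:

```shell
# Open fd 3 on a stream of lines, then read from it by number rather
# than from stdin (the same convention forkrun's -u flag borrows).
exec 3< <(printf '%s\n' alpha beta)
read -r -u 3 first
read -r -u 3 second
exec 3<&-   # close fd 3 when done
echo "$first $second"
```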

**forkrun v1.3**: forkrun {-z|-0|--null} has been fixed and now works 100% reliably with NULL-delimited input! However, [only] when using NULL-delimited input `dd` is now a required dependency.
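NULL-delimited input follows the same convention as `find -print0` piped into `xargs -0`. A small illustration of why it matters (items containing whitespace survive intact):

```shell
# Two items containing spaces, separated by NUL bytes instead of newlines;
# xargs -0 (like forkrun's -z/-0 mode) splits only on the NULs.
out="$(printf '%s\0' 'a b' 'c d' | xargs -0 -n1 echo item:)"
echo "$out"
```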

**forkrun v1.2**: forkrun now supports bash automatic completion. Pressing TAB while typing out the forkrun commandline will auto-complete (when possible) forkrun options, command names, and command options. If no unique auto-complete result is available, pressing TAB a second time will bring up a list of possibilities. The code required for this functionality is loaded (via the `_forkrun_complete` function) and registered (via the `complete` builtin) when `forkrun.bash` is sourced.

NOTE: forkrun uses mildly "fuzzy" option matching, so to make the automatic completion feature actually useful only the single most reasonable completion is shown for any given forkrun option. e.g., the completion for typing `--pip` will only show `--pipe`, but if you continue typing `--pipe-r` the completion will change to `--pipe-read`, since `--pipe` and `--pipe-read` are both aliases for the same option (`-p`).
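The load-and-register pattern described above can be sketched generically; the names `_mytool_complete` and `mytool` below are illustrative stand-ins, not forkrun's actual completion code:

```shell
# A completion function fills COMPREPLY with candidates matching the
# word currently being typed; `complete -F` registers it for a command.
_mytool_complete() {
  local cur="${COMP_WORDS[COMP_CWORD]}"
  COMPREPLY=( $(compgen -W '--pipe --pipe-read --help' -- "$cur") )
}
complete -F _mytool_complete mytool
```

With this registered, typing `mytool --pipe-r` and pressing TAB would complete to `--pipe-read`.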

**forkrun v1.1**: 2 new flags (`-b <#>` and `-B <#>`) have been added that cause forkrun to split up stdin into blocks of `<#>` bytes. `-B` will wait and accumulate blocks of exactly `<#>` bytes; `-b` will not. The `-I` flag has been expanded so that if `-k` (or `-n`) is also passed then a second substitution is made, swapping `{IND}` for the batch ordering index (the same thing that `-n` outputs at the start of each block) (`{ID}` will still be swapped for coproc ID). A handful of optimizations and bug-fixes have also been implemented (notably with how the coproc source code is dynamically generated). Lastly, the forkrun repo had some changes to how it is organized.

NOTE: for the `-b` and `-B` flags to have the sort of efficiency and speed that forkrun typically has, you need to have GNU `dd` available. If you don't, `forkrun` will try to use `head -c` (which is *much* slower), and if that's unavailable it'll use the `read` builtin with either `-n` or `-N` (which is *much* slower still... you *really* want GNU `dd` here). Also, when using these flags the `-S` flag is automatically selected, meaning data is passed to the function being parallelized via its stdin. This avoids mangling binary data passed on stdin. It can be overruled by passing the `+S` flag, but all NULLs in stdin will then be dropped.
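A rough sketch of the byte-block splitting that `-b`/`-B` perform: repeated `dd bs=<#> count=1` calls each consume the next block from a shared stdin (GNU `dd` assumed; the 10-byte input and 4-byte block size are illustrative):

```shell
# Read a 10-byte stream in 4-byte blocks; each dd call advances the
# shared stdin offset, so successive calls get successive blocks.
blocks=()
while b="$(dd bs=4 count=1 2>/dev/null)"; [[ -n $b ]]; do
  blocks+=("$b")
done < <(printf 'abcdefghij')
printf 'block: %s\n' "${blocks[@]}"
```

Note that on a pipe `dd` can return short reads, so accumulating blocks of *exactly* `<#>` bytes (what `-B` does) requires additional logic on top of this.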

***

# USAGE

`forkrun` is invoked in much the same way as `xargs`: on the command line, pass forkrun options, then the function/script/binary being parallelized, then any initial constant arguments (in that order). The arguments to parallelize over are passed to forkrun on stdin. A typical `forkrun` invocation looks something like this:

```bash
printf '%s\n' "${inArgs[@]}" | forkrun [flags] [--] parFunc ["${args0[@]}"]
forkrun [flags] [--] parFunc ["${args0[@]}"] <file
```

Alternatively, to source `forkrun` directly from GitHub without first saving it locally:

```bash
source <( ( decfun='forkrun '; type -p cat &>/dev/null || decfun+='cat '; type -p mktemp &>/dev/null || decfun+='mktemp '; shopt -s extglob; curl="$(type -p curl)"; bash="$(type -p bash)"; PATH=''; { $curl https://raw.githubusercontent.com/jkool702/forkrun/main/forkrun.bash; echo 'declare -f '"$decfun"; } | $bash -r ) )
```

This monster of a one-liner sources the `forkrun` code in an extremely restricted shell that can't do much else, then `declare -f`'s the required forkrun functions, and finally the main shell sources those definitions. This drops any non-forkrun-related code and ensures that nothing is actually run until the forkrun function is called, giving you a chance to review the code via `declare -f` (should you wish). This offers some protection against a bad actor maliciously changing the code (without your or my knowledge) through some attack.
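On a much smaller scale, the same restricted-shell-plus-`declare -f` idea looks like this (`greet` is an illustrative stand-in, not forkrun code):

```shell
# Define a function inside a restricted shell (bash -r), emit only its
# definition with `declare -f`, then source that reviewed definition
# in the main shell.
def="$(bash -r -c 'greet() { printf "hello %s\n" "$1"; }; declare -f greet')"
# ...here you could inspect "$def" before trusting it...
eval "$def"
greet world
```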

**PARALLELIZING FUNCTIONS**: one extremely powerful feature of `forkrun` is that it can parallelize arbitrarily complex tasks very efficiently by wrapping them in a function. This is done by doing something like the following:

```bash
myfun() {
mapfile -t A < <(some_task "$@")
some_other_task "${A[@]}"
# ...
}
printf '%s\n' "${inArgs[@]}" | forkrun myfun
```

***

# DEPENDENCIES

`forkrun` has very few dependencies, and most of them are optional:

`fallocate`: If available, this is used to deallocate already-processed data from the beginning of the tmpfile holding stdin. This enables `forkrun` to be used in long-running processes that consistently output data for days/weeks/months/... Without `fallocate`, this tmpfile will continually grow and will not be removed until forkrun exits

`dd` (GNU) -OR- `head` (GNU|busybox): When splitting up stdin by byte count (due to either the `-b` or `-B` flag being used), if either of these is available it will be used to read stdin instead of the builtin `read -N`. Note that one of these is required to split + process binary data without mangling it - otherwise bash will drop any NULL's. If both are available `dd` is preferred.

`bash-completion`: Required for bash automatic completion (on TAB press) to work as you are typing the `forkrun` commandline. This is strictly a "quality of life" feature to make typing the forkrun commandline easier -- it has zero effect on forkrun's execution after it has been called.

***

# WHY USE FORKRUN

There are 2 other common programs for parallelizing loops in the (bash) shell: `xargs` and `parallel`. I believe `forkrun` offers more than either of these programs can offer:

***

**COMPARED TO PARALLEL**

* `forkrun` is considerably faster. In terms of "wall clock" time in my tests, which computed 11 different checksums of ~500,000 small files totaling ~19 GB saved on a ramdisk (see the `hyperfine_benchmark` subdirectory for details):
    * forkrun was on average 8x faster than `parallel -m` for very large file counts. For all batch sizes tested, forkrun was at least twice as fast as parallel.
    * For the particularly lightweight checksums (`sum -s`, `cksum`), `forkrun` was ~18x faster than `parallel -m`.
    * If comparing in "1 line at a time" mode, forkrun is more like 20-30x faster.
* In terms of CPU time, forkrun also tended to use fewer CPU cycles than parallel, though the difference here is smaller (forkrun is very good at fully utilizing all CPU cores, but it doesn't magically make whatever is being parallelized take fewer CPU cycles than running it sequentially would).
* `forkrun` has fewer dependencies. As long as your system has a recent-ish version of bash (which is preinstalled on basically every non-embedded Linux system), it can run `forkrun`. `parallel`, on the other hand, is not typically installed by default.

***

**COMPARED TO XARGS**

* Better set of available options. All of the `xargs` options (excluding those intended for running code interactively) have been implemented in `forkrun`, along with a handful of additional (and rather useful) options. These include:
    * ordering the output the same as the input (making it much easier to use forkrun as a filter)
    * passing stdin to the workers via the worker's stdin (`func <<<"${args[@]}"` instead of `func "${args[@]}"`)
    * a "no function mode" that allows you to embed the code to run into `"${args[@]}"` and run arbitrary code that differs from line to line in parallel
    * the ability to unescape (via the `-U` flag) the input and have the commands run by `forkrun` interpret things like redirects and forks (this *might* be possible in `xargs` by wrapping everything in a `bash -c` call, but that is unnecessary here)
    * better/easier (IMO) usage of the `-i` flag to replace `{}` with the lines from stdin: no need to wrap everything in a `bash -c '...' _` call, and the `{}` can be used multiple times

* Because `forkrun` runs directly in the shell, other shell functions can be used as the `parFunc` being parallelized (this *might* be possible in `xargs` by exporting the function first, but this is not needed with `forkrun`)

* `forkrun` is faster in problems where parallelization speed matters (i.e., where total run time is more than 50 ms or so): twice as fast in medium-size problems (10,000 - 100,000 inputs) and slightly (10-20%) faster in large-size problems (>500,000 inputs).
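For comparison, the closest `xargs` idiom to `{}` insertion is `-I`, which substitutes each input line for `{}` in the command's arguments, one line per invocation:

```shell
# xargs -I{} replaces {} in the initial args with each input line in turn.
out="$(printf '%s\n' red blue | xargs -I{} echo 'color {}')"
echo "$out"
```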

***

# SUPPORTED OPTIONS / FLAGS

`forkrun` supports many of the same flags as `xargs` (excluding options intended for interactive use), plus several additional options that are present in `parallel` but not `xargs`. A quick summary will be provided here - for more info refer to the comment block at the top of the forkrun function, or source forkrun and then run `forkrun --help[={flags,all}]`.

GENERAL NOTES:
1. Flags must be given separately (e.g., use `-k -v` and not `-kv`)
2. Flags must be given before the name of the function being parallelized (`parFunc`) -- any flags given after the function name will be assumed to be initial arguments for the function, not forkrun options.
3. There are also "long" versions of the flags (e.g., `--insert` is the same as `-i`). Run `forkrun --help=all` for a full list of long options/flags.

The following flags are supported:

**FLAGS WITH ARGUMENTS**

```
(-j|-p) <#> : num worker coprocs. set number of worker coprocs. Default is $(nproc).
-l <#> : num lines per function call (batch size). set static number of lines to pass to the function on each function call. Disables automatic dynamic batch size adjustment. if -l=1 then the "read from a pipe" mode (-p) flag is automatically activated (unless flag `+p` is also given). Default is to use the automatic batch size adjustment.
-L <#[,#]> : set initial (<#>) or initial+maximum (<#,#>) lines per batch while keeping the automatic batch size adjustment enabled. Default is '1,512'
-t <path> : set tmp directory. set the directory where the temp files containing lines from stdin will be kept. These files will be saved inside a new mktemp-generated directory created under the directory specified here. Default is '/dev/shm', or (if unavailable) '/tmp'
-d <char> : set the delimiter to something other than a newline (default) or NULL ((-z|-0) flag). must be a single character.
```

**FLAGS WITHOUT ARGUMENTS**: for each of these, passing `-` enables the feature and passing `+` disables it. Unless otherwise noted, all features are disabled by default. If a given flag is passed multiple times, both enabling (`-`) and disabling (`+`) some option, the last one passed is used.

```
-i : insert {}. replace `{}` with the inputs passed on stdin (instead of placing them at the end)
-I : insert {id}. replace `{id}` with an index (0, 1, ...) describing which coproc the process ran on.
-k : ordered output. retain input order in output. The 1st output will correspond to the 1st input, 2nd output to 2nd input, etc.
-n : add ordering info to output. prepend each output group with an index describing its input order, denoted via `$'\n'\n$'\034'$INDEX$'\035'$'\n'`. This requires and will automatically enable the `-k` output ordering flag.
(-0|-z) : NULL-separated stdin. stdin is NULL-separated, not newline separated. WARNING: this flag (by necessity) disables a check that prevents lines from occasionally being split into two separate lines, which can happen if `parFunc` evaluates very quickly. In general a delimiter other than NULL is recommended, especially when `parFunc` evaluates very fast and/or there are many items (passed on stdin) to evaluate.
-s : run in subshell. run each evaluation of `parFunc` in a subshell. This adds some overhead but ensures that running `parFunc` does not alter the coproc's environment and affect future evaluations of `parFunc`.
-S : pass via function's stdin. pass stdin to the function being parallelized via its stdin ( $parFunc </tmpdir/fileWithLinesFromStdin ) instead of via function inputs ( $parFunc $(</tmpdir/fileWithLinesFromStdin) ).
-U : unescape. by default, the command being parallelized is escaped (via `printf '%q'`), making operators like `<` , `>` , `>>` , `|` , `&&` , and `||` appear as literal characters. This flag skips the `printf '%q'` call, meaning that these operators can be used to allow for piping, redirection, forking, logical comparison, etc. to occur *inside the coproc*.
-- : end of forkrun options indicator. indicate that all remaining arguments are for the function being parallelized and are not forkrun inputs. This allows using a `parFunc` that begins with a `-`. NOTE: there is no `+` equivalent for `--`.
-v : increase verbosity level by 1. This can be passed up to 4 times for progressively more verbose output. +v decreases the verbosity level by 1.
(-h|-?) : display help text. use `--help=f[lags]` or `--help=a[ll]` for more details about flags that `forkrun` supports. NOTE: you must escape the `?` otherwise the shell can interpret it before passing it to forkrun.
```