https://github.com/binaryphile/concorde


Message For You, Sir [![Build Status](https://travis-ci.org/binaryphile/concorde.svg?branch=master)](https://travis-ci.org/binaryphile/concorde)
====================

Bash scripting in my own particular...\[sigh\]...

Concorde: "Idiom, sir?"

Idiom!

Concorde is a toolkit for writing bash scripts and libraries.

Features
========

- an [enhanced-getopt style] option parser: `parse_options`

- [array] and [hash] utility functions (hashes, a.k.a. "[associative
arrays][hash]")

- smarter versions of `source`, a.k.a. the `.` operator: `require` and
`require_relative`

- support for test frameworks, such as [shpec]: `sourced`

- automatic ruby-style tracebacks on errors: [`strict_mode`] with
tracebacks (but no change to `IFS`)

- [namespaces] to isolate library variables from one another

- python-style [selective importation] of functions from libraries:
`bring`

- [keyword arguments] for functions

- command [macros] which avoid common pitfalls with system commands

Requirements
============

- [GNU `readlink`] on your PATH - for Mac users, [`greadlink`] is also
acceptable

- `sed` on your PATH

- bash 4.3 or 4.4 - tested with:

    - 4.3.11

    - 4.3.33

    - 4.3.42

    - 4.4.12

Reserved Global Variables
=========================

Concorde reserves a few global variables for its own use. They begin
with `__` (double-underscore):

- `__` - double-underscore itself

- `__ns` - short for "namespace"

- `__errmsg` - an error message passed by `raise`

Scripts and libraries used with concorde must not use these variables
for any other purpose.

Installation
============

Clone or download this repository, then put its `lib` directory in your
PATH, or copy `lib/concorde.bash` into a PATH directory.

Use `source concorde.bash` in your scripts.

Usage
=====

Consult the API specification below for full details.

A Sample Script Template
------------------------

``` bash
#!/usr/bin/env bash

source concorde.bash

get <<'EOS'
Usage: script [options] ...

Options:
-o , --option= a value to pass into the script
-f a flag that is true when given
EOS
printf -v usage '\n%s\n' "$__"

script_main () {
  $(grab 'option_var f_flag' from "$1")   # make locals of the options
  shift                                   # make ready to process args

  do_something_with "$option_var"         # use the option value
  (( f_flag )) && do_something_with_flag  # test if -f was supplied

  # process the positional arguments
  while (( $# )); do                      # true while there are args
    case $1 in
      alternative_1 ) do_alternative_1 ;;
      alternative_2 ) do_alternative_2 ;;
      *             ) $(raise "Error: unknown argument '$1'") ;;
    esac
    shift                                 # move to next argument
  done
}

# [other functions...]

sourced && return # stop here when testing the script
strict_mode on # stop on errors and issue a traceback

# define command-line options
# short  long      var name    help
# -----  ----      --------    ----
get <<'EOS'
-o       --option  option_var  "a value to pass into the script"
-f       ''        ''          "a flag that is true when given"
EOS

! (( $# )) && die "$usage"
$(parse_options __ "$@") || die "$usage"
script_main __ "$@" || die "$usage"
```

Read the rest of the usage section for a full explanation of the
features used above, or look at the [tutorial] for a walkthrough which
develops a script from the ground up.

Functions Which Return Boolean Values
-------------------------------------

Functions used for their truth value typically appear in conditional
expressions that trigger an action.

For example, the `sourced` function is typically used like so:

``` bash
sourced && return
```

These functions use the normal bash return code mechanism where `0` is
success and any other value is failure.
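
As a minimal sketch of that convention in plain bash, the following
defines a boolean function whose truth value is simply the exit status
of its last command. The `is_blank` helper is hypothetical and not part
of concorde:

``` bash
# hypothetical helper, not part of concorde
is_blank () {
  [[ -z ${1//[[:space:]]/} ]]   # status 0 (true) when $1 holds only whitespace
}

is_blank "   "  && echo "blank"
is_blank "text" || echo "not blank"
```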

Functions Which Return Strings
------------------------------

Bash's typical mechanism for storing strings generated by a function is
to use [command substitution].

For example, the result of an `echo` command might be stored like so:

``` bash
# this is not how concorde returns strings
my_value=$(echo "the value")
```

Concorde doesn't use this method as it is prone to capturing unexpected
output and also requires an unnecessary subshell.
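
Both problems can be seen in a small plain-bash sketch. The `make_value`
function and the `calls` counter are made up for illustration:

``` bash
calls=0

# hypothetical function that reports its result on stdout
make_value () {
  calls=$(( calls + 1 ))          # side effect, made inside the subshell
  echo "debug: building value"    # stray output, captured along with the result
  echo "the value"
}

captured=$(make_value)
printf '%s\n' "$captured"         # both lines were captured
echo "calls=$calls"               # still 0: the increment happened in the subshell
```

The debug line ends up in `captured`, and the change to `calls` never
reaches the calling shell.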

Any concorde function which returns a string value does so in the global
variable `__` (double-underscore).

Because any function is allowed to overwrite `__` at any time, save its
value before calling any other function, like so:

``` bash
get <<<"the value"
my_value=$__
```

`get` is a concorde function which reads a string from `stdin` and
stores it in `__`. The `<<<` operator feeds it the supplied string.

`__` must be treated much the same as the `$?` return code, since every
successive command may change it.

Note that the commands in a pipeline run in subshells, and a value
assigned to `__` inside a subshell is lost when the subshell exits.
Therefore you cannot use pipelines to return strings from concorde
functions. For example, this will not work:

``` bash
# doesn't work
echo "the value" | get
my_value=$__
```
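
The same effect can be reproduced without concorde. In this sketch the
`read` builtin stands in for `get`, and bash's default settings are
assumed (the `lastpipe` shell option is off):

``` bash
# `read` stands in for concorde's `get` in this sketch
result=
echo "the value" | read -r result
piped=$result
echo "after pipeline: '$piped'"       # empty: `read` ran in a subshell

read -r result <<<"the value"
echo "after here-string: '$result'"   # holds the value: `read` ran in this shell
```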

Because `__`'s value is ephemeral, it can be used to hold interim values
and feed the output of one operation to the next:

``` bash
get <<<"the value"
my_function_that_returns_a_string "$__"
final_value=$__
```

Note that `__` is always a string value. Your functions should be
careful not to store an actual array or hash in it, for example:

``` bash
# don't do this
__=( "array item" )
```

This is because some of concorde's features rely on `__` being a
string. Since bash automatically converts a string variable to an array
or hash when assigned one, doing so can interfere with concorde.

Dealing with Hashes and Arrays as Parameters
--------------------------------------------

Bash can pass string values to functions, but is not able to pass arrays
nor hashes as individual parameters to a function.

If an array needs to be treated as a parameter to a function, typical
bash practice is to either pass the expanded array as multiple
arguments, or to use the shortcut of not passing it at all and instead
just refer to the global variable itself.

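Another approach is to use references, either namerefs ([`declare -n`])
or indirect expansion ([`${!reference}`]), instead of a normal local
variable.

Each of these approaches has drawbacks. For example, expanding an array
into arguments loses the boundary between it and any arguments that
follow, globals tie the function to its caller's variable names, and
references can collide with the function's own local names.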
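
Two of the likely reasons can be shown in plain bash. This is only a
sketch, and the function and variable names in it are hypothetical. The
nameref part requires bash 4.3 or later:

``` bash
# 1. expanding arrays into arguments loses the boundary between them
count_args () { echo "$#"; }
first=( "a b" c )
second=( d )
count_args "${first[@]}" "${second[@]}"   # 3 arguments, no trace of two arrays

# 2. a nameref can be captured by one of the function's own locals
set_to () {
  local -n target=$1    # refers to whatever variable is named by $1
  local value=$2        # shadows a caller variable that is also named "value"
  target=$value
}

value=old
result=old
set_to result new       # works: result is now "new"
set_to value new        # assigns to the function's local "value" instead
echo "result=$result value=$value"
```

The second call fails silently, which is what makes this kind of name
collision hard to track down.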

The workaround employed by concorde is to convert arrays and hashes to
strings (serialize them) when crossing function boundaries, whether as
arguments or return values. This gives you full control of your variable
namespace since you aren't using outer-scope variables.

And while bash is not good at passing arrays (hashes especially), it is
good at passing strings, so why not use that?

By the same token, concorde's functions are written to expect the string
representations of arrays and hashes, when those argument types are
called for. While there are a couple of concorde functions which
actually do operate on real (non-string) arrays/hashes, that is clearly
noted in the API documentation for them.

Although bash doesn't have a general-purpose string literal
representation for an array, it does define such a format in its [array
assignment] statements. You can see an example by running `declare -p`
on the name of an array variable.

Concorde borrows the same format for the array literals expected by
concorde's functions, with minor changes.
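
For instance, with a made-up array named `my_ary`:

``` bash
my_ary=( "first item" "second item" )
declare -p my_ary   # prints a declaration that would recreate my_ary
```

The exact quoting around the parentheses differs between bash versions,
but each item appears in the form `[index]="value"`.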

### Passing an Array or Hash

For example, to call a function `my_function` which expects a single
array argument, you might define the array, then use concorde's `repr`
function to generate the string format:

``` bash
my_ary=( "first item" "second item" )
repr my_ary
my_function "$__"
```

Note that `repr` takes the name of the array as an argument and returns
the string representation in `__`.

The same method works for a hash.

### Receiving an Array

To write a function which receives such an argument, you use concorde's
`local_ary` function:

``` bash
my_function () {
  $(local_ary input_ary=$1)
  local item

  for item in "${input_ary[@]}"; do
    echo "$item"
  done
}
```

`ary` is short for "array".

`local_ary` creates a local array variable, in this case `input_ary`,
and gives it the contents provided in `$1`. For the rest of the function
you use `input_ary` like a normal array, because it is one.

Note that the `$()` command substitution operator around `local_ary` is
necessary. Without it, `local_ary` can't create a local variable in the
scope of the caller.
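
One way such a function could work is to print a declaration statement,
which the caller then executes by way of the command substitution. The
following is a sketch of that general technique under that assumption,
not concorde's actual implementation. `sketch_local_ary` and
`show_items` are hypothetical names:

``` bash
# hypothetical stand-in for local_ary: prints a statement for the caller to run
sketch_local_ary () {
  local name=${1%%=*}     # text before the first "="
  local repr=${1#*=}      # text after the first "="
  printf 'eval local -a %s=( %s )' "$name" "$repr"
}

show_items () {
  $(sketch_local_ary "items=$1")   # the printed statement runs here, in show_items
  printf '%s\n' "${items[@]}"
}

show_items "'first item' 'second item'"
```

Because the `eval local ...` statement is executed inside `show_items`,
the array it creates is local to `show_items`. Had `sketch_local_ary`
run `local` itself, the variable would have vanished when it returned.
This simplified version also collapses runs of whitespace inside items,
since the printed statement is word-split before `eval` sees it.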

To receive a hash instead of an array, simply use the `local_hsh`
function instead of `local_ary`.

### Passing Arrays/Hashes by Name

Both `local_ary` and `local_hsh` will allow you to pass them the name of
the variable holding the array representation instead of the
representation itself. They will detect the variable name and expand it.
In general, you should pass variable names to them instead of
[expansions] wherever possible.

Let's look at an example. The following lines prepare an array
representation in `__`:

``` bash
array=( "item one" )
repr array
```

Concorde's `member_of` function takes an array representation, along
with an array item we're looking for, and returns a boolean indicating
whether the item was found in the array. Instead of passing the
expansion `$__`, you can pass the name of the variable holding the
representation (`__`):

``` bash
member_of __ "item one" && put "'item one' is in the array"
```

Concorde supports passing by variable name for array and hash
representations, but not for regular string variables. You still have to
use expansions to pass regular strings:

``` bash
value="item one"
# passing "value" doesn't expand it to "item one", so doesn't work:
member_of __ value
```

### Just Passing Through

Of course, if your function only needs to receive an array/hash in order
to pass it to another function, you don't need to convert the string
representation into its actual array form, you can simply receive and
pass the string representation (note the call by variable name):

``` bash
my_function () {
  local array_representation=$1

  another_function array_representation
}
```

### A Caveat

The recommended way to use `local_ary` and `local_hsh` (and functions
that employ them) is to always pass array parameters by name.

The caveat introduced by the pass-by-name functionality is that if you
pass an array which happens to contain only one item, and that one item
is the name of a variable, it will be mistaken for a variable holding an
array representation itself and expanded, when that is not what you
intended.

This is not a problem for hashes, only arrays.

Be careful to avoid this situation or you will get unexpected behavior.
If you do pass a literal, ensure that it is not a single-item array
whose item is also the name of a variable.
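
To see why the ambiguity arises, consider a sketch of how name detection
might work. This is an assumption made for illustration, not concorde's
actual code, and `resolve_repr` is a hypothetical function:

``` bash
# hypothetical resolver: treat the argument as a variable name when it looks
# like one and such a variable is set, otherwise keep it as a literal
resolve_repr () {
  local candidate=$1
  if [[ $candidate =~ ^[A-Za-z_][A-Za-z0-9_]*$ && -n ${!candidate+set} ]]; then
    __=${!candidate}
  else
    __=$candidate
  fi
}

colors="red green"

resolve_repr colors;       by_name=$__    # expanded from the variable
resolve_repr "blue gold";  literal=$__    # not a valid name, kept as-is
resolve_repr "colors=3";   hash_like=$__  # "=" rules out a name, kept as-is

# the caveat: a one-item literal whose only item is "colors" looks the same
# as the variable name, so it is expanded too
resolve_repr colors;       surprise=$__

printf '%s\n' "$by_name" "$literal" "$hash_like" "$surprise"
```

Under this assumption a hash literal is never mistaken for a name,
because every item in it contains an equals sign.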

### Passing by Literal

You may also construct your own literals for arrays or hashes, but each
follows its own, slightly different, rules.

#### Arrays (Not Hashes)

The array syntax consists of whitespace-separated items. Whitespace
includes spaces, tabs and newlines, the default values of the field
separator variable `IFS`.

Individual array items which contain whitespace must either be quoted or
escaped. Here is a comparison of regular [array assignment][array
assignment] and the equivalent literals used by concorde for both quoted
and escaped forms:

``` bash
# actual arrays and equivalent representations
array1=( 'an item' 'another item' )
representation1="'an item' 'another item'"

array2=( an\ item another\ item )
representation2="an\ item another\ item"
```

Either form shown above, quoted or escaped, is acceptable.

Notice that the representations above are simply the string form of what
appears between the parentheses in array declarations. In fact, an array
representation should be usable in the statement:

``` bash
eval "array=( $representation )"
```

For the most part, an array representation is equivalent to the portion
inside the parentheses of `declare -p`'s output, minus the bracketed
indices.

`repr` returns the escaped form, rather than quoted, and without
indices. Therefore concorde can't preserve the indexing of [sparse
arrays], since those require preservation of indices.
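
The loss of indices is easy to reproduce with a rough stand-in for
`repr`. The `sketch_repr` function below is hypothetical and only
approximates the escaped form using `printf '%q'`:

``` bash
# hypothetical stand-in for repr: store an escaped, index-free form in __
sketch_repr () {
  local expansion="$1[@]"
  printf -v __ '%q ' "${!expansion}"
  __=${__% }                          # drop the trailing space
}

sparse=( [0]='an item' [5]='another item' )
sketch_repr sparse
representation=$__
echo "$representation"                # an\ item another\ item

eval "restored=( $representation )"
echo "${!restored[*]}"                # indices are now 0 1, not 0 5
```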

The following two array literals are equivalent:

``` bash
# newlines separating items (items containing spaces require quotes or will be split)
my_literal='
one
two
"three and four"
'

another_literal='one two "three and four"'
```

#### Hashes

Hashes, like arrays, are similar to the portion inside the parentheses
of `declare -p`'s output. Unlike arrays, however, hash literals must
include indices. Unlike the regular form of hash declarations though,
concorde's indices are not in brackets. They are more like keyword
parameters in other languages. For example:

``` bash
my_literal="one=1 two=2 three_and_four='3 and 4'"
```

In this case, quoted items are quoted after the index and equals sign
(as in `'3 and 4'`). Escaping works as well.

`repr` generates this format when invoked on a hash.

Notably, the following does *not* work on a hash representation:

``` bash
# does NOT work
eval "declare -A hash=( $representation )"
```

That's because of the missing brackets on indices.

Because the indices do not have brackets, concorde also doesn't support
hash indices with spaces. In general, concorde only supports hash
indices which are also usable as variable names, that is, keys which are
composed only of alphanumeric and underscore characters and don't start
with a digit.
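
For illustration, a literal in this format can be turned into a real
hash with plain bash along the following lines. This is a sketch of the
idea rather than concorde's implementation, and the variable names are
made up. As with any use of `eval`, it should only be given trusted
input:

``` bash
literal="one=1 two=2 three_and_four='3 and 4'"

declare -A parsed=()
eval "pairs=( $literal )"              # each element is now a key=value string
for pair in "${pairs[@]}"; do
  parsed[${pair%%=*}]=${pair#*=}       # split on the first "="
done

echo "${parsed[three_and_four]}"       # 3 and 4
```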

### Passing Arrays as Multiple Arguments

`local_ary` is also geared to accept multiple arguments as an array.
This can be useful when converting positional arguments (`$@`) into a
named array:

``` bash
my_function () {
  $(local_ary my_ary="$@")
  local item

  for item in "${my_ary[@]}"; do
    do_something_with "$item"
  done
}
```

### Passing Hashes as Multiple Arguments (a.k.a. [Keyword Arguments][keyword arguments])

`local_hsh` can do the same thing with multiple arguments:

``` bash
my_function () {
  $(local_hsh my_hsh="$@")
  local key

  for key in "${!my_hsh[@]}"; do
    do_something_with "${my_hsh[$key]}"
  done
}
```

Calling a function like this looks familiar from other languages:

``` bash
my_function one=1 two=2 three_and_four="3 and 4"
```

Languages such as python and ruby allow you to specify named arguments
via keywords like the above.

Required (non-keyword) arguments must always be passed before keyword
arguments, as positional arguments. Optional arguments may then be
passed last as keyword arguments.

Optional arguments have their default values defined by the function.

Here is an example of how such a function is implemented:

``` bash
my_function () {
  local required_arg=$1; shift
  local optional_arg="default value"
  $(grab optional_arg from "$@")

  do_something_with "$required_arg"
  do_something_with "$optional_arg"
}
```

Any required arguments are stored and `shift`ed out of the positional
arguments.

Then the optional values are `grab`bed by name from the residual
arguments, which must all be keywords at that point. `grab` just passes
them to `local_hsh` internally to create a true hash from them, then
extracts `optional_arg` from the hash into a local variable. More on
`grab` later.

This is what it looks like calling `my_function`:

``` bash
my_function "required value" optional_arg="optional value"
```

`optional_arg=...` can be left off, in which case the function will use
its default value.
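
The same calling convention can be approximated in plain bash, which may
help show what `grab` is doing for you. The `greet` function below is a
hypothetical example that parses its own keywords and does not use
concorde:

``` bash
greet () {
  local name=$1; shift           # required positional argument
  local greeting="Hello"         # default for the optional keyword argument
  local pair

  for pair in "$@"; do           # everything left must be key=value
    case $pair in
      greeting=* ) greeting=${pair#*=} ;;
      *          ) echo "unknown keyword: ${pair%%=*}" >&2; return 1 ;;
    esac
  done

  echo "$greeting, $name"
}

greet World                            # Hello, World
greet World greeting="Good morning"    # Good morning, World
```

Each additional keyword needs its own `case` branch here, which is the
bookkeeping that the `grab` approach avoids.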

### Newline-delimited Array Literals, or Nested Arrays

You can construct an array representation with another array nested
inside fairly easily, but it requires a different type of array
representation on the outside.

Let's start with a function which expects a nested array as its only
argument:

``` bash
my_function () {
  $(local_nry outer_ary=$1)
  local item
  local row

  for row in "${outer_ary[@]}"; do
    $(local_ary inner_ary=$row)
    for item in "${inner_ary[@]}"; do
      echo "$item"
    done
  done
}
```

You've seen `local_ary` so far, but `local_nry` is new.

`local_nry` introduces the idea of a newline-delimited array
representation. Like `local_ary`, it creates a local array (named
`outer_ary`), but expects a slightly different input than `local_ary`
would. `local_nry` expects a multiline array literal whose items are
separated only by newlines, not by spaces or tabs as with `local_ary`.

In fact, there are two differences between the two functions. One is
that `local_ary` separates items on tabs and spaces in addition to
newlines, while `local_nry` only separates on newlines. The other is
that `local_nry` escapes all of the items in each row, so they can be
passed unchanged to `local_ary` when you call it.

That means each row of the newline-array representation can contain a
standard array representation, so long as that representation doesn't
contain a literal newline, since `local_nry` treats newlines as
separators.

If the inner arrays need to hold newlines, the newlines must appear in
an [ANSI C-like string]. Normal quotes won't suffice.

For example: `$'a multiline\nstring value'` is an ANSI C-like string
which has a protected newline in it. The newline will not be parsed as a
separator by `local_nry`, but *will* then be turned into a regular
newline by the call to `local_ary`.

The function above creates the outer array from the newline-delimited
representation, then interprets each row as a regular array
representation. That makes a nested array.

Here's how you would call such a function (note that it is `get`ting a
quoted [heredoc]):

``` bash
get <<'EOS'
"first array, item one" $'first array\nitem two with newline'
"second array, item one" "second array, item two"
EOS
my_function __
```

Its output would be:

``` bash
first array, item one
first array
item two with newline
second array, item one
second array, item two
```
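
The two-stage parsing can be approximated in plain bash, with `mapfile`
splitting on newlines and `eval` interpreting each row. This is only a
sketch of the idea. It does not reproduce the escaping that `local_nry`
performs, and the variable names are made up:

``` bash
# stage one: split the representation into rows on newlines only
mapfile -t rows <<'EOS'
"row one, item one" $'row one\nitem two'
"row two, item one" "row two, item two"
EOS

# stage two: interpret each row as an ordinary array literal
items=()
for row in "${rows[@]}"; do
  eval "inner=( $row )"
  items+=( "${inner[@]}" )
done

echo "${#rows[@]} rows, ${#items[@]} items"
printf '%s\n' "${items[1]}"    # the $'...' item now holds a real newline
```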

If using an unquoted [heredoc] (no quotes around our `EOS` tag), the
dollar-sign needs to be escaped to delay expansion:

``` bash
get <