= Goyamp - The Go Macro Processor for YAML and JSON
Peter Birch
@@@VERSION@@@, @@@DATE@@@
:toc: macro
:toclevels: 4

Goyamp is a general-purpose macro processor for YAML files. Both its input and output are YAML. It scans the input for symbols and makes substitutions and expansions on the output. Goyamp is 100% YAML, so the syntax for defining and calling macros is also YAML. It can also read and write JSON.

== TL;DR

.Input
[source, YAML]
----
- defmacro:
    name: foo
    args: [who]
    value:
      Hello: who
- foo:
    who: World
----

.Output
[source, YAML]
----
- Hello: World
----

== Get Started with a Pre-compiled Binary

Goyamp is a Go program contained in the goyamp module. It has been compiled and uploaded to GitHub and Docker Hub.

=== Download

You can download a binary copy of the Goyamp program from the https://github.com/birchb1024/goyamp/releases[Releases] page of the GitHub repository. For example:

[source, bash]
----
$ curl -L -o goyamp https://github.com/birchb1024/goyamp/releases/download/@@@VERSION@@@/goyamp
$ chmod a+x goyamp
----

=== Installation

Simply place the binary in your $PATH or update your path to include the directory where you placed it.

=== Execution

The program is run from the command line, giving the input file to parse as the first argument, followed by optional arguments to the expansion. The expansion is written to the standard output, which you normally redirect to another file:

.Usage
[source,bash]
----
$ goyamp [-d|-debug] [-h|-help] [-o|-output yaml|json|lines] [Filename | - ] [arg1..argn]
----

The `-debug` option causes Goyamp to trace its internal execution on stderr and print a backtrace on errors.

Put your first Goyamp file in `hello.yaml`:

.hello.yaml
[source, YAML]
----
"Hello {{env.USER}} from Goyamp v{{__VERSION__}}"
----

And run it

[source, bash]
----
$ goyamp hello.yaml
----

=== Working with JSON

YAML is a superset of JSON, so Goyamp can read JSON files too. You could define variables and macros in JSON, but most people prefer to code in YAML because it's easier
to read. If you want to output JSON, specify the `-output json` command option.

We use this for generating Azure DevOps pipeline definitions - edit in YAML and generate JSON. You could use this for Azure ARM files or any other JSON.

However, remember that YAML is a superset of JSON, so you can express things in YAML which are not valid JSON. For example, `{ 1: Monday, 2: Tuesday }`
has integer map keys and is valid YAML, but JSON accepts only strings as map keys. Hence Goyamp automatically converts map keys to strings when outputting JSON. Example:

.hello.yaml
[source, YAML]
----
$ echo '[{ null : Monday, 2: Tuesday }, null]' | ./goyamp -o json
[
  {
    "2": "Tuesday",
    "null": "Monday"
  },
  null
]
----
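
The key conversion can be pictured with a short Python sketch. It only illustrates the behaviour described above and is not Goyamp's Go implementation; the helper names `key_to_string` and `stringify_keys` are made up for this example.

```python
import json

def key_to_string(key):
    """Render a YAML scalar key the way it would appear as a JSON string."""
    if key is None:
        return "null"
    if isinstance(key, bool):
        return "true" if key else "false"
    return str(key)

def stringify_keys(node):
    """Recursively convert every map key to a string; values are left alone."""
    if isinstance(node, dict):
        return {key_to_string(k): stringify_keys(v) for k, v in node.items()}
    if isinstance(node, list):
        return [stringify_keys(item) for item in node]
    return node

# The same data as the example above: a null key and an integer key.
data = [{None: "Monday", 2: "Tuesday"}, None]
print(json.dumps(stringify_keys(data), indent=2, sort_keys=True))
```

Only keys are converted; the trailing `null` list item stays a JSON `null`.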

JSON only accepts a single top-level value. If you have multiple YAML docs in your expanded code, Goyamp will output each one in turn, which is not strictly valid JSON.

== Running from Docker

If you just want to use the Docker image, run:

[source, bash]
----
$ docker run --rm -u $(id -u):$(id -g) -v "$PWD":/work docker.io/birchb1024/goyamp /work/hello.yaml
----

== An example - Pipelines as Code.

Suppose we are building some https://github.com/tomzo/gocd-yaml-config-plugin[GoCD] pipeline definitions in YAML, each of which uses the same Git repository. The YAML we have to write looks like this:

.output.yaml
[source,YAML]
----
pipelines:
  mypipe1:
    group: mygroup
    label_template: ${COUNT}
    materials: # <1>
      mygit:
        branch: master
        git: http://my.example.org/mygit.git
    stages: null
  mypipe2:
    group: mygroup
    label_template: ${COUNT}
    materials: # <1>
      mygit:
        branch: ci
        git: http://my.example.org/mygit.git
    stages: null
----
<1> Duplicated

We don't want to re-key duplicated code, so we define a macro which Goyamp expands whenever it is invoked. Our source code now looks like this:

.YAML source
[source,YAML]
----
define: # <1>
  name: mygit_repo_url
  value: http://my.example.org/mygit.git

defmacro: # <2>
  name: mygit_materials
  args: [branch_name]
  value:
    mygit:
      git: mygit_repo_url # <3>
      branch: branch_name
---
pipelines:
  mypipe1:
    group: mygroup
    label_template: "${COUNT}"
    materials: {mygit_materials: {branch_name: master}} # <4>
    stages:

  mypipe2:
    group: mygroup
    label_template: "${COUNT}"
    materials:
      mygit_materials:
        branch_name: ci # <5>
    stages:
----
<1> simple variable definition
<2> a macro Definition
<3> variable used
<4> a macro call - flow style
<5> a macro call - block style

When run through Goyamp, the output is as above. Now the Git repository is defined in a single place, so if we need to change it we only change it once.

== More Examples

The source repository has a directory of examples which you can run to observe the behaviour of the features. They are located in https://github.com/birchb1024/goyamp[the GitHub goyamp repository]. You can clone the source repo to download them or browse them https://github.com/birchb1024/goyamp/tree/master/examples[here].

== Applications

This program is general-purpose and can be used wherever YAML is required. Its first uses were for GoCD pipelines and Ansible playbooks. Both are human-readable source formats that use a subset of YAML. Hence Goyamp may not be applicable to all aspects of YAML, especially those which result from data transmission. We will not be attempting to exercise Goyamp with such inputs.

Since YAML is a superset of JSON it can also be used to generate JSON for, say, Azure ARM files.

== Similar Tools

Yamp - This is the progenitor of Goyamp, a Python YAML macro processor. Goyamp and Yamp are compatible, however there are some differences due to their respective execution environments. Being a Python program itself, Yamp can call Python functions directly.

There are many great general-purpose macro-processors available, starting with the venerable `GPM`, through `m4`, cpp, and lately, Jinja2. However these are predominantly character-based and the programmer has to compute the indentation required by YAML by counting spaces. Like previous authors we started on this course of writing yet another macro-processor primarily for reasons of laziness. Since Goyamp transforms maps and sequences not character strings, indentation is automatic.

== Reference

This section describes the operation of the processor and the macros available.

=== The Command Line

The command to run Goyamp is a single binary executable filename followed by optional arguments. Assuming that `goyamp` is in the `$PATH`:

.Usage
[source,bash]
----
$ goyamp [-d|-debug] [-h|-help] [-o|-output yaml|json|lines] [Filename | - ] [arg1..argn]
----

If the filename is the minus sign `-` or if there are no arguments, Goyamp reads YAML from the standard input, so it serves as a filter. For example:

[source,bash]
----
$ echo "[define: {data: {load: test/fixtures/blade-runner.json}}, data.directory]" | goyamp
- ' Ridley Scott'
----

If the `-output` option is given, it specifies the output format required. The default is YAML. When `json` is selected, JSON is output subject to the constraints mentioned above. When `lines` is selected, any top-level list is printed with no surrounding syntax. Top-level map objects are printed in JSON format on one line. `lines` mode suits downstream Unix programs which expect simple lines; we use it to generate `bash` scripts or data for `awk`.

==== File Suffixes

Any file suffix can be used - it is assumed to be YAML/JSON.

In practice `yaml` or `json` suffixes will be recognised by the editing modes of most text editors. You will need to configure your text editor if you use a non-standard suffix.

==== Docker

A docker image is provided in docker.io (Docker Hub) https://cloud.docker.com/repository/docker/birchb1024/goyamp[here]. This image uses a slim Debian base. To use it you need to map your workspace into the container and use your current user id. In general:

[source, bash]
----
$ docker run --rm -u $(id -u):$(id -g) -v "$HOME":/work docker.io/birchb1024/goyamp [options] /work/{path to your code}.yaml [arg1, arg2...] > outputfile.yaml
----

=== Processing

When Goyamp starts, it collects the command-line arguments and assigns the list to the variable `argv`. It collects the process environment and assigns it to the map variable `env`. Goyamp then reads the input file, attempts to parse the YAML and holds the resulting data as objects in memory. (If the YAML does not parse Goyamp exits). It recursively scans the objects looking for strings which are the same as defined variables or which contain variables inside the string in curly braces. If it finds a match, it substitutes the object with the variable's value.

Goyamp is a substitution engine. It looks for things in its input and, when it sees them, replaces them with the substitution. The things to look for and the substitutions we call variables and bindings. For example:

.Variable Bindings
[options="header,footer",width="50%"]
|=======================
|Variable Name|Value to substitute
|mygit_repo_url

a|
[source,YAML]
----
http://my.example.org/mygit.git
----

|mygit_materials

a|
[source,YAML]
----
args: [branch_name]
mygit:
  git: mygit_repo_url
  branch: branch_name
----

|=======================

When scanning maps, Goyamp does not expand map keys unless either the map key is explicitly identified as a variable with the `^` caret character, or the map key is a string with embedded curly braces. In these two special cases Goyamp looks up variables or interpolates the string.
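
The scanning rules so far can be summarised in a small Python sketch. It is a simplified model rather than Goyamp's actual code: the function names are hypothetical, macros are ignored, and it assumes a substituted value is not scanned again.

```python
import re

# Matches {{name}} with optional spaces inside the braces.
EMBEDDED = re.compile(r"\{\{\s*([^{}\s]+)\s*\}\}")

def interpolate(text, bindings):
    """Replace each {{name}} with its bound value; unbound names stay as-is."""
    return EMBEDDED.sub(lambda m: str(bindings.get(m.group(1), m.group(0))), text)

def expand_key(key, bindings):
    """Map keys are only touched when marked with ^ or containing {{...}}."""
    if not isinstance(key, str):
        return key
    if key.startswith("^"):
        return bindings.get(key[1:], key)
    return interpolate(key, bindings)

def expand(node, bindings):
    """Recursively scan lists, map values and strings for variables."""
    if isinstance(node, str):
        if node in bindings:          # the whole string names a variable
            return bindings[node]
        return interpolate(node, bindings)
    if isinstance(node, list):
        return [expand(item, bindings) for item in node]
    if isinstance(node, dict):
        return {expand_key(k, bindings): expand(v, bindings) for k, v in node.items()}
    return node

bindings = {"Username": "kevin", "n": 3}
print(expand(["Username", "The username is {{Username}}", {"KEY_{{n}}": "x"}], bindings))
```

Note that a plain key such as `Username` is left alone even though a variable of that name exists; only its value position is expanded.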

Some special variables contain 'macros' - these must be within a map of their own, with a value containing a map of arguments which can contain anything. Normally a macro will contain more than the original, so we call this 'macro expansion' footnote:[But it could actually be a reduction!].

Goyamp is looking for macro calls with this structure:

[source,YAML]
----
<macro name>:
  <argument name>: <argument value>
  <argument name>: <argument value>
  . . .
----

Some macros have special functions and are built-in to Goyamp. Those are described in the reference section.

Here are examples of the three kinds of things Goyamp scans for replacement:

.Simple Variables
[source,YAML]
----
- Username
- 'directory'
----

.Embedded Variables
[source,YAML]
----
- 'The username is {{Username}}'
----

.Macro Calls
[source,YAML]
----
- add_user:
    name: Kevin
    phone: (555) 098 880
----

When all the objects in the data have been scanned and, in some cases, substituted, Goyamp outputs the new object tree on the standard output
in YAML or JSON format. Because YAML maps are unordered, the order of the keys and their corresponding values on output may be different from
the input footnote:[Order-preservation may happen in a future version, but it's complicated].

When the processor sees a null item in an input sequence, it is preserved. However, if the empty value is the result of a `define:`, `defmacro:` or other expansion which produces empty values, the value is stripped from the output.

=== Variables

During processing Goyamp maintains a hierarchy of bindings of variable names to variable values. The top level of bindings is the global environment. As each macro is applied, the application creates a unique environment for the macro variables, which is popped when the macro finishes.
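
A stack of environments like this can be modelled with Python's `collections.ChainMap`. The sketch below is only an analogy for the lookup order, with a hypothetical `lookup` helper; it does not model how Goyamp captures closures for macros.

```python
from collections import ChainMap

# The global environment is the outermost scope.
global_env = ChainMap({"mygit_repo_url": "http://my.example.org/mygit.git"})

def lookup(name, env):
    """Return the innermost binding, or the name itself when unbound."""
    return env.get(name, name)

# Applying a macro pushes a fresh scope holding its arguments.
macro_env = global_env.new_child({"branch_name": "ci"})
print(lookup("branch_name", macro_env))     # found in the macro scope
print(lookup("mygit_repo_url", macro_env))  # falls through to the global scope

# When the macro finishes its scope is dropped; the name is unbound again.
print(lookup("branch_name", global_env))
```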

==== `define` - Definition of Variables

You can define new variable bindings or update existing variables with the `define` macro. The value can be any YAML expansion. Variable names are expected to be strings.

[source, YAML]
----
- define: {name: age, value: 32}
- age
- define: {name: age2, value: [age, age]}
- age2
- define: {name: age2, value: [{define: {name: age, value: 99}}, age]}
- age2
----

Produces:

[source, YAML]
----
- 32
- - 32
  - 32
- - 99
----

The result of expanding a `define`, `undefine`, `if` and `include` is a 'magic' value `goyamp.EMPTY`. This value is removed automatically from sequences, and maps if a `define` or `if` has been used there. So it's better to use `define` etc in sequences. When placed in their own document, they disappear completely:

[source, YAML]
----
- define: {name: age, value: 32}
- if: true
  else: 23
---
- age
----

Produces:

[source, YAML]
----
- 32
----

This provides a simple way to have conditional map keys, or list items. For example, if we only want a key to appear sometimes, we can use:

[source, YAML]
----
some_map:
  this_key_is_always_here: 42
  this_key_only_appears_if_$var_is_true:
    if: $var
    then: 23
----

==== Scalars

Variables can contain any YAML scalar: int, float, string, True, False and null.

==== Collections

Variables can contain any YAML collection ie, maps and lists.

==== Variable Expansion

When Goyamp scans YAML it looks for variables in the lists and map values. When one is found it is replaced with the current value of the variable binding. It searches the stack of macro bindings until the global environment is reached. If no binding is found the string is output unchanged.

===== Variables Embedded in Strings

Inside strings, Goyamp will insert expansions delimited by the double-curlies `{{` and `}}`. It's looking for variable names.

[source, YAML]
----
- define: {name: X, value: Christopher}
- define: {name: AXA, value: 'A{{ X }}A'}
---
- AXA
# Produces AChristopherA
----

This processing is also done in map keys so that map keys can be computed during the expansion. For example:

[source, YAML]
----
repeat:
  for: loop_variable
  in : {range: [1,3] }
  body:
    'KEY_{{loop_variable}}': some step
----

===== Interpolation with dot syntax

If a string contains periods, such as `data.height`, Goyamp first looks for an exactly matching variable name, which is expanded with the value. Otherwise the first item (ie `data`) is assumed to be a variable name.

If a binding for the first part is found the value of the variable is assumed to be a collection. The other items which we call sub-variables are used to index the collection (ie `height`). If the collection is a map, the sub-variable name is used as the key. If it is a list the subvariable must evaluate to an integer which is zero-indexed into the list. These subvariable names are also expanded before use so other variables can be used to index the collection.

[source, YAML]
----
- define: { zero: 0 }
- define:
    name: data
    value:
      - type: webserver
        hostname: web01
        ip: 1.1.2.3
      - type: database
        hostname: db01
        ip: 1.1.2.2
- define: {data.1 : Wednesday}
---
- data.1
- data.1.hostname
- data.zero.hostname
----

Produces

[source, YAML]
----
- Wednesday
- db01
- web01
----
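
The lookup order described above can be sketched in Python. This is an illustrative model with a hypothetical `resolve_dotted` function, not the real implementation; error handling for bad indexes is left out.

```python
def resolve_dotted(name, bindings):
    """Resolve a dotted name such as 'data.1.hostname' against the bindings."""
    if name in bindings:                  # an exactly matching variable wins
        return bindings[name]
    head, *subs = name.split(".")
    if head not in bindings:
        return name                       # unbound: the string is left unchanged
    value = bindings[head]
    for sub in subs:
        sub = bindings.get(sub, sub)      # sub-variables are expanded before use
        if isinstance(value, list):
            value = value[int(sub)]       # lists are indexed from zero
        else:
            value = value[sub]            # maps are indexed by key
    return value

bindings = {
    "zero": 0,
    "data": [
        {"type": "webserver", "hostname": "web01", "ip": "1.1.2.3"},
        {"type": "database", "hostname": "db01", "ip": "1.1.2.2"},
    ],
    "data.1": "Wednesday",
}
for name in ["data.1", "data.1.hostname", "data.zero.hostname"]:
    print(name, "->", resolve_dotted(name, bindings))
```

With these bindings the three names resolve to `Wednesday`, `db01` and `web01`, matching the output shown above.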

===== Variable Map Keys with the Caret

Normally map keys are not expanded, but with a preceding caret character Goyamp looks up the variable name in the current binding and uses its value. For example:

[source, YAML]
----
- defmacro:
    name: my-macro
    args: [ param ]
    value:
      ^param:
        LtUaE : RU
---
- my-macro: { param: 42 }
----

Evaluates to:

[source, YAML]
----
- 42:
    LtUaE: 42
----

This facility even allows macros to be called indirectly since the macro being called is provided by the variable rather than in the code itself. Here's an example, although the practical value of this is yet to surface. This code applies four different macros to the same arguments in turn:

[source, YAML]
----
repeat:
  for: macro
  in: [+, range, flatten, quote]
  body:
    ^macro: [1, 5]
----

===== Defining Multiple Variables

Declarations don't need the 'name' and 'value' keys, and multiple variables are simultaneously bound.

[source,YAML]
----
- define: { quick: 'shorthand' }
- define:
    name: Sara
    age: 34
    height: 123
----

==== Refactoring Goyamp with `undefine`

Sometimes a variable needs to be renamed or removed, for example when a Goyamp macro name conflicts with a name used in the
required output format. The `undefine` macro removes a variable binding from the current environment. Usage:

[source,YAML]
----
undefine: variablename
----

Used at the top level (outside of a macro), `undefine` can even change the definitions of the Goyamp built-in macros themselves. This is done by first assigning a new name to the existing macro, then undefining the original name. If this is done before any files are included, it can be used to redefine Goyamp syntax. For example, we can use `plus` instead of the `+` symbol as follows:

[source,YAML]
----
- define:
    plus: +
- undefine: +
- {plus: [1,2,3]}
----

=== Macros

Macros are re-usable templates of YAML objects that can be called up almost anywhere in the expansion. They differ from variables because they have parameters which are used to fill holes in the template. They are similar to functions, but unlike functions their entire text is always the result. By defining oft-repeated YAML fragments in macros, repetitive work is avoided. A single macro definition also makes maintenance easy, since there is one definition for a concept which can be easily changed.

==== Defining with `defmacro`

Macros are defined with the `defmacro` macro, which gives the macro a name and specifies its arguments and the expansion to return (the body). A macro definition looks like this:

[source,YAML]
----
- defmacro:
    name: <macro name>
    args: [<argument name>, ...]
    value:
      <any YAML>
----

Example - Database upgrade steps:

[source,YAML]
----
defmacro:
  name: app-upgrade
  args: [appname, dbname]
  value:
    Database upgrade for {{ appname }}:
      - stop application {{ appname }}
      - backup app database {{ dbname }}
      - upgrade the database {{ dbname }}
      - restart the application {{ appname }}
      - smoke test {{ appname }}
---
- {app-upgrade: { appname: Netflix, dbname: db8812}}
- app-upgrade:
    appname: Stan
    dbname: postgres123123
----

Produces:

[source,YAML]
----
- Database upgrade for Netflix:
    - stop application Netflix
    - backup app database db8812
    - upgrade the database db8812
    - restart the application Netflix
    - smoke test Netflix
- Database upgrade for Stan:
    - stop application Stan
    - backup app database postgres123123
    - upgrade the database postgres123123
    - restart the application Stan
    - smoke test Stan
----

==== Invoking/calling Macros

As above, macro calls are just maps with a particular structure:

[source, YAML]
----
<macro name>:
  <argument name>: <argument value>
  ...
  <argument name>: <argument value>
----

==== Macros with no arguments

You can define macros with no arguments at all. Macros can be shorthand for expressions where you compose variables together, run conditions or other processing. The macro has access to all variables in scope where it was defined. For example, here is a macro to concatenate variables to make a URL. In this example the macro uses the global (top-level) variables 'base-url' and 'module'.

Example:

[source,YAML]
----
# Definition
- defmacro:
    name: api-url
    value: "{{base-url}}/{{module}}/list"
---
# Call
api-get:
  url: {api-url: } # must have a space after the ':' !
----

Produces

[source,YAML]
----
- api-get:
    url: https://foo.org/api/users/list
----

==== Macros with variable arguments

If the arguments in the definition are specified as a string, not a list, the string is the single argument. All the actual arguments at call-time are collected and bound to the variable in a map.

[source,YAML]
----
- defmacro:
    name: <macro name>
    args: <variable name>
    value:
      <any YAML>
----

Example:

[source,YAML]
----
# Definition
- defmacro:
    name: package
    args: all
    value:
      name: all.doc
      yum:
        name: apache
        state: all.state

---
# Call
package:
  doc: Install apache
  name: httpd
  state: latest
----

Produces

[source,YAML]
----
name: Install apache
yum:
  name: apache
  state: latest
----

The disadvantage of vararg macros is that Goyamp cannot ensure that all the required arguments have been supplied in the call.

==== Nesting Macros

Macro calls can be nested, i.e. a macro can contain a call to another in its arguments. Likewise macro definitions can be nested. The macro arguments are lexically scoped: a closure is collected at the time of definition, and the macro call executes in the environment of that define-time closure. Macros can call themselves directly or indirectly.

=== Conditional Expansion with `if then else`

The `if` macro renders one value from a choice of two based on whether the condition argument is true. The condition counts as true when it is `true`, or any other value that is not `false` or `null`. If so, the `then` argument is expanded, otherwise the `else` argument. It's not required to have both `then` and `else` arguments: when the condition requires the missing one, the macro expands to `null`.

[source,YAML]
----
if: <condition>
then: <value if true>
else: <value if false>
----

Example:

[source,YAML]
----
# Some variable
define:
  application:
    name: CSIRAC
    has_database: true
    arch: valves
---
if: application.has_database
then:
  - shutdown database
else:
  - shutdown not required
----

Produces:

[source,YAML]
----
- shutdown database
----

Example - short form

[source,YAML]
----
if: true
else: 'This value if false or Null'
----

Produces `null`
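
As a rough model of this behaviour, here is a Python sketch. The names `is_true` and `expand_if` are hypothetical, and the truth test is our reading of the rule above (anything other than `false` or `null` counts as true); Python's `None` stands in for YAML `null`.

```python
def is_true(condition):
    """Assumed truth rule: everything except false and null is true."""
    return condition is not None and condition is not False

def expand_if(call):
    """Pick the 'then' or 'else' branch; a missing branch yields None (null)."""
    branch = "then" if is_true(call["if"]) else "else"
    return call.get(branch)

# The full form from the example above.
print(expand_if({"if": True,
                 "then": ["shutdown database"],
                 "else": ["shutdown not required"]}))

# The short form: the condition is true but there is no 'then' branch.
print(expand_if({"if": True, "else": "This value if false or Null"}))
```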

=== Testing equality with `==`

Macros can have almost any name; this one is the symbol '=='. It expands to `true` if all the items in the list are equal, otherwise to `false`. It is most often used inside an enclosing `if` macro.

[source,YAML]
----
{ ==: [arg1, arg2, ...] }
----

Example:

[source,YAML]
----
{ ==: [1, 1, 10] }
----

Produces the value `false`.
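
In Python terms the test amounts to comparing every item with the first one. This is a sketch with a hypothetical `all_equal` helper; how Goyamp compares values of different types is not covered here.

```python
def all_equal(items):
    """True when every item equals the first; vacuously true for short lists."""
    return all(item == items[0] for item in items[1:])

print(all_equal([1, 1, 10]))  # the example above: the last item differs
print(all_equal([1, 1, 1]))
```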

=== Preventing Expansion with `quote`

The `quote` macro does not expand its argument; it returns it unexpanded.

Example:

[source,YAML]
----
- define: { data1: { sub: 2}}
- data1.sub
- quote: data1.sub
----

Produces

[source,YAML]
----
- 2
- data1.sub
----

=== Looping with `repeat`

This macro repeatedly expands the same object, either returning a list or a map. If the `key` argument is present it returns a map, using the `key` argument as the item's key. This must have embedded variables derived from the looping execution otherwise there will be a key collision error. With no `key` argument, it returns a list.

[source,YAML]
----
repeat:
  for: <loop variable name>
  in: [list of items]
  key: <key string> # Optional
  body:
    <any YAML>
----

Example - returning a dictionary:

[source,YAML]
----
repeat:
  for: environment_name
  in:
    - DEV1
    - SVT
    - PROD
  key: 'Deploy_App_{{environment_name}}'
  body:
    stage: step
----

Produces:

[source,YAML]
----
Deploy_App_DEV1:
  stage: step
Deploy_App_PROD:
  stage: step
Deploy_App_SVT:
  stage: step
----

Example - returning a list:

[source,YAML]
----
repeat:
  for: loop_variable
  in: {range: [1,3]}
  body:
    loop_variable: 'KEY_{{loop_variable}}'
    some: step
    another:
----

Produces:

[source,YAML]
----
- another: null
  loop_variable: KEY_1
  some: step
- another: null
  loop_variable: KEY_2
  some: step
- another: null
  loop_variable: KEY_3
  some: step
----

Example - looped list with changing keys. Here the keys and values of a child map are changed:

[source,YAML]
----
repeat:
  for: loop_variable
  in: {range: [12,13]}
  body:
    'index_{{loop_variable}}': { +: [100, loop_variable] }
    some: step
----

Produces:

[source,YAML]
----
- index_12: 112
  some: step
- index_13: 113
  some: step
----
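
The two result shapes of `repeat` (a list without `key`, a map with `key`) can be sketched in Python. This is a loose model under stated assumptions: the `repeat` and `interpolate` functions are hypothetical, and the body is passed as a Python function of the loop bindings instead of a YAML tree.

```python
import re

def interpolate(text, bindings):
    """Replace each {{name}} in text with the bound value."""
    return re.sub(r"\{\{\s*(\w+)\s*\}\}",
                  lambda m: str(bindings[m.group(1)]), text)

def repeat(var, items, body, key=None):
    """Expand body once per item, returning a list, or a map when key is given."""
    if key is None:
        return [body({var: item}) for item in items]
    result = {}
    for item in items:
        bindings = {var: item}
        name = interpolate(key, bindings)
        if name in result:                # a constant key would collide
            raise KeyError(f"key collision: {name}")
        result[name] = body(bindings)
    return result

stages = repeat("environment_name", ["DEV1", "SVT", "PROD"],
                lambda env: {"stage": "step"},
                key="Deploy_App_{{environment_name}}")
print(stages)
```

The collision check shows why the `key` string needs an embedded loop variable: without one, every iteration would produce the same key.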

=== Looping with `range`

The `range` macro substitutes a list of numbers that can be used in `repeat` macros. (Or anywhere else a list of numbers is needed). The start and end values are passed as a list argument. The range can count up or down, always by one.

[source, YAML]
----
range: [3,5]
----

Produces `[3,4,5]`
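
Unlike Python's built-in `range`, the end value is included. A minimal Python sketch of an inclusive range that counts up or down by one follows; the name `inclusive_range` is made up, and the descending result is inferred from the description above.

```python
def inclusive_range(start, end):
    """List the integers from start to end inclusive, stepping by one."""
    step = 1 if end >= start else -1
    return list(range(start, end + step, step))

print(inclusive_range(3, 5))  # counts up
print(inclusive_range(5, 3))  # counts down
```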

`range` also accepts a map object, in which case it expands the sequence of map keys. For example

[source, YAML]
----
- define: {map: {ra: 879, rb: 662}}
- range: map
----

Produces `[ra, rb]`. This can then be used in `repeat` to loop over the items in a map. Dot notation is used to expand individual members of the map.
For example, here the loop variable is set to `ra` then `rb`, so `map.keyz` resolves to `879` and then `662`:

[source, YAML]
----
repeat:
  for: keyz
  in: {range: map}
  body:
    map.keyz
----

Be aware that map keys in data (such as `ra`) might conflict with already defined variables.

=== Combining Lists with `flatten`

Sometimes you need to combine lists, perhaps from different macro expansions. The `flatten` macro combines multiple lists into a single, flat, list. The flattening is recursive. Syntax:

[source,YAML]
----
flatten: < list of objects >
----

For example:

[source,YAML]
----
define: {home-directories: [/home/elvis, /home/madonna]}
---
flatten: [[home-directories], /var, /log]
---
flatten: [1, 2, [3], [[4, 5]], [[[ 6,7]]] ]
----

Produces:

[source,YAML]
----
- /home/elvis
- /home/madonna
- /var
- /log
---
- 1
- 2
- 3
- 4
- 5
- 6
- 7
----

=== Combining One Level of Lists with `flatone`

The `flatone` macro combines multiple lists into a single, flat, list. The flattening is *not* recursive, only the first level is flattened. Syntax:

[source,YAML]
----
flatone: < list of objects >
----

For example:

[source,YAML]
----
flatone: [1, 2, [3], [[4, 5]], [[[ 6,7]]] ]
----

Produces:

[source,YAML]
----
- 1
- 2
- 3
- - 4
  - 5
- - - 6
    - 7
----
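
The difference between `flatten` and `flatone` is easiest to see side by side. The following Python functions are a sketch of the two behaviours as described, not the Goyamp source:

```python
def flatten(items):
    """Recursively splice nested lists into one flat list."""
    out = []
    for item in items:
        if isinstance(item, list):
            out.extend(flatten(item))
        else:
            out.append(item)
    return out

def flatone(items):
    """Splice only the first level of nested lists."""
    out = []
    for item in items:
        if isinstance(item, list):
            out.extend(item)
        else:
            out.append(item)
    return out

nested = [1, 2, [3], [[4, 5]], [[[6, 7]]]]
print(flatten(nested))
print(flatone(nested))
```

`flatone` removes exactly one layer of nesting, so `[[4, 5]]` contributes the single item `[4, 5]`.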

=== Combining Maps with `merge`

The `merge` macro takes a list of maps and merges them together to make a single map. When keys are shared between the supplied maps, the last one seen wins and overwrites the earlier value. Hence the order in the list dictates the priority. Merge is NOT recursive; it merges one level of the maps provided. Syntax:

[source,YAML]
----
merge: < list of maps >
----

For example:

[source,YAML]
----
merge:
- { a : 1 }
- { b : 2 }
- { c : 3 , a : -1}
----

Produces:

[source,YAML]
----
a: -1
b: 2
c: 3
----
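
The same last-one-wins, one-level behaviour can be sketched in Python with `dict.update`. This is an illustration only, and the `merge` function here is not Goyamp's code:

```python
def merge(maps):
    """Shallow-merge a list of maps; later maps override earlier keys."""
    result = {}
    for m in maps:
        result.update(m)
    return result

print(merge([{"a": 1}, {"b": 2}, {"c": 3, "a": -1}]))

# Not recursive: a nested map is replaced whole, not merged key by key.
print(merge([{"net": {"ip": "1.1.1.1", "site": "Kansas"}},
             {"net": {"ip": "2.2.2.2"}}]))
```

In the second call the `site` key is lost, because the later `net` map replaces the earlier one entirely.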

A more complex example shows combining data from multiple sources:

[source,YAML]
----
- define:
    network-data:
      hostname: tetris.games.org
- defmacro:
    name: mymacro
    args: [arg1]
    value:
      hostname: arg1
      ip: 1.1.1.1
      app: tetris
- merge:
    - { hostname: tetris.home.org }
    - { site: Kansas }
    - mymacro:
        arg1: tetris
    - network-data
----

Which boils down to:

[source,YAML]
----
- app: tetris
  hostname: tetris.games.org
  ip: 1.1.1.1
  site: Kansas
----

=== Arithmetic with `+`

The `+` macro adds a list of numbers, int or float.

[source, YAML]
----
+: [1,2,4,8]
----

Produces `15`

=== Reading files with `include`

`include` reads and expands the list of Goyamp YAML files in order. The filenames can be the result of prior macro expansion. So derived filenames like "{{ROOT_DIR}}/{{arch}}/config.yaml" are possible.

[source, YAML]
----
include:
  - <file name>
  - <file name>
----

=== Reading Data Files

Sometimes you want to use raw data for parameters and variable values. For example you may have an inventory or database of facts. Goyamp can load YAML or JSON data.

==== Reading Data with `load`

The `load` macro reads a single file of YAML or JSON data and returns the result. No variable substitutions or macro expansions are performed on the data. YAML data is returned as a list, one object for each 'doc'. footnote:[YAML files are subdivided into 'docs' separated by '---']

[source, YAML]
----
{load: <file name>}
----

Examples:

[source, YAML]
----
- define: {name: file, value: 'load_data.yaml'}
- define:
    name: somedata
    value: {load: file}
- define:
    movie1: {load: '../test/fixtures/blade-runner.json'}
----

==== Loading Shell Script Data

When you have shell variables in files which you want to use as input to expansion, you can load them into the environment of the Goyamp execution. For example here's a script with some dynamic data:

.data.sh
[source,bash]
----
export VARIABLE1=value1
export VARIABLE2="${VARIABLE1}_value2"
export VARIABLE3="${VARIABLE2}_value3"
----

The shell script must be executed to determine the values. To load this into the Goyamp environment, use shell wrappers like this:

[source,bash]
----
$ env -i bash --noprofile --norc -c '. data.sh ; echo env | goyamp'
----

How does this work?

* `env -i bash` creates a bash process with an empty environment.
* `--noprofile --norc` prevent bash from reading profile files on startup
* `-c '. data.sh` sources the shell script in the current (empty) environment
* `echo env | goyamp` runs Goyamp with an input of just `env` - this will output all the environment variables

The YAML output contains the variables we want plus a couple of variables `bash` always needs:

[source, Shell]
----
PWD: /home/birchb/workspace/goyamp
SHLVL: '1'
VARIABLE1: value1
VARIABLE2: value1_value2
VARIABLE3: value1_value2_value3
_: /usr/bin/python
----

=== Executing External Programs with `execute:`

The `execute` builtin runs subprocesses and sends data to and from them. The syntax has two forms,
the first takes a string argument:

[source, yaml]
----
execute: <command string>
----

The result is expanded as a string.

The second form allows full control over the execution:

[source, yaml]
----
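execute:
  command: <the name of the program to run>
  args: <a sequence of strings containing the command-line arguments>
  environment: <a map of strings containing environment additions for the process>
  directory: <the directory to run the command from>

  response-type: "string"|"lines"|"json"|"yaml" - default "lines"
  request-type: "string"|"lines"|"json"|"yaml" - default "lines"
  request: <the data to send to the process on its standard input>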
----

After execution, the stdout of the process is returned as the result processed according to the response-type value. If there is
an error during execution the goyamp process stops with status '2'.

Each argument is used as follows

*command*

This is the name of the file to be run, which should be on the `$PATH` or be an absolute path.

*args*

These are the command-line arguments in a sequence of strings.

*environment*

By default, the environment of the subprocess is inherited from the goyamp process. Additional environment variables for the command can be set with `environment`. If the variable already exists the values overwrite existing ones.

[source, yaml]
----
execute:
  command: some-script.sh
  environment:
    USER: overwrites old USER
    X: A new variable
----

*directory*

The command is run from the directory specified. The default is the user's current directory. Example:

[source, yaml]
----
execute:
  command: cat
  directory: "{{__DIR__}}/../test/fixtures"
  args: [ blade-runner.json ]
  response-type: json
----

*response-type*

When the process runs, output is sent to its standard output; we'll call that the 'response'. Goyamp reads the response and parses it. `response-type` specifies how Goyamp should handle the response from the sub-process. The default is `lines`. The values are:

* `string` - all the response is returned as a single string. Useful for programs like `date`,
* `lines` - a sequence is returned, containing one item for each line of the response,
* `json` - the response is expected to be JSON, it is parsed and returned,
* `yaml` - the response is YAML, the first 'document' in the response is parsed and returned.

*request-type*

Before the process runs, Goyamp serialises the `request` data ready to send on the standard input. We'll call this data the 'request'. `request-type` specifies how Goyamp should print the data. The default is `lines`; the options are:

* `string` - the request is serialised as a single string. Useful for programs like 'bash' which can execute a multi-line string. This provides a way to embed scripts in Goyamp files.
* `lines` - a sequence is expected, each item is printed on a separate line.
* `json` - the request is converted to JSON,
* `yaml` - the request is converted to YAML.
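
The request and response handling can be pictured as serialise, run, parse. The Python sketch below models that flow with the standard `subprocess` module. It is an approximation under stated assumptions: the function names are hypothetical, the `yaml` types are omitted because Python has no YAML parser in its standard library, and a failing command raises an exception where Goyamp would exit.

```python
import json
import subprocess
import sys

def serialise(request, request_type):
    """Turn the request data into text for the subprocess's standard input."""
    if request_type == "string":
        return str(request)
    if request_type == "lines":
        return "".join(f"{item}\n" for item in request)
    if request_type == "json":
        return json.dumps(request)
    raise ValueError(f"unsupported request-type: {request_type}")

def parse(response, response_type):
    """Turn the subprocess's standard output into data."""
    if response_type == "string":
        return response
    if response_type == "lines":
        return response.splitlines()
    if response_type == "json":
        return json.loads(response)
    raise ValueError(f"unsupported response-type: {response_type}")

def execute(command, args=(), request=None,
            request_type="lines", response_type="lines"):
    """Run the command, feeding it the request and parsing its response."""
    stdin = None if request is None else serialise(request, request_type)
    done = subprocess.run([command, *args], input=stdin,
                          capture_output=True, text=True, check=True)
    return parse(done.stdout, response_type)

# Sort some hostnames with a tiny Python child process (stands in for `sort`).
sorter = "import sys; sys.stdout.writelines(sorted(sys.stdin))"
print(execute(sys.executable, ["-c", sorter], request=["web02", "db01", "web01"]))
```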

==== Examples of `execute:`

===== An empty environment

To build an empty environment use the Linux `env -i` command in a subshell. For example:

[source, yaml]
----
define:
  some_int_variable1: 2342
  some_string_variable1: Hello World
---
execute:
  command: bash
  args: [ -c , '/usr/bin/env -i - inherit1=$some_int_variable1 inherit2="$some_string_variable1" env' ]
  response-type: lines
  environment:
    some_int_variable1: 2342
    some_string_variable1: Hello World
----

Produces

[source, yaml]
----
- inherit1=2342
- inherit2=Hello World
----

===== Examples of `response-type`s

_string_

[source, shell]
----
$ echo '{execute: {command: date, args: [+%d.%m.%Y], response-type: string}}' | ./goyamp
---
23.06.2019
----

_lines_ here we get a sequence of IP addresses:

[source, YAML]
----
execute:
  command: bash
  request-type: string
  request: nmap -n -sL 192.168.0.0/30 | grep 'Nmap scan report for' | awk '{print $5}'
  response-type: lines
----

Which produces:

[source, YAML]
----
- 192.168.0.0
- 192.168.0.1
- 192.168.0.2
- 192.168.0.3
----

_json_ in this example we extract information about the CPUs on the machine:

[source, YAML]
----
execute:
  command: facter
  args: [--json, processors]
  response-type: json
----

Produces:

[source, YAML]
----
processors:
  count: 2
  models:
  - Intel(R) Core(TM)2 Duo CPU P8400 @ 2.26GHz
  - Intel(R) Core(TM)2 Duo CPU P8400 @ 2.26GHz
  physicalcount: 1
----

_yaml_ example:

[source, YAML]
----
execute:
  command: cat
  directory: $fixtures
  args: [ variety.yaml ]
  response-type: yaml
----

===== Examples of `request-type`s

_string_ Here's a multi-line Python script, embedded in YAML, that prints a list of dates:

[source, YAML]
----
execute:
  command: python
  response-type: lines
  request-type: string
  request: |
    from datetime import timedelta, date

    def daterange(start_date, end_date):
        # Yield each date from start_date up to, but not including, end_date
        for n in range(int((end_date - start_date).days)):
            yield start_date + timedelta(n)

    start_date = date(2019, 1, 1)
    end_date = date(2019, 1, 5)
    for single_date in daterange(start_date, end_date):
        # print() with parentheses works in both Python 2 and Python 3
        print(single_date.strftime("%Y-%m-%d"))
----

Produces:

[source, YAML]
----
- "2019-01-01"
- "2019-01-02"
- "2019-01-03"
- "2019-01-04"
----

_lines_ Here we sort a list of hostnames in a sequence, and get that back as a sequence:

[source, YAML]
----
defmacro:
  name: $sort
  args: $items
  value:
    execute:
      command: sort
      response-type: lines
      request-type: lines
      request: $items
---
$sort:
  - ip-12-34-56-78.us-west-2.compute.internal
  - ec2-12-43-56-78.ap-southeast-2.compute.amazonaws.com
  - ip-12-34-56-78.us-east-2.compute.internal
  - ip-12-34-65-99.us-west-2.compute.internal
  - ec2-12-34-56-78.ap-southeast-2.compute.amazonaws.com
----

Produces:

[source, YAML]
----
- ec2-12-34-56-78.ap-southeast-2.compute.amazonaws.com
- ec2-12-43-56-78.ap-southeast-2.compute.amazonaws.com
- ip-12-34-56-78.us-east-2.compute.internal
- ip-12-34-56-78.us-west-2.compute.internal
- ip-12-34-65-99.us-west-2.compute.internal
----

_json_ In this example we use curl to get JSON data from the GitHub API - a set of commit messages.
Then we send the data as JSON to 'jq' which filters it.

[source, YAML]
----
execute:
  command: jq
  args: ["[.[] | {message: .commit.message, name: .commit.committer.name}]"]
  request-type: json
  response-type: json
  request:
    execute:
      command: curl
      args: ["https://api.github.com/repos/birchb1024/goyamp/commits?per_page=3"]
      response-type: json
----

Produces:

[source, YAML]
----
- message: Add execute. Change from __PATH__ to __DIR__. Add pwd as __DIR__
  name: Peter William Birch
- message: Additions to execute (still in progress)
  name: Peter William Birch
- message: Add Stringer() to yamly. Fail on undefined in {{}}
  name: Bill Birch
----
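
_yaml_ The examples above do not cover `request-type: yaml`, so here is a minimal sketch rather than a tested fixture. It assumes the standard `cat` utility, which copies its standard input to its standard output: the `request` map is serialised as YAML, passed through `cat`, and parsed back with `response-type: yaml`. The keys `name` and `replicas` are made up for illustration.

[source, YAML]
----
execute:
  command: cat
  request-type: yaml
  response-type: yaml
  request:
    name: example
    replicas: 3
----

If the round trip behaves as described, the result should be the same map that was sent.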

=== Executing Lua Scripts with the Embedded Interpreter

You can make complex manipulations of the YAML data with the Lua 5.1 interpreter embedded in Goyamp. 'Gopher Lua' is written entirely in Go. You can read about https://github.com/yuin/gopher-lua[gopherlua here], and https://www.lua.org/manual/5.1/[Lua 5.1 here].

To use Lua, invoke the interpreter with the `gopherlua:` key and pass it a YAML structure. The YAML structure is converted into Lua tables and stored in the `args` global variable, where your script can access it. At the end of execution you pass data back to Goyamp in the `result` Lua global variable. This becomes the value of the `gopherlua:` map, which is substituted in the output.

Each time you invoke `gopherlua:`, a new interpreter is created, and destroyed at the end. The Lua initialisation process is:

1. The location of the goyamp binary file is determined and saved to the global variable `executable_directory`

2. The `package.path` variable is set to `+__DIR__/?.lua;+` followed by the value of the environment variable `LUA_PATH`, if `LUA_PATH` is present;
otherwise it is set to the default `+__DIR__/?.lua;./?.lua;./?.lc;/lib/?.lua;/lib/?.lc+`. This means Lua will pick up files in `require()` calls from the `lib/` directory wherever goyamp is installed. It will also pick up scripts relative to the current YAML file.

3. The Lua interpreter attempts to require `init.lua` from the package.path. If it isn't present there is no error or warning message unless you run with `-d`.

4. Then the global variable `args` is set to the value of the YAML args: element. The string in script: is executed, and the value of `result` is returned to Goyamp.

Goyamp uses these global variables inside the Lua interpreter:

* `+__DIR__+` - Directory containing the current enclosing YAML script
* `args` - Holds the input args: argument
* `executable_directory` - Directory holding the goyamp binary, useful for path manipulations
* `result` - where the result of the Lua execution is placed for return to Goyamp
* `seqy` - the metatable attached to YAML sequence (list) tables
* `mapy` - the metatable for YAML map tables
* `nily` - variable contains the userdata object used for YAML null values
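
As a rough illustration of these globals, the sketch below returns a few of them as a map so you can inspect what the interpreter sees. It assumes the variables hold plain Lua strings; the key names `dir`, `exe` and `path` and the placeholder `args` value are made up for this example.

[source, YAML]
----
gopherlua:
  args: ignored
  script: |-
    -- Collect some of the globals listed above into a table
    local info = {
      dir = __DIR__,
      exe = executable_directory,
      path = package.path,
    }
    -- Mark the table explicitly as a YAML map before returning it
    result = setmetatable(info, mapy)
----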

The gopherlua: syntax is as follows:

[source, YAML]
----
gopherlua:
  args: # This is where you pass a YAML structure to Lua.
  script: # This is a Lua script which is executed.
----

Here are some examples:

To return an uppercase version of a string we use the Lua string.upper() function.

[source, YAML]
----
gopherlua:
  args: we are groot
  script: "result = string.upper(args)"
----

To sort a list we can use the Lua table.sort() function.

[source, YAML]
----
gopherlua:
  args: [X,K,A]
  script: "table.sort(args); result = args"
----

Here is a more complex example. We want to turn all the elements in a YAML structure
to uppercase. Granted, this could be done with shell tools, but this example shows a recursive
tree walk function. YAML allows multi-line strings, which are convenient for medium-length
scripts. Longer scripts can be put into source files and loaded by Lua with `require()`.

[source, YAML]
----
gopherlua:
  args:
    a:3 : 22
    str: "a lower case string"
    arr: { x: , y: 2 }
    list: [1,2,3]
  script: |-

    -- Uppercase all strings in a YAML tree
    function uppertree(t)
      local tt = type(t)
      if tt == "string" then
        return string.upper(t)
      elseif tt == "table" then
        local k, v = next(t, nil)
        local result = {}
        while k do
          if type(k) == "string" then
            result[string.upper(k)] = uppertree(v)
          else
            result[k] = uppertree(v)
          end
          k, v = next(t, k)
        end
        return result
      else
        return t
      end
    end
    result = uppertree(args)
----

This example shows how to load a standalone Lua file using `require()`. Having Lua code in separate files is handy since your favourite editor will give you syntax highlighting and formatting. You can also run your Lua scripts 'offline' with the gopher-lua standalone executable, `glua`, which is available https://github.com/yuin/gopher-lua#standalone-interpreter[here].

In this example we have stashed the script in a YAML variable, `$deepmerge`. This allows us to use it in many different `gopherlua:` calls. The file `deepmerge.lua` is in the goyamp release in the lib/ directory.

[source, YAML]
----
define:
  $deepmerge: |-
    dm = require('deepmerge')
    result = dm.deep_merge(args[1], args[2])
---
gopherlua:
  script: $deepmerge
  args: etc, etc
----

In Lua you can add more to the path with `package.path = package.path .. ";my/directory/?.lua"`.
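
A minimal sketch of where that statement might go: extend `package.path` before the first `require()` that depends on it. The directory `my/directory`, the module name `mymodule` and its function `transform` are hypothetical and stand in for your own code.

[source, YAML]
----
gopherlua:
  args: [1, 2, 3]
  script: |-
    -- Add an extra search location, then load a module from it
    package.path = package.path .. ";my/directory/?.lua"
    local m = require('mymodule')   -- hypothetical module
    result = m.transform(args)      -- hypothetical function
----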

==== Lua Conversion

Lua tables use `nil` to signal absence of an entry rather than holding the value `nil`. To work around this, Goyamp converts YAML `null` to a `userdata` object which is stored in the global Lua variable `nily`.

Lua does not have separate data types for arrays and maps; it uses the `table` type for both. Hence issues arise when working with YAML data, which _does_ differentiate. This is handled by the custom metatables `mapy` and `seqy`. When collections are transferred to Gopher Lua their metatables are set to either `mapy` or `seqy`. Likewise, when a Lua table is ambiguous in a result you can clarify this with `setmetatable`. For example, `setmetatable(x, mapy)` ensures that Goyamp sees this result item as a map.
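
The sketch below illustrates both conversions. It compares values against `nily` to detect YAML nulls, and uses `setmetatable` with `seqy` so that a possibly empty Lua table is returned as a sequence. It assumes that every converted null is the same `nily` object, so that `==` comparison works; the input keys are invented for the example.

[source, YAML]
----
gopherlua:
  args: { a: 1, b: null, c: 3 }
  script: |-
    -- Collect the names of all keys whose value is YAML null
    local missing = {}
    for k, v in pairs(args) do
      if v == nily then
        table.insert(missing, k)
      end
    end
    -- An empty Lua table is ambiguous, so mark the result as a sequence
    result = setmetatable(missing, seqy)
----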

=== Quitting Early with `exit`

Sometimes you will want the script to simply stop processing. The `exit:` builtin halts execution by calling the operating system exit() function. You can provide the status for the process as an argument. If the argument is empty, null, 0 or the string `"0"`, the process status is zero. If an integer or a string containing an integer is provided, this becomes the exit status of the process.

Example:

[source, YAML]
----
if:
  ==: [p1, p2]
else:
  exit: 3
----

This quietly exits with code 3.

=== Enforcing Safety with `panic`

When processing becomes more complex you may want to implement checks on input data. The `panic:` macro
halts execution and prints the supplied message. Combined with `==:`, it lets you code a variety
of check macros. For example, here is a macro that ends processing if two things do not match:

[source, YAML]
----
defmacro:
  name: assert_equal
  args: [p1, p2]
  value:
    if:
      ==: [p1, p2]
    else:
      panic: "ASSERT FAILED {{p1}} != {{p2}} {{__SOURCE__}}"
---
assert_equal:
  p1: 12
  p2: 23
----

Produces this on stderr:

[source, shell]
----
panic: ASSERT FAILED 12 != 23 { assert_equal : { p1 : 12 , p2 : 23 } }
----

With the `-d` command-line option, a backtrace is also printed.

=== Builtin Variables

Goyamp automatically populates some variables as it executes. These are:

* `env` - the process environment

* `argv` - the command line arguments

* `+__VERSION__+` - the Goyamp version number

* `+__FILE__+` - the current source filename

* `+__DIR__+` - the directory pathname of the current source file

* `+__SOURCE__+` - the expression passed into the currently executing macro - useful for debugging your macros.
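
For instance, assuming these variables can be interpolated into strings with the same double-brace syntax used in the `panic:` example, a generated file could record where it came from. This is a sketch only; the keys `generated_by`, `source_file` and `source_dir` are made up for illustration.

[source, YAML]
----
generated_by: "goyamp {{__VERSION__}}"
source_file: "{{__FILE__}}"
source_dir: "{{__DIR__}}"
----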

== Using the Goyamp Go Module

The goyamp Go module can be used as a component of other programs. The 'main' of goyamp itself uses the module's API and can be used as an example. Here is a simplified version:

[source, Go]
----

// Import the module
import (
	"os"

	"github.com/birchb1024/goyamp"
)

// Create an instance of the macro-processor engine
// providing a list of command arguments, an environment, an output writer and an output format flag.

engine := goyamp.NewExpander(commandArgs, os.Environ(), os.Stdout, outFormat)

// either process a stream, giving a Reader
err := engine.ExpandStream(os.Stdin, "-")
if err != nil {
	panic(err)
}

// or process a file (err is already declared above, so assign with =)
err = engine.ExpandFile("examples/macros.yaml")
if err != nil {
	panic(err)
}

----
== Maintenance of Goyamp

=== Build from Source

Source code is on GitHub https://github.com/birchb1024/goyamp[here].

First install dependencies (Ubuntu)

[source, Shell]
----

$ sudo apt install asciidoctor
$ sudo apt install source-highlight
: Install the source-highlighter for YAML - Following these instructions https://gist.github.com/AlexZeitler/48813447f253360ccc431ae22d6939fd

$ sudo -H bash
$ curl https://gist.githubusercontent.com/AlexZeitler/48813447f253360ccc431ae22d6939fd/raw/1c1d9372cce5fb2b568b2dd953d334ef8fe3f33d/yaml.lang > /usr/share/source-highlight/yaml.lang
$ for X in yml yaml
do
echo "$X = yaml.lang" >> /usr/share/source-highlight/lang.map
done
----

Build

[source, Shell]
----
$ git clone https://github.com/birchb1024/goyamp
$ cd goyamp
$ build.sh # For executables

$ build.sh coverage # Test coverage detailed report

$ build.sh doc # For Asciidoc to HTML

$ build.sh package # To make a releasable tar file with document, examples and executables.
----

=== Code

Run the unit tests with `cd test; go test`

=== Updating This Document

This document is in http://www.methods.co.nz/asciidoc/[AsciiDoc] format. Use the Linux `asciidoc` packages. To highlight the YAML syntax, also install `source-highlight` and the https://gist.github.com/AlexZeitler/48813447f253360ccc431ae22d6939fd[YAML syntax module]. Save the HTML version in `doc/README.html`.

=== Known Issues

See the `Issues` in the https://github.com/birchb1024/goyamp[Goyamp GitHub project].