https://github.com/aCLImatise/CliHelpParser

Reads the output from CLI help commands, and generates machine readable schemas (CWL etc)
bioinformatics cli cwl parser pipeline wdl workflow


aCLImatise
***********
|DOI|

.. |DOI| image:: https://zenodo.org/badge/DOI/10.1093/bioinformatics/btaa1033.svg
   :target: https://doi.org/10.1093/bioinformatics/btaa1033

For the full documentation, refer to the `GitHub Pages Website `_.

======================================================================

aCLImatise is a Python library and command-line utility for parsing the help output
of a command-line tool and then outputting a description of the tool in a more
structured format, for example a
`Common Workflow Language tool definition `_.

Currently aCLImatise supports both `CWL `_ and
`WDL `_ outputs. Other formats will be considered in the future, and pull
requests that add support for them are especially welcome.

Please also refer to `The aCLImatise Base Camp `_, which is a database of pre-computed tool definitions
generated by the aCLImatise parser. Most bioinformatics tools have a tool definition already generated in the Base Camp,
so you may not need to run aCLImatise directly.

aCLImatise is now published in the journal *Bioinformatics*. You can read the application note here: https://doi.org/10.1093/bioinformatics/btaa1033.
To cite aCLImatise, please use the citation generator provided by the journal.

Example
-------

Let's say you want to create a CWL workflow containing the common Unix ``wc`` (word count)
utility. Running ``wc --help`` returns:

.. code-block::

    Usage: wc [OPTION]... [FILE]...
      or:  wc [OPTION]... --files0-from=F
    Print newline, word, and byte counts for each FILE, and a total line if
    more than one FILE is specified.  A word is a non-zero-length sequence of
    characters delimited by white space.

    With no FILE, or when FILE is -, read standard input.

    The options below may be used to select which counts are printed, always in
    the following order: newline, word, character, byte, maximum line length.
      -c, --bytes            print the byte counts
      -m, --chars            print the character counts
      -l, --lines            print the newline counts
          --files0-from=F    read input from the files specified by
                               NUL-terminated names in file F;
                               If F is - then read names from standard input
      -L, --max-line-length  print the maximum display width
      -w, --words            print the word counts
          --help     display this help and exit
          --version  output version information and exit

    GNU coreutils online help:
    Full documentation at:
    or available locally via: info '(coreutils) wc invocation'
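
To make the parsing task concrete, here is a minimal Python sketch that pulls flag
definitions out of help text like the output above. It is a toy illustration, not
aCLImatise's actual parser: the ``FLAG_LINE`` pattern and the ``parse_flags`` helper are
hypothetical names, and the regular expression only recognises lines of the form
``-c, --bytes  description`` or ``--files0-from=F  description``.

.. code-block:: python

    import re

    # Toy pattern for one flag per line: optional short form, long form,
    # optional "=ARG", then a description. Not aCLImatise's real grammar.
    FLAG_LINE = re.compile(
        r"^\s*(?:(?P<short>-\w),\s+)?"
        r"(?P<long>--[\w-]+)(?:=(?P<arg>\w+))?"
        r"\s+(?P<desc>\S.*)$"
    )


    def parse_flags(help_text):
        """Return one dict per line that looks like a flag definition."""
        flags = []
        for line in help_text.splitlines():
            match = FLAG_LINE.match(line)
            if match:
                flags.append(
                    {
                        "short": match["short"],
                        "long": match["long"],
                        "takes_value": match["arg"] is not None,
                        "description": match["desc"],
                    }
                )
        return flags


    sample = (
        "  -c, --bytes            print the byte counts\n"
        "      --files0-from=F    read input from the files specified by\n"
    )
    for flag in parse_flags(sample):
        print(flag["long"], flag["takes_value"])

Usage lines and wrapped description lines are skipped because they do not start with
a flag. Real help output is far less regular than this sample, which is why a single
regular expression like this one is only suitable as an illustration.
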

If you run ``aclimatise explore wc``, which means "parse the wc command and all subcommands",
you'll end up with the following files in your current directory:

* ``wc.cwl``
* ``wc.wdl``
* ``wc.yml``

These are representations of the command ``wc`` in three different formats. If you look at ``wc.wdl``, you'll see that it
contains a WDL-compatible tool definition for ``wc``:

.. code-block:: text

    version 1.0
    task Wc {
      input {
        Boolean bytes
        Boolean chars
        Boolean lines
        String files__from
        Boolean max_line_length
        Boolean words
      }
      command <<<
        wc \
          ~{true="--bytes" false="" bytes} \
          ~{true="--chars" false="" chars} \
          ~{true="--lines" false="" lines} \
          ~{if defined(files__from) then ("--files0-from " + '"' + files__from + '"') else ""} \
          ~{true="--max-line-length" false="" max_line_length} \
          ~{true="--words" false="" words}
      >>>
    }
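
The way boolean flags map onto this task can be sketched in a few lines of Python.
This is an illustration under stated assumptions, not the code aCLImatise uses: the
``wdl_name`` and ``render_task`` helpers are hypothetical, only boolean flags are
handled, and the naming rule is a simplification (the generated file above shows that
aCLImatise applies its own rule, for example ``files__from``).

.. code-block:: python

    def wdl_name(long_flag):
        """Turn a flag such as "--max-line-length" into an identifier."""
        return long_flag.lstrip("-").replace("-", "_")


    def render_task(tool, boolean_flags):
        """Render a minimal WDL 1.0 task that exposes boolean flags only."""
        names = [wdl_name(flag) for flag in boolean_flags]
        lines = ["version 1.0", f"task {tool.capitalize()} {{", "  input {"]
        lines += [f"    Boolean {name}" for name in names]
        lines += ["  }", "  command <<<", f"    {tool} \\"]
        # Each boolean input becomes a placeholder that emits the flag
        # when true and nothing when false.
        args = [
            f'      ~{{true="{flag}" false="" {name}}}'
            for flag, name in zip(boolean_flags, names)
        ]
        lines.append(" \\\n".join(args))
        lines += ["  >>>", "}"]
        return "\n".join(lines)


    print(render_task("wc", ["--bytes", "--max-line-length"]))

Each flag contributes one ``Boolean`` input and one ``~{true=... false=...}``
placeholder, which is the same pattern that appears in the generated ``wc.wdl``.
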