

https://github.com/heuermh/sea-eagle

Command line tools for AWS Athena.

Host: GitHub
URL: https://github.com/heuermh/sea-eagle
Owner: heuermh
License: apache-2.0
Created: 2023-11-03T15:24:31.000Z (about 1 year ago)
Default Branch: main
Last Pushed: 2024-08-26T17:43:58.000Z (2 months ago)
Last Synced: 2024-08-26T20:56:25.090Z (2 months ago)
Topics: athena, cli, command-line, command-line-tool, tui
Language: Java
Homepage:
Size: 79.1 KB
Stars: 0
Watchers: 2
Forks: 0
Open Issues: 6
Metadata Files:
- Readme: README.md
- Funding: .github/FUNDING.yml
- License: LICENSE
- Citation: CITATION.cff
- Codemeta: codemeta.json


# sea-eagle

Command line tools for [AWS Athena](https://aws.amazon.com/athena/).

## Hacking sea-eagle

Install

 * JDK 17 or later, https://openjdk.java.net

 * Apache Maven 3.6.3 or later, https://maven.apache.org

To build

```bash
$ mvn package
$ export PATH="$PATH:$(pwd)/target/appassembler/bin"
```

## Using sea-eagle

### Usage

```bash
$ se --help
USAGE
  se [-hV] [--skip-header] [--skip-history] [-b=] [-c=] [-d=] [-f=]
     [-i=] [--left-pad=] [-n=] [-o=] [-q=] [-w=]
     [-p=]... [COMMAND]

OPTIONS
  -c, --catalog=                    Catalog name, if any.
  -d, --database=                   Database name, if any.
  -w, --workgroup=                  Workgroup, default primary.
  -b, --output-location=            Output location, if workgroup is not provided.
  -n, --polling-interval=           Query status polling interval, default 250 ms.
      --skip-header                 Skip writing header to results.
      --skip-history                Skip writing query to history file.
  -q, --query=                      Inline SQL query, if any.
  -i, --query-path=                 SQL query input path, default stdin.
  -p, --execution-parameters=       SQL query execution parameters, if any.
  -o, --results-path=               Query results path, default stdout.
  -f, --format, --results-format=   Query results format { pretty, sparse, text, parquet }, default text.
      --left-pad=                   Left pad query results, default 2 for pretty and sparse formats.
      --verbose                     Show additional logging messages.
  -h, --help                        Show this help message and exit.
  -V, --version                     Print version information and exit.

COMMANDS
  help                 Display help information about the specified command.
  generate-completion  Generate bash/zsh completion script for se.
```

### Environment variables

Note that the `catalog`, `database`, `workgroup`, and `output-location` options can also be specified by the environment variables `SE_CATALOG`, `SE_DATABASE`, `SE_WORKGROUP`, and `SE_OUTPUT_LOCATION`, respectively

```bash
$ export SE_CATALOG=catalog
$ SE_WORKGROUP=workgroup \
    se \
      ...
```
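When switching between several catalogs or databases, the environment variables can be grouped in a small shell function. The following is only a sketch: `se_profile` is a hypothetical helper name, not part of sea-eagle, and the catalog and database values are placeholders. It falls back to `primary` for the workgroup, mirroring the documented default.

```bash
# Hypothetical helper (not part of sea-eagle): se_profile CATALOG DATABASE [WORKGROUP]
se_profile() {
  export SE_CATALOG="$1"
  export SE_DATABASE="$2"
  # Fall back to "primary", the documented default workgroup.
  export SE_WORKGROUP="${3:-primary}"
}

# Placeholder names; later `se` invocations in this shell would see these values.
se_profile my_catalog my_database
echo "$SE_CATALOG $SE_DATABASE $SE_WORKGROUP"
```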

### SQL queries

SQL queries can be provided inline via the `-q`/`--query` option

```bash
$ se \
    ... \
    --query "SELECT * FROM table LIMIT 4"
```

By default the SQL query is read from `stdin`

```bash
$ echo "SELECT * FROM table LIMIT 4" | se \
    ...
```

Or the SQL query can be read from a file via the `-i`/`--query-path` option

```bash
$ echo "SELECT * FROM table LIMIT 4" > query.sql
$ se \
    ... \
    --query-path query.sql
```

### Execution parameters

SQL queries may contain `?`-style execution parameters to be substituted server side

```bash
$ se \
    ... \
    --query "SELECT * FROM table WHERE foo = ? AND bar > ? LIMIT 4" \
    --execution-parameters baz \
    --execution-parameters 100000
```
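Because each value needs its own `--execution-parameters` option, a script that takes a variable number of values has to repeat the option once per value. A minimal sketch of one way to do that in portable shell follows. `run_with_params` and the `SE` variable are hypothetical names introduced for illustration, and the sketch assumes the catalog, database, and workgroup are supplied through the `SE_*` environment variables.

```bash
# Hypothetical helper (not part of sea-eagle): run_with_params QUERY VALUE...
# SE names the executable to run; it defaults to `se` and exists only for this sketch.
run_with_params() {
  query=$1
  shift
  # Rotate through the positional parameters, re-appending each value
  # preceded by its own --execution-parameters option.
  remaining=$#
  while [ "$remaining" -gt 0 ]; do
    set -- "$@" --execution-parameters "$1"
    shift
    remaining=$((remaining - 1))
  done
  "${SE:-se}" --query "$query" "$@"
}

# Dry run: substitute `echo` for `se` to inspect the generated arguments.
SE=echo run_with_params "SELECT * FROM table WHERE foo = ? AND bar > ? LIMIT 4" baz 100000
```

The values are passed in the same order as the `?` placeholders appear in the query.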

Alternatively, variable substitution can be done on the client side via e.g. `envsubst`

```bash
# Escape the dollar sign so the shell writes a literal $FOO into the template
$ echo "SELECT * FROM table WHERE foo = '\$FOO' LIMIT 4" > query.sql
$ export FOO=baz
$ envsubst < query.sql | se \
    ...
```
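The template file must contain a literal `$FOO` for `envsubst` to replace. One way to guarantee that, sketched here with a temporary file in place of `query.sql`, is a heredoc with a quoted delimiter, which stops the shell from expanding variables while the file is written. Passing `'$FOO'` as the SHELL-FORMAT argument to GNU `envsubst` limits substitution to that one variable. The sketch only prints the rendered query rather than piping it to `se`.

```bash
# Write the template with a quoted heredoc delimiter so the shell leaves $FOO untouched.
template=$(mktemp)
cat > "$template" <<'EOF'
SELECT * FROM table WHERE foo = '$FOO' LIMIT 4
EOF

export FOO=baz

# Substitute only $FOO; any other $-tokens in the query are left alone.
# Guarded because envsubst (GNU gettext) may not be installed everywhere.
if command -v envsubst >/dev/null 2>&1; then
  envsubst '$FOO' < "$template"
fi
```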

### SQL query history file

SQL queries are written to a history file `~/.se_history`, unless the `--skip-history` flag is present

```bash
$ se \
    ... \
    --query "SELECT * FROM table LIMIT 4"

$ se \
    ... \
    --skip-history \
    --query "SELECT * FROM table WHERE foo = 'top secret!!' LIMIT 4"

$ cat ~/.se_history
SELECT * FROM table LIMIT 4
```

### Output formats

#### Text and display formats

By default, results are written to `stdout` in tab-delimited text format.

This allows for easy integration with command line tools such as `cut`, `grep`, `awk`, `sed`, `uniq`, etc. for post-processing.

```bash
$ se \
    ... \
    --query "SELECT * FROM table LIMIT 2"
foo	bar	baz
2088090022	185762	232298
2044078009	113652	85962

$ se \
    ... \
    --query "SELECT * FROM table LIMIT 4" \
    --skip-header | cut -f 4 | sort -n
26603
67310
116988
164738
```
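As a further illustration of post-processing, the sketch below sums one column of tab-delimited output with `awk`. So that it runs without an Athena connection, `sample_results` is a hypothetical stand-in that prints the two sample rows shown above. In practice the `se` command would take its place in the pipeline.

```bash
# Hypothetical stand-in for `se ... --query "SELECT * FROM table LIMIT 2"`,
# reusing the sample header and rows from the example above.
sample_results() {
  printf 'foo\tbar\tbaz\n'
  printf '2088090022\t185762\t232298\n'
  printf '2044078009\t113652\t85962\n'
}

# Skip the header row (NR > 1) and sum the second tab-delimited column (bar).
total=$(sample_results | awk -F '\t' 'NR > 1 { sum += $2 } END { print sum }')
echo "$total"
```

With `--skip-header` the `NR > 1` guard would be unnecessary.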

Results may be formatted for display in the terminal in `sparse` format

```bash
$ se \
    ... \
    --query "SELECT * FROM table LIMIT 4" \
    --format sparse
      foo       bar       baz
   --------- --------- ---------
    1499494   2354616   5560703
     516330    758111   1623718
     113663    192870    137600
    1028323    960709    850306
```

and in `pretty` format

```bash
$ se \
    ... \
    --query "SELECT * FROM table LIMIT 4" \
    --format pretty
  +---------+---------+---------+
  |   foo   |   bar   |   baz   |
  +---------+---------+---------+
  | 1088718 | 1779849 | 5096779 |
  |   17560 |   40360 |   32204 |
  |      84 |    8273 |   47681 |
  |   52383 |  100406 |   86338 |
  +---------+---------+---------+
```

Results may be written to a file (and optionally compressed) via the `-o`/`--results-path` option

```bash
$ se \
    ... \
    --query "SELECT * FROM table LIMIT 4" \
    --results-path results.txt.zstd
```

#### Parquet format

Finally, results may be written out to a local Parquet file

```bash
$ se \
    ... \
    --query "SELECT * FROM table LIMIT 4" \
    --format parquet \
    --results-path results.parquet
```

...which can easily be loaded into e.g. [duckdb](https://duckdb.org/) for further post-processing.

```sql
$ duckdb
D SELECT * FROM read_parquet('results.parquet');
┌─────────┬─────────┬─────────┐
│   foo   │   bar   │   baz   │
│  int64  │  int64  │  int64  │
├─────────┼─────────┼─────────┤
│ 1670466 │ 2455819 │ 5386130 │
│ 1427967 │ 1990921 │ 3779556 │
│   66473 │   97877 │   73903 │
│    7767 │    7766 │    5888 │
├─────────┴─────────┴─────────┤
│ 4 rows            3 columns │
└─────────────────────────────┘
```