https://github.com/heuermh/sea-eagle
Command line tools for AWS Athena.
- Host: GitHub
- URL: https://github.com/heuermh/sea-eagle
- Owner: heuermh
- License: apache-2.0
- Created: 2023-11-03T15:24:31.000Z (about 1 year ago)
- Default Branch: main
- Last Pushed: 2024-08-26T17:43:58.000Z (2 months ago)
- Last Synced: 2024-08-26T20:56:25.090Z (2 months ago)
- Topics: athena, cli, command-line, command-line-tool, tui
- Language: Java
- Homepage:
- Size: 79.1 KB
- Stars: 0
- Watchers: 2
- Forks: 0
- Open Issues: 6
Metadata Files:
- Readme: README.md
- Funding: .github/FUNDING.yml
- License: LICENSE
- Citation: CITATION.cff
- Codemeta: codemeta.json
# sea-eagle
Command line tools for [AWS Athena](https://aws.amazon.com/athena/).
## Hacking sea-eagle
Install
* JDK 17 or later, https://openjdk.java.net
* Apache Maven 3.6.3 or later, https://maven.apache.org

To build
```bash
$ mvn package
$ export PATH=$PATH:`pwd`/target/appassembler/bin
```

## Using sea-eagle
### Usage
```bash
$ se --help
USAGE
se [-hV] [--skip-header] [--skip-history] [-b=] [-c=] [-d=] [-f=]
[-i=] [--left-pad=] [-n=] [-o=] [-q=] [-w=]
[-p=]... [COMMAND]

OPTIONS
-c, --catalog= Catalog name, if any.
-d, --database= Database name, if any.
-w, --workgroup= Workgroup, default primary.
-b, --output-location= Output location, if workgroup is not provided.
-n, --polling-interval= Query status polling interval, default 250 ms.
--skip-header Skip writing header to results.
--skip-history Skip writing query to history file.
-q, --query= Inline SQL query, if any.
-i, --query-path= SQL query input path, default stdin.
-p, --execution-parameters= SQL query execution parameters, if any.
-o, --results-path= Query results path, default stdout.
-f, --format, --results-format= Query results format { pretty, sparse, text, parquet }, default text.
--left-pad= Left pad query results, default 2 for pretty and sparse formats.
--verbose Show additional logging messages.
-h, --help Show this help message and exit.
-V, --version Print version information and exit.

COMMANDS
help Display help information about the specified command.
generate-completion Generate bash/zsh completion script for se.
```

### Environment variables
Note the `catalog`, `database`, `workgroup`, and `output-location` options can also be specified by
environment variables `SE_CATALOG`, `SE_DATABASE`, `SE_WORKGROUP`, and `SE_OUTPUT_LOCATION`, respectively
```bash
$ export SE_CATALOG=catalog

$ SE_WORKGROUP=workgroup \
se \
... \
```

### SQL queries
SQL queries can be provided inline via the `-q`/`--query` option
```bash
$ se \
... \
--query "SELECT * FROM table LIMIT 4"
```

By default the SQL query is read from `stdin`
```bash
$ echo "SELECT * FROM table LIMIT 4" | se \
... \
```

Or the SQL query can be read from a file via the `-i`/`--query-path` option
```bash
$ echo "SELECT * FROM table LIMIT 4" > query.sql

$ se \
... \
--query-path query.sql
```

### Execution parameters
SQL queries may contain `?`-style execution parameters to be substituted server side
```bash
$ se \
... \
--query "SELECT * FROM table WHERE foo = ? AND bar > ? LIMIT 4" \
--execution-parameters baz \
--execution-parameters 100000
```

Alternatively, variable substitution can be done via e.g. `envsubst` on the client side
```bash
$ echo "SELECT * FROM table WHERE foo = '\$FOO' LIMIT 4" > query.sql

$ export FOO=baz
$ envsubst < query.sql | se \
... \
```

### SQL query history file
SQL queries are written to a history file `~/.se_history`, unless the `--skip-history` flag is present
```bash
$ se \
... \
--query "SELECT * FROM table LIMIT 4"

$ se \
... \
--skip-history \
--query "SELECT * FROM table WHERE foo = 'top secret!!' LIMIT 4"

$ cat ~/.se_history
SELECT * FROM table LIMIT 4
```

### Output formats
#### Text and display formats
By default, results are written to `stdout` in tab-delimited text format.
This allows for easy integration with command line tools such as `cut`, `grep`, `awk`, `sed`,
`uniq`, etc. for post-processing.
```bash
$ se \
... \
--query "SELECT * FROM table LIMIT 2"

foo bar baz
2088090022 185762 232298
2044078009 113652 85962

$ se \
... \
--query "SELECT * FROM table LIMIT 4" \
--skip-header | cut -f 4 | sort -n

26603
67310
116988
164738
```

Results may be formatted for display in the terminal, in sparse
```bash
$ se \
... \
--query "SELECT * FROM table LIMIT 4" \
--format sparse

foo bar baz
--------- --------- ---------
1499494 2354616 5560703
516330 758111 1623718
113663 192870 137600
1028323 960709 850306
```

and pretty formats
```bash
$ se \
... \
--query "SELECT * FROM table LIMIT 4" \
--format pretty

+---------+---------+---------+
| foo | bar | baz |
+---------+---------+---------+
| 1088718 | 1779849 | 5096779 |
| 17560 | 40360 | 32204 |
| 84 | 8273 | 47681 |
| 52383 | 100406 | 86338 |
+---------+---------+---------+
```

Results may be written to a file (and optionally compressed) via the `-o`/`--results-path` option
```bash
$ se \
... \
--query "SELECT * FROM table LIMIT 4" \
--results-path results.txt.zstd
```

#### Parquet format
Finally, results may be written out to a local Parquet file
```bash
$ se \
... \
--query "SELECT * FROM table LIMIT 4" \
--format parquet \
--results-path results.parquet
```

...which can easily be loaded into e.g. [duckdb](https://duckdb.org/) for further post-processing.
```sql
$ duckdb
D SELECT * FROM read_parquet("results.parquet");
┌─────────┬─────────┬─────────┐
│ foo │ bar │ baz │
│ int64 │ int64 │ int64 │
├─────────┼─────────┼─────────┤
│ 1670466 │ 2455819 │ 5386130 │
│ 1427967 │ 1990921 │ 3779556 │
│ 66473 │ 97877 │ 73903 │
│ 7767 │ 7766 │ 5888 │
├─────────┴─────────┴─────────┤
│ 4 rows 3 columns │
└─────────────────────────────┘
```
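
Simple aggregations do not necessarily need duckdb. The default tab-delimited text format described earlier can also be summarized directly in the shell. The following is a minimal sketch, not part of sea-eagle itself: `sum_column` and `sample_results` are hypothetical helper names, and `sample_results` only stands in for real `se` output by replaying the two rows from the text format example, assuming the columns are separated by single tab characters.

```shell
#!/bin/sh

# Hypothetical helper: sum one column of tab-delimited input read from stdin,
# skipping the header row (NR > 1). The column number is the first argument.
sum_column() {
  awk -F '\t' -v col="$1" 'NR > 1 { sum += $col } END { print sum + 0 }'
}

# Stand-in for the output of
#   se ... --query "SELECT * FROM table LIMIT 2"
# using the rows shown in the text format example (assumed tab-separated).
sample_results() {
  printf 'foo\tbar\tbaz\n'
  printf '2088090022\t185762\t232298\n'
  printf '2044078009\t113652\t85962\n'
}

# Sum the second column (bar) of the sample rows.
sample_results | sum_column 2
```

With real data, the same helper could presumably be placed at the end of an `se` pipeline in place of `sample_results`. If `--skip-header` is used, the `NR > 1` condition would need to be dropped so the first data row is not skipped.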