https://github.com/triglav-dataflow/triglav-agent-bigquery
BigQuery agent for Triglav, data-driven workflow tool
- Host: GitHub
- URL: https://github.com/triglav-dataflow/triglav-agent-bigquery
- Owner: triglav-dataflow
- License: mit
- Created: 2017-02-24T05:28:43.000Z (over 7 years ago)
- Default Branch: master
- Last Pushed: 2017-04-22T02:15:38.000Z (over 7 years ago)
- Last Synced: 2024-09-29T06:21:38.148Z (about 2 months ago)
- Topics: bigquery, ruby, triglav-agent
- Language: Ruby
- Homepage:
- Size: 58.6 KB
- Stars: 2
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- License: LICENSE.txt
- Code of conduct: CODE_OF_CONDUCT.md
# Triglav::Agent::Bigquery
Triglav Agent for BigQuery
## Requirements
* Ruby >= 2.3.0
## Prerequisites
* BigQuery views are not supported
## Installation
Add this line to your application's Gemfile:
```ruby
gem 'triglav-agent-bigquery'
```

And then execute:

    $ bundle

Or install it yourself as:

    $ gem install triglav-agent-bigquery

## CLI
```
Usage: triglav-agent-bigquery [options]
-c, --config VALUE Config file (default: config.yml)
-s, --status VALUE Status storage file (default: status.yml)
-t, --token VALUE Triglav access token storage file (default: token.yml)
--dotenv Load environment variables from .env file (default: false)
-h, --help help
--log VALUE Log path (default: STDOUT)
--log-level VALUE Log level (default: info)
```

Run as:

```
TRIGLAV_ENV=development bundle exec triglav-agent-bigquery --dotenv -c config.yml
```

## Configuration
Prepare `config.yml` following [example/config.yml](./example/config.yml).
ERB templates are supported in the config file. You may also load environment variables from a `.env` file with the `--dotenv` option, as the [example/example.env](./example/example.env) file shows.
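
As a rough illustration of what an ERB-templated config implies, the sketch below renders ERB tags and then parses the result as YAML. The environment variable names (`TRIGLAV_URL`, `MONITOR_INTERVAL`), the placeholder URL, and the `load_config` helper are assumptions made for this example; they are not taken from `example/config.yml` or from the agent's source.

```ruby
require 'erb'
require 'yaml'

# Hypothetical config text. The section names follow this README, but the
# keys and environment variable names are placeholders for illustration.
template = <<~'YAML'
  triglav:
    url: <%= ENV.fetch('TRIGLAV_URL', 'http://localhost:7800') %>
  bigquery:
    monitor_interval: <%= ENV.fetch('MONITOR_INTERVAL', '60') %>
YAML

# Render the ERB tags first, then parse the rendered text as YAML.
def load_config(text)
  YAML.safe_load(ERB.new(text).result)
end

config = load_config(template)
puts config['bigquery']['monitor_interval']
```

Rendering before parsing means any value in the file can come from the environment, which is why pairing it with a `.env` file is convenient.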
### serverengine section
You can specify any [serverengine](https://github.com/fluent/serverengine) options in this section.
### triglav section
Specify the Triglav API URL and the credentials used to authenticate.
The obtained access token is stored in the token storage file (`--token` option).
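
The token caching described here could be sketched roughly as follows. This is not the agent's implementation: the `TokenStore` class and the `access_token` / `expires_at` keys are hypothetical, and the real layout of `token.yml` may differ.

```ruby
require 'yaml'

# Hypothetical token cache backed by a YAML file.
class TokenStore
  def initialize(path)
    @path = path
  end

  # Return the cached token, or nil if the file is missing or the token expired.
  def read(now: Time.now)
    return nil unless File.exist?(@path)
    data = YAML.safe_load(File.read(@path)) || {}
    return nil unless data['access_token'] && data['expires_at']
    Time.at(data['expires_at']) > now ? data['access_token'] : nil
  end

  # Persist the token with its expiry stored as epoch seconds.
  def write(access_token, expires_at)
    File.write(@path, YAML.dump('access_token' => access_token,
                                'expires_at' => expires_at.to_i))
  end

  # Reuse a valid cached token; otherwise call the block (which would
  # authenticate against the Triglav API) and cache what it returns.
  def fetch(now: Time.now)
    read(now: now) || begin
      token, expires_at = yield
      write(token, expires_at)
      token
    end
  end
end
```

A caller would wrap its authentication request in the block passed to `fetch`, so the API is only contacted when no valid token is on disk.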
### bigquery section
This section is specific to triglav-agent-bigquery.
* **monitor_interval**: The interval, in seconds, at which tables are checked (number, default: 60)
* **connection_info**: Key-value pairs of BigQuery connection info, where each key is a resource URI pattern written as a regular expression and each value is the connection information for matching resources
  * **auth_method**: Authentication method. Must be one of `service_account`, `authorized_user` (for OAuth2), `compute_engine`, or `application_default`. By default, the method is inferred from the credentials.
  * **credentials_file**: Path to a credentials file, such as a service account JSON file.
  * **credentials**: Instead of `credentials_file`, you may pass the JSON contents as a string.

### Specification of Resource URI
A resource URI must have the form:
```
https://bigquery.cloud.google.com/table/#{project}:#{dataset}.#{table}
```

`#{table}` also accepts a strftime-formatted suffix such as
```
#{table}_%Y%m%d
```

and a strftime-formatted partition decorator for a partitioned table such as
```
#{table}$%Y%m%d
```

## How it behaves
1. Authenticate with Triglav
   * Store the access token in the token storage file
   * Read the token from the token storage file next time
   * Refresh the access token if it has expired
2. Repeat the following every `monitor_interval` seconds:
3. Obtain the resource (table) lists for the specified prefixes (keys of `connection_info`) from Triglav.
4. Connect to BigQuery with the appropriate connection info for each resource URI, and find tables that are newer than the last check.
5. Store the check information in the status storage file for the next check.

## Development
### Prepare
```
./prepare.sh
```

Edit the `.env` or `config.yml` file directly.
### Start
Start the Triglav API on localhost.
Run triglav-agent-bigquery as:
```
TRIGLAV_ENV=development bundle exec triglav-agent-bigquery --dotenv --debug -c example/config.yml
```

The debug mode (`--debug` option) ignores the `last_modified_time` value in the status file.
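
To make the effect of debug mode concrete, here is a minimal sketch of a freshness check that compares each table's modification time with the value remembered from the previous run. The `updated_tables` helper, the hash layout, and the table names are hypothetical and are not taken from the agent's source; the only detail from this README is that debug mode ignores the stored `last_modified_time`.

```ruby
# Hypothetical freshness check.
# tables: { table name => last_modified_time } as reported by BigQuery
# status: { table name => last_modified_time } remembered from the last check
def updated_tables(tables, status, debug: false)
  tables.select do |name, last_modified_time|
    # In debug mode the stored value is ignored, so every table counts as new.
    previous = debug ? nil : status[name]
    previous.nil? || last_modified_time > previous
  end.keys
end

tables = { 'events_20170101' => 1_000, 'events_20170102' => 2_000 }
status = { 'events_20170101' => 1_000 }

p updated_tables(tables, status)              # only the table not seen before
p updated_tables(tables, status, debug: true) # every table
```

Under this reading, debug mode is handy during development because every monitored table is reported on every run, regardless of what the status file contains.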
## Contributing
Bug reports and pull requests are welcome on GitHub at https://github.com/triglav-dataflow/triglav-agent-bigquery. This project is intended to be a safe, welcoming space for collaboration, and contributors are expected to adhere to the [Contributor Covenant](http://contributor-covenant.org) code of conduct.
## License
The gem is available as open source under the terms of the [MIT License](http://opensource.org/licenses/MIT).