https://github.com/quortex/influxdb-athena-crawler
An AWS Athena crawler for InfluxDB.
- Host: GitHub
- URL: https://github.com/quortex/influxdb-athena-crawler
- Owner: quortex
- License: apache-2.0
- Created: 2021-06-29T11:19:59.000Z (over 4 years ago)
- Default Branch: main
- Last Pushed: 2026-01-13T16:55:39.000Z (2 months ago)
- Last Synced: 2026-01-13T18:44:42.186Z (2 months ago)
- Topics: dataplane, microservice
- Language: Go
- Size: 188 KB
- Stars: 0
- Watchers: 2
- Forks: 0
- Open Issues: 7
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# influxdb-athena-crawler
An AWS Athena crawler for InfluxDB.
## Overview
This project is a utility that fetches AWS Athena results (CSV objects stored in AWS S3), parses them, and writes them to InfluxDB as points.
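The parse step described above can be sketched roughly as follows. This is a minimal illustration and not the project's actual code: the `Point` type and the `parseCSV` function are hypothetical, all field values are kept as strings, and the S3 download and InfluxDB write are left out. It also assumes that the CSV object starts with a header line naming each column.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"strings"
	"time"
)

// Point is a hypothetical, simplified stand-in for an InfluxDB point.
type Point struct {
	Measurement string
	Tags        map[string]string
	Fields      map[string]string
	Time        time.Time
}

// parseCSV turns CSV text (header line first) into one Point per data line.
// tsColumn names the column holding the timestamp and tagColumns the columns
// stored as tags; every other column is stored as a string field.
func parseCSV(data, measurement, tsColumn, layout string, tagColumns []string) ([]Point, error) {
	records, err := csv.NewReader(strings.NewReader(data)).ReadAll()
	if err != nil {
		return nil, err
	}
	if len(records) == 0 {
		return nil, nil
	}
	header := records[0]
	isTag := make(map[string]bool, len(tagColumns))
	for _, c := range tagColumns {
		isTag[c] = true
	}

	points := make([]Point, 0, len(records)-1)
	for _, rec := range records[1:] {
		p := Point{
			Measurement: measurement,
			Tags:        map[string]string{},
			Fields:      map[string]string{},
		}
		// encoding/csv guarantees each record has as many values as the header.
		for i, name := range header {
			value := rec[i]
			switch {
			case name == tsColumn:
				if p.Time, err = time.Parse(layout, value); err != nil {
					return nil, fmt.Errorf("parsing timestamp %q: %w", value, err)
				}
			case isTag[name]:
				p.Tags[name] = value
			default:
				p.Fields[name] = value
			}
		}
		points = append(points, p)
	}
	return points, nil
}

func main() {
	// Sample data is made up for illustration.
	data := "timestamp,host,cpu\n2021-06-29T11:19:59.000Z,web-1,0.42\n"
	points, err := parseCSV(data, "usage", "timestamp", "2006-01-02T15:04:05.000Z", []string{"host"})
	if err != nil {
		panic(err)
	}
	for _, p := range points {
		fmt.Println(p.Measurement, p.Tags, p.Fields, p.Time.Format(time.RFC3339))
	}
}
```

The timestamp layout used here is the one the README lists as the default for `timestamp-layout`; it follows Go's reference-time convention.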
## Prerequisites
To interact with the S3 bucket, influxdb-athena-crawler requires an AWS account with the following S3 permissions (`s3:DeleteObject` is only required if `clean-objects` is set):
```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:ListBucket"],
      "Resource": "<bucket-arn>"
    },
    {
      "Effect": "Allow",
      "Action": ["s3:ListObjects", "s3:GetObject", "s3:DeleteObject"],
      "Resource": "<bucket-arn>/*"
    }
  ]
}
```
## Installation
### Helm (Kubernetes install)
To deploy with Helm, follow the influxdb-athena-crawler chart documentation [here](./helm/influxdb-athena-crawler).
## Configuration
influxdb-athena-crawler accepts the following command-line parameters.
| Key | Description | Default |
| ---------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ---------------------------- |
| region | The AWS region. | `""` |
| bucket | The AWS bucket to watch. | `""` |
| prefix | The bucket prefix. | `""` |
| suffix | Filename suffix to restrict files processed on the bucket. | `""` |
| clean-objects | Whether to delete S3 objects after processing them. | `false` |
| max-object-age | How long to wait after an object's last modification before cleaning it. | `10m` |
| timeout | The global timeout. | `"30s"` |
| influx-server | The InfluxDB server address. | `""` |
| influx-token | The InfluxDB token. | `""` |
| influx-org | The InfluxDB org to write to. | `""` |
| influx-bucket | The InfluxDB bucket to write to. | `""` |
| measurement | A measurement acts as a container for tags, fields, and timestamps. Use a measurement name that describes your data. | `""` |
| timestamp-row | The CSV row containing the timestamp. | `"timestamp"` |
| timestamp-layout | The layout used to parse the timestamp (Go `time` layout). | `"2006-01-02T15:04:05.000Z"` |
| tag | Tags to add to the InfluxDB point. Use `--tag=foo` if the tag name matches the CSV row, or `--tag='foo={row:bar}'` to specify the row. | `""` |
| field | Fields to add to the InfluxDB point, in the form `--field='foo={type:int,row:bar}'`. If `row` is not specified, the CSV row is assumed to match the field name. Type can be `float`, `int`, `string` or `bool`. | `""` |
| max-routines | The max number of concurrent object processing routines. | `100` |
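
To illustrate the `--field` syntax from the table above, here is a rough sketch of how a value such as `foo={type:int,row:bar}` could be split into its parts and how a raw CSV value could then be converted to the requested type. It is not the project's actual parser: `FieldSpec`, `parseFieldSpec` and `convertValue` are hypothetical names, and the handling of malformed input is an assumption.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// FieldSpec is a hypothetical parsed form of one --field value.
type FieldSpec struct {
	Name string // InfluxDB field name
	Type string // float, int, string or bool (empty if not given)
	Row  string // CSV row to read; defaults to Name
}

// parseFieldSpec parses "name" or "name={key:value,...}".
func parseFieldSpec(s string) (FieldSpec, error) {
	name, opts, hasOpts := strings.Cut(s, "=")
	spec := FieldSpec{Name: name, Row: name}
	if name == "" {
		return spec, fmt.Errorf("empty field name in %q", s)
	}
	if !hasOpts {
		return spec, nil
	}
	if !strings.HasPrefix(opts, "{") || !strings.HasSuffix(opts, "}") {
		return spec, fmt.Errorf("options must be wrapped in braces: %q", s)
	}
	opts = strings.TrimSuffix(strings.TrimPrefix(opts, "{"), "}")
	for _, kv := range strings.Split(opts, ",") {
		key, value, ok := strings.Cut(kv, ":")
		if !ok {
			return spec, fmt.Errorf("malformed option %q", kv)
		}
		switch strings.TrimSpace(key) {
		case "type":
			spec.Type = strings.TrimSpace(value)
		case "row":
			spec.Row = strings.TrimSpace(value)
		default:
			return spec, fmt.Errorf("unknown option %q", key)
		}
	}
	return spec, nil
}

// convertValue converts a raw CSV value to the Go type matching typ.
func convertValue(typ, raw string) (any, error) {
	switch typ {
	case "float":
		return strconv.ParseFloat(raw, 64)
	case "int":
		return strconv.ParseInt(raw, 10, 64)
	case "bool":
		return strconv.ParseBool(raw)
	case "string":
		return raw, nil
	default:
		return nil, fmt.Errorf("unsupported type %q", typ)
	}
}

func main() {
	spec, err := parseFieldSpec("foo={type:int,row:bar}")
	if err != nil {
		panic(err)
	}
	value, err := convertValue(spec.Type, "42")
	if err != nil {
		panic(err)
	}
	fmt.Printf("field=%s row=%s value=%v\n", spec.Name, spec.Row, value)
}
```

The same splitting logic would apply to `--tag='foo={row:bar}'`, where only the `row` option is relevant.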
## License
Distributed under the Apache 2.0 License. See `LICENSE` for more information.
## Versioning
We use [SemVer](http://semver.org/) for versioning.
## Help
Got a question?
File a GitHub [issue](https://github.com/quortex/influxdb-athena-crawler/issues).