https://github.com/quortex/influxdb-athena-crawler

An AWS Athena crawler for InfluxDB.

Host: GitHub
URL: https://github.com/quortex/influxdb-athena-crawler
Owner: quortex
License: apache-2.0
Created: 2021-06-29T11:19:59.000Z (almost 5 years ago)
Default Branch: main
Last Pushed: 2026-01-13T16:55:39.000Z (5 months ago)
Last Synced: 2026-01-13T18:44:42.186Z (5 months ago)
Topics: dataplane, microservice
Language: Go
Size: 188 KB
Stars: 0
Watchers: 2
Forks: 0
Open Issues: 7
Metadata Files:
- Readme: README.md
- License: LICENSE

# influxdb-athena-crawler

An AWS Athena crawler for InfluxDB.

## Overview

This utility fetches AWS Athena query results (CSV objects stored in AWS S3), parses them, and writes them to InfluxDB as points.
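To illustrate the CSV-to-point step, here is a minimal sketch using only the Go standard library. It is not the project's actual implementation: the `Point` type, the `parsePoints` function, and the sample data are hypothetical, and every field is treated as a float for brevity. The timestamp column name and layout mirror the `timestamp-row` and `timestamp-layout` defaults listed under Configuration.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"strconv"
	"strings"
	"time"
)

// Point is a hypothetical, simplified stand-in for an InfluxDB point.
type Point struct {
	Measurement string
	Tags        map[string]string
	Fields      map[string]float64
	Time        time.Time
}

// parsePoints reads CSV text whose first record is a header and builds one
// Point per remaining record. tsCol names the timestamp column, layout is a
// Go time layout, and tagCols/fieldCols name the columns used as tags and
// (float) fields.
func parsePoints(data, measurement, tsCol, layout string, tagCols, fieldCols []string) ([]Point, error) {
	records, err := csv.NewReader(strings.NewReader(data)).ReadAll()
	if err != nil {
		return nil, err
	}
	if len(records) == 0 {
		return nil, nil
	}
	// Map each header name to its column index.
	index := make(map[string]int, len(records[0]))
	for i, name := range records[0] {
		index[name] = i
	}
	// Fail early if a requested column is missing from the header.
	needed := append([]string{tsCol}, tagCols...)
	needed = append(needed, fieldCols...)
	for _, col := range needed {
		if _, ok := index[col]; !ok {
			return nil, fmt.Errorf("column %q not found in CSV header", col)
		}
	}
	points := make([]Point, 0, len(records)-1)
	for _, rec := range records[1:] {
		ts, err := time.Parse(layout, rec[index[tsCol]])
		if err != nil {
			return nil, fmt.Errorf("parsing timestamp: %w", err)
		}
		p := Point{
			Measurement: measurement,
			Tags:        map[string]string{},
			Fields:      map[string]float64{},
			Time:        ts,
		}
		for _, col := range tagCols {
			p.Tags[col] = rec[index[col]]
		}
		for _, col := range fieldCols {
			v, err := strconv.ParseFloat(rec[index[col]], 64)
			if err != nil {
				return nil, fmt.Errorf("parsing field %q: %w", col, err)
			}
			p.Fields[col] = v
		}
		points = append(points, p)
	}
	return points, nil
}

func main() {
	// Made-up sample data with quoted values.
	data := `"timestamp","region","bytes"
"2024-01-02T03:04:05.000Z","eu-west-1","42.5"
`
	points, err := parsePoints(data, "traffic", "timestamp",
		"2006-01-02T15:04:05.000Z", []string{"region"}, []string{"bytes"})
	if err != nil {
		panic(err)
	}
	for _, p := range points {
		fmt.Println(p.Measurement, p.Tags, p.Fields, p.Time.Format(time.RFC3339))
	}
}
```

In a real deployment the resulting points would be sent to InfluxDB with a client library rather than printed.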

## Prerequisites

### AWS

Interacting with the S3 bucket requires an AWS account with the following S3 permissions (`s3:DeleteObject` is only required if `clean-objects` is set):

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:ListBucket"],
      "Resource": ""
    },
    {
      "Effect": "Allow",
      "Action": ["s3:ListObjects", "s3:GetObject", "s3:DeleteObject"],
      "Resource": "/*"
    }
  ]
}
```

## Installation

### Helm (Kubernetes install)

See the [Helm chart documentation](./helm/influxdb-athena-crawler) for Kubernetes deployment instructions.

## Configuration

### Optional args

influxdb-athena-crawler accepts the following arguments.

| Key              | Description                                                                                                                                                                                  | Default                      |
| ---------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ---------------------------- |
| region           | The AWS region.                                                                                                                                                                              | `""`                         |
| bucket           | The AWS bucket to watch.                                                                                                                                                                     | `""`                         |
| prefix           | The bucket prefix.                                                                                                                                                                           | `""`                         |
| suffix           | Filename suffix to restrict files processed on the bucket.                                                                                                                                   | `""`                         |
| clean-objects    | Whether to delete S3 objects after processing them.                                                                                                                                          | `false`                      |
| max-object-age   | How long to wait since last modification before file cleaning.                                                                                                                               | `10m`                        |
| timeout          | The global timeout.                                                                                                                                                                          | `"30s"`                      |
| influx-server    | The InfluxDB server address.                                                                                                                                                                 | `""`                         |
| influx-token     | The InfluxDB token.                                                                                                                                                                          | `""`                         |
| influx-org       | The InfluxDB org to write to.                                                                                                                                                                | `""`                         |
| influx-bucket    | The InfluxDB bucket to write to.                                                                                                                                                             | `""`                         |
| measurement      | A measurement acts as a container for tags, fields, and timestamps. Use a measurement name that describes your data.                                                                         | `""`                         |
| timestamp-row    | The timestamp row in the CSV.                                                                                                                                                                | `"timestamp"`                |
| timestamp-layout | The layout used to parse the timestamp.                                                                                                                                                      | `"2006-01-02T15:04:05.000Z"` |
| tag              | Tags to add to the InfluxDB point. Use `--tag=foo` if the tag name matches the CSV row, or `--tag='foo={row:bar}'` to specify the row.                                                         | `""`                         |
| field            | Fields to add to the InfluxDB point, in the form `--field='foo={type:int,row:bar}'`. If `row` is not specified, the CSV row matching the field name is used. Type can be float, int, string or bool. | `""`                         |
| max-routines     | The maximum number of concurrent object processing routines.                                                                                                                                 | `100`                        |
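The `tag` and `field` values use a small `name={key:value,...}` syntax. The sketch below shows one way such a value could be parsed in Go. It is an illustration only, not the crawler's own parser: the `Spec` type and `parseSpec` function are hypothetical, and since the table does not state a default type, `Type` is left empty when none is given.

```go
package main

import (
	"fmt"
	"strings"
)

// Spec is a hypothetical parsed form of a --tag or --field value.
type Spec struct {
	Name string // tag or field name written to InfluxDB
	Row  string // CSV row to read from; falls back to Name
	Type string // float, int, string or bool; empty when not given
}

// parseSpec parses values such as "foo", "foo={row:bar}" or
// "foo={type:int,row:bar}".
func parseSpec(s string) (Spec, error) {
	name, opts, hasOpts := strings.Cut(s, "=")
	name = strings.TrimSpace(name)
	if name == "" {
		return Spec{}, fmt.Errorf("empty name in %q", s)
	}
	// Without options, the CSV row is assumed to match the name.
	spec := Spec{Name: name, Row: name}
	if !hasOpts {
		return spec, nil
	}
	opts = strings.TrimSpace(opts)
	if len(opts) < 2 || !strings.HasPrefix(opts, "{") || !strings.HasSuffix(opts, "}") {
		return Spec{}, fmt.Errorf("options in %q must be wrapped in braces", s)
	}
	for _, kv := range strings.Split(opts[1:len(opts)-1], ",") {
		key, val, ok := strings.Cut(kv, ":")
		if !ok {
			return Spec{}, fmt.Errorf("malformed option %q in %q", kv, s)
		}
		val = strings.TrimSpace(val)
		switch strings.TrimSpace(key) {
		case "row":
			spec.Row = val
		case "type":
			switch val {
			case "float", "int", "string", "bool":
				spec.Type = val
			default:
				return Spec{}, fmt.Errorf("unsupported type %q in %q", val, s)
			}
		default:
			return Spec{}, fmt.Errorf("unknown option %q in %q", key, s)
		}
	}
	return spec, nil
}

func main() {
	for _, arg := range []string{"foo", "foo={row:bar}", "foo={type:int,row:bar}"} {
		spec, err := parseSpec(arg)
		if err != nil {
			panic(err)
		}
		fmt.Printf("%+v\n", spec)
	}
}
```

Rejecting unknown keys and types up front keeps configuration mistakes visible at startup instead of surfacing later as missing or mistyped fields.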

## License

Distributed under the Apache 2.0 License. See `LICENSE` for more information.

## Versioning

We use [SemVer](http://semver.org/) for versioning.

## Help

Got a question?

File a GitHub [issue](https://github.com/quortex/influxdb-athena-crawler/issues).
