https://github.com/scraly/flume-bigquery-sink
An Apache Flume Sink implementation to publish data to Google BigQuery
- Host: GitHub
- URL: https://github.com/scraly/flume-bigquery-sink
- Owner: scraly
- Created: 2017-03-24T21:07:37.000Z (almost 8 years ago)
- Default Branch: master
- Last Pushed: 2017-03-26T18:45:51.000Z (almost 8 years ago)
- Last Synced: 2024-11-06T07:38:59.815Z (3 months ago)
- Topics: bigquery, flume, sink
- Language: Java
- Size: 21.5 KB
- Stars: 0
- Watchers: 3
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# flume-bigquery-sink
An Apache Flume Sink implementation to publish data to Google BigQuery

## Configuration of Google BigQuery Sink:
>Edit log4j.xml
...
>Edit your flume.conf:
# list the sources, sinks and channels for the agent
agent.sources =
agent.sinks = bigquery-sink
agent.channels = bigquery-channel
# properties of
agent.sources.modv6-source.channels = bigquery-channel
agent.sources.modv6-source.type = avro
agent.sources.modv6-source.bind = localhost
agent.sources.modv6-source.port = 8090
# properties of bigquery-channel
agent.channels.bigquery-channel.type = file
agent.channels.bigquery-channel.checkpointDir = /data/flume-bq/checkpoint
agent.channels.bigquery-channel.dataDirs = /data/flume-bq/data
agent.channels.bigquery-channel.minimumRequiredSpace = 0
# properties of bigquery-sink
agent.sinks.bigquery-sink.channel =
agent.sinks.bigquery-sink.type = BigQuerySink
agent.sinks.bigquery-sink.batchSize = 100
agent.sinks.bigquery-sink.clientId = .apps.googleusercontent.com
agent.sinks.bigquery-sink.clientSecret =
agent.sinks.bigquery-sink.accessToken =
agent.sinks.bigquery-sink.refreshToken =
agent.sinks.bigquery-sink.dataStoreDir = /home//etc/
agent.sinks.bigquery-sink.userId =
agent.sinks.bigquery-sink.datasetId =
agent.sinks.bigquery-sink.projectId =
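
Several of the sink properties above are left blank and must be filled in before the agent starts. The following is a minimal sketch, not part of this project, of how the file could be checked for blank sink values. It assumes the file can be read as a standard Java properties file; the class name `SinkConfCheck` and the method `missingKeys` are hypothetical, while the property names are copied from the listing above.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

/** Hypothetical helper (assumption): not shipped with flume-bigquery-sink. */
class SinkConfCheck {
    // Prefix and key names are copied from the flume.conf listing above.
    static final String PREFIX = "agent.sinks.bigquery-sink.";
    static final String[] KEYS = {
        "channel", "type", "batchSize", "clientId", "clientSecret",
        "accessToken", "refreshToken", "dataStoreDir", "userId",
        "datasetId", "projectId"
    };

    /** Returns the sink keys that are absent or have an empty value. */
    static List<String> missingKeys(String confText) {
        Properties props = new Properties();
        try {
            // Lines starting with '#' are treated as comments by Properties.load.
            props.load(new StringReader(confText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        List<String> missing = new ArrayList<>();
        for (String key : KEYS) {
            String value = props.getProperty(PREFIX + key);
            if (value == null || value.trim().isEmpty()) {
                missing.add(key);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        String conf = String.join("\n",
            "# properties of bigquery-sink",
            "agent.sinks.bigquery-sink.type = BigQuerySink",
            "agent.sinks.bigquery-sink.batchSize = 100",
            "agent.sinks.bigquery-sink.channel =");
        System.out.println("Still to fill in: " + missingKeys(conf));
    }
}
```

Running `main` lists every sink key that still needs a value. In practice the text would come from your own flume.conf rather than from an inline string.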
>Edit BigQueryManager class:

private static final String PROJECT_ID = "112233445566"; // change to your Google Cloud projectId
private static final String DATASET = "toto"; // change to your Google BigQuery dataset
>Change the LogField and CSVUtil classes to tell the BigQuery sink what the BigQuery table schema is
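
The README does not show the contents of `LogField` or `CSVUtil`, so the following is only an illustrative sketch of the general idea: one type lists the table columns with their BigQuery types, and a helper maps a CSV line onto those columns. The names `ExampleLogField` and `ExampleCsvUtil`, the three columns, and the sample line are all assumptions and do not reflect the project's actual classes.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical schema: one constant per table column, in CSV order. */
enum ExampleLogField {
    TIMESTAMP("TIMESTAMP"),
    HOST("STRING"),
    STATUS("INTEGER");

    final String bigQueryType;

    ExampleLogField(String bigQueryType) {
        this.bigQueryType = bigQueryType;
    }
}

/** Hypothetical converter from a CSV line to a column-name/value row. */
class ExampleCsvUtil {
    static Map<String, Object> toRow(String csvLine) {
        // Naive split: does not handle quoted fields that contain commas.
        String[] parts = csvLine.split(",", -1);
        ExampleLogField[] fields = ExampleLogField.values();
        if (parts.length != fields.length) {
            throw new IllegalArgumentException(
                "Expected " + fields.length + " columns, got " + parts.length);
        }
        Map<String, Object> row = new LinkedHashMap<>();
        for (int i = 0; i < fields.length; i++) {
            String raw = parts[i].trim();
            // INTEGER columns are parsed to Long; other types stay as strings.
            Object value = fields[i].bigQueryType.equals("INTEGER")
                ? (Object) Long.valueOf(raw)
                : raw;
            row.put(fields[i].name().toLowerCase(Locale.ROOT), value);
        }
        return row;
    }

    public static void main(String[] args) {
        System.out.println(toRow("2017-03-24 21:07:37,web01,200"));
    }
}
```

Keeping the column list in a single enum means the column order and types are defined in one place, and the CSV parsing follows from it.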