https://github.com/fluent/fluent-plugin-sql
SQL input/output plugin for Fluentd
- Host: GitHub
- URL: https://github.com/fluent/fluent-plugin-sql
- Owner: fluent
- Created: 2012-08-17T03:43:44.000Z (over 13 years ago)
- Default Branch: master
- Last Pushed: 2024-03-22T08:25:44.000Z (almost 2 years ago)
- Last Synced: 2025-06-08T16:08:23.778Z (7 months ago)
- Topics: activerecord, fluentd, fluentd-plugin, rdbms, sql
- Language: Ruby
- Size: 137 KB
- Stars: 111
- Watchers: 20
- Forks: 60
- Open Issues: 53
Metadata Files:
- Readme: README.md
- Changelog: ChangeLog
# SQL input plugin for [Fluentd](http://fluentd.org) event collector
## Overview
This SQL plugin has two parts:
1. The SQL **input** plugin reads records from RDBMSes periodically. An example use case is getting "diffs" of a table (based on the "updated_at" field).
2. The SQL **output** plugin writes records into RDBMSes. An example use case is aggregating server/app/sensor logs into an RDBMS.
## Requirements
| fluent-plugin-sql | fluentd | ruby |
|-------------------|------------|--------|
| >= 1.0.0 | >= v0.14.4 | >= 2.1 |
| < 1.0.0 | < v0.14.0 | >= 1.9 |
NOTE: fluent-plugin-sql v2's buffer format is different from v1. If you update the plugin to v2, don't reuse v1's buffer.
## Installation
    $ fluent-gem install fluent-plugin-sql --no-document
    $ fluent-gem install pg --no-document # for postgresql
You should also install the RDBMS driver gem for your adapter: the `pg` gem for the `postgresql` adapter or the `mysql2` gem for the `mysql2` adapter. Other adapters supported by [ActiveRecord](https://github.com/rails/rails/tree/master/activerecord) should work.
We recommend a mysql2 gem version higher than `0.3.12` and a pg gem version higher than `0.16.0`.
If you use Ruby 2.1, use pg gem 0.21.0 (< 1.0.0) because ActiveRecord 5.1.4 or earlier doesn't support pg gem 1.0.0.
## Input: How It Works
This plugin runs the following SQL periodically:
SELECT * FROM *table* WHERE *update\_column* > *last\_update\_column\_value* ORDER BY *update_column* ASC LIMIT 500
What you need to configure is *update\_column*. The column should be an incremental column (such as an AUTO\_INCREMENT primary key) so that this plugin reads newly INSERTed rows. Alternatively, you can use a column that is updated every time the row changes (such as a `last_updated_at` column) so that this plugin reads UPDATEd rows as well.
If you omit the *update\_column* parameter, the primary key is used.
The plugin stores the last selected rows in a file (named by *state\_file*) so that it does not lose its position when Fluentd restarts.
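The polling cycle described above can be sketched in plain Ruby. This is a minimal illustration, not the plugin's implementation: the in-memory `ROWS` array stands in for a database table, the helper names `select_new_rows` and `poll_once` are hypothetical, and storing the position as JSON is an assumption made only for this sketch.

```ruby
require "json"
require "tmpdir"

# Hypothetical in-memory stand-in for a database table.
ROWS = [
  { "id" => 1, "body" => "first" },
  { "id" => 2, "body" => "second" },
  { "id" => 3, "body" => "third" }
].freeze

# Emulates:
#   SELECT * FROM table WHERE update_column > last_value
#   ORDER BY update_column ASC LIMIT limit
def select_new_rows(rows, update_column, last_value, limit)
  rows
    .select { |row| last_value.nil? || row[update_column] > last_value }
    .sort_by { |row| row[update_column] }
    .first(limit)
end

# One polling cycle: load the saved position, fetch newer rows, and
# persist the new position so a restart resumes where it left off.
def poll_once(rows, update_column, state_path, limit: 500)
  state = File.exist?(state_path) ? JSON.parse(File.read(state_path)) : {}
  fetched = select_new_rows(rows, update_column, state["last_value"], limit)
  unless fetched.empty?
    state["last_value"] = fetched.last[update_column]
    File.write(state_path, JSON.generate(state))
  end
  fetched
end

state_path = File.join(Dir.mktmpdir, "sql_state.json")
p poll_once(ROWS, "id", state_path, limit: 2).map { |r| r["id"] } # rows 1 and 2
p poll_once(ROWS, "id", state_path, limit: 2).map { |r| r["id"] } # row 3 only
```

Because the position is written only after rows are fetched, a second call with the same state file continues from the last value seen instead of re-reading the table.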
## Input: Configuration
    <source>
      @type sql

      host rdb_host
      port rdb_port
      database rdb_database
      adapter mysql2_or_postgresql_or_etc
      username myusername
      password mypassword

      tag_prefix my.rdb  # optional, but recommended

      select_interval 60s  # optional
      select_limit 500     # optional

      state_file /var/run/fluentd/sql_state

      <table>
        table table1
        tag table1  # optional
        update_column update_col1
        time_column time_col2  # optional
      </table>

      <table>
        table table2
        tag table2  # optional
        update_column updated_at
        time_column updated_at  # optional
        time_format %Y-%m-%d %H:%M:%S.%6N # optional
      </table>

      # detects all tables instead of <table> sections
      #all_tables
    </source>
* **host** RDBMS host
* **port** RDBMS port
* **database** RDBMS database name
* **adapter** RDBMS driver name. You should install the corresponding gem before starting Fluentd (mysql2 gem for the mysql2 adapter, pg gem for the postgresql adapter, etc.)
* **username** RDBMS login user name
* **password** RDBMS login password
* **tag_prefix** prefix of event tags. The actual tag will be `tag_prefix.tag`, where `tag` is the tag of the \<table\> section (optional)
* **select_interval** interval at which the SQL queries run (optional)
* **select_limit** LIMIT on the number of rows for each SQL query (optional)
* **state_file** path to a file that stores the last rows
* **all_tables** reads all tables instead of configuring each table in \<table\> sections
\<table\> sections:
* **tag** tag name of events (optional; default value is table name)
* **table** RDBMS table name
* **update_column**: see the description in "Input: How It Works" above
* **time_column** (optional): if this option is set, this plugin uses this column's value as the event's time. Otherwise it uses the current time.
* **primary_key** (optional): if you want to get data from a table that has no primary key, such as a PostgreSQL view, set this parameter.
* **time_format** (optional): the format of the date used in the query. This is useful with alternative adapters that have restrictions on the format.
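To make the `tag_prefix`, `tag`, and `time_format` parameters concrete, here is a small Ruby sketch. The helper names `event_tag` and `format_update_value` are hypothetical, and the behavior without a prefix (using the tag unchanged) is an assumption; only the `tag_prefix.tag` form, the table-name default, and the format string come from the text above.

```ruby
# Builds the event tag as "<tag_prefix>.<tag>". The section's tag falls
# back to the table name when it is not set. Returning the bare tag when
# no prefix is given is an assumption for this sketch.
def event_tag(table:, tag: nil, tag_prefix: nil)
  base = tag || table
  tag_prefix ? "#{tag_prefix}.#{base}" : base
end

# Renders a timestamp with the time_format from the sample configuration.
# %6N prints fractional seconds with six digits (microseconds).
def format_update_value(time, time_format = "%Y-%m-%d %H:%M:%S.%6N")
  time.strftime(time_format)
end

puts event_tag(table: "table1", tag_prefix: "my.rdb")
# => my.rdb.table1
puts format_update_value(Time.utc(2024, 3, 22, 8, 25, 44, 123_456))
# => 2024-03-22 08:25:44.123456
```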
## Input: Limitation
You should make sure target tables have an index (and/or partitions) on the *update\_column*. Otherwise the SELECT causes a full table scan and serious performance problems.
You can't replicate DELETEd rows.
## Output: How It Works
This plugin uses ActiveRecord underneath. The `host`, `port`, `database`, `adapter`, `username`, `password`, and `socket` parameters correspond to ActiveRecord's parameters of the same names.
## Output: Configuration
    <match my.rdb.**>
      @type sql
      host rdb_host
      port 3306
      database rdb_database
      adapter mysql2_or_postgresql_or_etc
      username myusername
      password mypassword
      socket path_to_socket
      remove_tag_prefix my.rdb # optional, dual of tag_prefix in in_sql

      <table>
        table table1
        column_mapping 'timestamp:created_at,fluentdata1:dbcol1,fluentdata2:dbcol2,fluentdata3:dbcol3'
        # This is the default table because it has no "pattern" argument in <table>
        # The logic is such that if all non-default <table> blocks
        # do not match, the default one is chosen.
        # The default table is required.
      </table>

      <table hello.*> # You can pass the same pattern you use in match statements.
        table table2
        # This is the non-default table. It is chosen if the tag matches the pattern
        # AFTER remove_tag_prefix is applied to the incoming event. For example, if
        # the message comes in with the tag my.rdb.hello.world, "remove_tag_prefix my.rdb"
        # makes it "hello.world", which gets matched here because of "pattern hello.*".
      </table>

      <table hello.world>
        table table3
        # This is the second non-default table. You can have as many non-default tables
        # as you wish. One caveat: non-default tables are matched top-to-bottom and
        # the events go into the first table it matches to. Hence, this particular table
        # never gets any data, since the above "hello.*" subsumes "hello.world".
      </table>
    </match>
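The table selection rules described in the comments above (strip the prefix, try non-default tables top to bottom, fall back to the default) can be sketched in Ruby as follows. This is an illustration under stated assumptions, not the plugin's code: `tag_pattern_to_regexp`, `strip_prefix`, and `choose_table` are hypothetical helpers, and the matcher only supports `*` as "exactly one tag part", which is a simplification of Fluentd's match pattern syntax.

```ruby
# Converts a simplified tag pattern into a Regexp. Only "*" (exactly one
# tag part) is supported here; every other part is matched literally.
def tag_pattern_to_regexp(pattern)
  body = pattern.split(".").map do |part|
    part == "*" ? "[^.]+" : Regexp.escape(part)
  end.join("\\.")
  Regexp.new("\\A#{body}\\z")
end

# Removes "<prefix>." from the front of the tag when it is present.
def strip_prefix(tag, prefix)
  return tag if prefix.nil?
  tag.start_with?("#{prefix}.") ? tag[(prefix.length + 1)..-1] : tag
end

# Picks the first non-default table whose pattern matches the stripped
# tag; when none matches, the default table (no pattern) is used.
def choose_table(tag, tables, remove_tag_prefix: nil)
  stripped = strip_prefix(tag, remove_tag_prefix)
  default = tables.find { |t| t[:pattern].nil? }
  matched = tables.find do |t|
    t[:pattern] && tag_pattern_to_regexp(t[:pattern]).match?(stripped)
  end
  (matched || default)[:table]
end

# Mirrors the three table sections of the sample configuration.
TABLES = [
  { table: "table1", pattern: nil },
  { table: "table2", pattern: "hello.*" },
  { table: "table3", pattern: "hello.world" }
].freeze

puts choose_table("my.rdb.hello.world", TABLES, remove_tag_prefix: "my.rdb")
# => table2
puts choose_table("my.rdb.other", TABLES, remove_tag_prefix: "my.rdb")
# => table1
```

With this ordering `table3` is never chosen, because every tag that matches `hello.world` is already claimed by the earlier `hello.*` pattern.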
* **host** RDBMS host
* **port** RDBMS port
* **database** RDBMS database name
* **adapter** RDBMS driver name. You should install the corresponding gem before starting Fluentd (mysql2 gem for the mysql2 adapter, pg gem for the postgresql adapter, etc.)
* **username** RDBMS login user name
* **password** RDBMS login password
* **socket** RDBMS socket path
* **pool** size of the connection pool, which synchronizes thread access to a limited number of database connections
* **timeout** RDBMS connection timeout
* **remove_tag_prefix** remove the given prefix from the events. See "tag_prefix" in "Input: Configuration". (optional)
\<table\> sections:
* **table** RDBMS table name
* **column_mapping**: [Required] Record to table schema mapping. The format consists of `from:to` or `key` values separated by `,`. For example, if **column_mapping** is set to 'item_id:id,item_text:data,updated_at', the `item_id` field of the record is stored in the `id` column, the `item_text` field is stored in the `data` column, and the `updated_at` field is stored in the `updated_at` column.
* **\<table pattern\>**: the pattern against which the incoming event's tag is matched (after it goes through `remove_tag_prefix`, if given). The patterns should follow the same syntax as [that of \<match\>](https://docs.fluentd.org/configuration/config-file#how-match-patterns-work). **Exactly one \<table\> element must NOT have this parameter so that it becomes the default table to store data**.
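The `column_mapping` format can be illustrated with a short Ruby sketch. The helpers `parse_column_mapping` and `map_record` are hypothetical names used only here, and dropping record fields that are not listed in the mapping is an assumption of this sketch rather than documented plugin behavior.

```ruby
# Parses "from:to,key" style mappings into a Hash of
# record field => table column. A bare key maps to itself.
def parse_column_mapping(spec)
  spec.split(",").each_with_object({}) do |entry, mapping|
    from, to = entry.strip.split(":", 2)
    mapping[from] = to || from
  end
end

# Builds the row to insert by picking the mapped fields from the record.
# Fields that are not listed in the mapping are left out (assumption).
def map_record(record, mapping)
  mapping.each_with_object({}) do |(from, to), row|
    row[to] = record[from]
  end
end

mapping = parse_column_mapping("item_id:id,item_text:data,updated_at")
p mapping
# => {"item_id"=>"id", "item_text"=>"data", "updated_at"=>"updated_at"}

record = { "item_id" => 7, "item_text" => "hello", "updated_at" => "2024-03-22", "extra" => 1 }
p map_record(record, mapping)
# => {"id"=>7, "data"=>"hello", "updated_at"=>"2024-03-22"}
```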