{"id":13452379,"url":"https://github.com/fluent/fluent-plugin-kafka","last_synced_at":"2025-03-23T19:34:10.468Z","repository":{"id":13481633,"uuid":"16171900","full_name":"fluent/fluent-plugin-kafka","owner":"fluent","description":"Kafka input and output plugin for Fluentd","archived":false,"fork":false,"pushed_at":"2024-04-01T04:38:01.000Z","size":609,"stargazers_count":299,"open_issues_count":31,"forks_count":175,"subscribers_count":32,"default_branch":"master","last_synced_at":"2024-04-26T20:21:08.621Z","etag":null,"topics":["fluentd","fluentd-plugin","kafka"],"latest_commit_sha":null,"homepage":"","language":"Ruby","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/fluent.png","metadata":{"files":{"readme":"README.md","changelog":"ChangeLog","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2014-01-23T12:34:26.000Z","updated_at":"2024-06-18T13:38:56.484Z","dependencies_parsed_at":"2024-06-18T13:38:54.506Z","dependency_job_id":"4a42ef55-248a-42d8-871b-e3e7deacd57f","html_url":"https://github.com/fluent/fluent-plugin-kafka","commit_stats":{"total_commits":463,"total_committers":81,"mean_commits":5.716049382716049,"dds":0.5982721382289418,"last_synced_commit":"6f22abb7f5c2a3f0627c997574404703dfc6c9e1"},"previous_names":[],"tags_count":109,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fluent%2Ffluent-plugin-kafka","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fluent%2Ffluent-plugin-kafka/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fluent%2Ffluent-plugin-kafka/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fluent%2Ffluent-plugin-kafka/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/fluent","download_url":"https://codeload.github.com/fluent/fluent-plugin-kafka/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":245159335,"owners_count":20570363,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["fluentd","fluentd-plugin","kafka"],"created_at":"2024-07-31T07:01:22.423Z","updated_at":"2025-03-23T19:34:10.055Z","avatar_url":"https://github.com/fluent.png","language":"Ruby","readme":"# fluent-plugin-kafka, a plugin for [Fluentd](http://fluentd.org)\n\n[![GitHub Actions Status](https://github.com/fluent/fluent-plugin-kafka/actions/workflows/linux.yml/badge.svg)](https://github.com/fluent/fluent-plugin-kafka/actions/workflows/linux.yml)\n\n\nA fluentd plugin to both consume and produce data for Apache Kafka.\n\n## Installation\n\nAdd this line to your application's Gemfile:\n\n    gem 'fluent-plugin-kafka'\n\nAnd then execute:\n\n    $ bundle\n\nOr install it yourself as:\n\n    
$ gem install fluent-plugin-kafka --no-document\n\nIf you want to use zookeeper-related parameters, you also need to install the zookeeper gem. The zookeeper gem includes a native extension, so development tools such as ruby-devel, gcc, and make are needed.\n\n## Requirements\n\n- Ruby 2.1 or later\n- Input plugins work with kafka v0.9 or later\n- Output plugins work with kafka v0.8 or later\n\n## Usage\n\n### Common parameters\n\n#### SSL authentication\n\n- ssl_ca_cert\n- ssl_client_cert\n- ssl_client_cert_key\n- ssl_client_cert_key_password\n- ssl_ca_certs_from_system\n\nSet paths to the SSL-related files. See [Encryption and Authentication using SSL](https://github.com/zendesk/ruby-kafka#encryption-and-authentication-using-ssl) for more detail.\n\n#### SASL authentication\n\n##### with GSSAPI\n\n- principal\n- keytab\n\nSet the principal and the path to the keytab for SASL/GSSAPI authentication.\nSee [Authentication using SASL](https://github.com/zendesk/ruby-kafka#authentication-using-sasl) for more details.\n\n##### with Plain/SCRAM\n\n- username\n- password\n- scram_mechanism\n- sasl_over_ssl\n\nSet username, password, scram_mechanism and sasl_over_ssl for SASL/Plain or SCRAM authentication.\nSee [Authentication using SASL](https://github.com/zendesk/ruby-kafka#authentication-using-sasl) for more details.\n\n### Input plugin (@type 'kafka')\n\nConsume events with a single consumer.\n\n    \u003csource\u003e\n      @type kafka\n\n      brokers \u003cbroker1_host\u003e:\u003cbroker1_port\u003e,\u003cbroker2_host\u003e:\u003cbroker2_port\u003e,..\n      topics \u003clistening topics(separate with comma',')\u003e\n      format \u003cinput text type (text|json|ltsv|msgpack)\u003e :default =\u003e json\n      message_key \u003ckey (Optional, for text format only, default is message)\u003e\n      add_prefix \u003ctag prefix (Optional)\u003e\n      add_suffix \u003ctag suffix (Optional)\u003e\n\n      # Optionally, you can manage topic offset by using zookeeper\n      offset_zookeeper    \u003czookeeper node list (\u003czookeeper1_host\u003e:\u003czookeeper1_port\u003e,\u003czookeeper2_host\u003e:\u003czookeeper2_port\u003e,..)\u003e\n      offset_zk_root_node \u003coffset path in zookeeper\u003e default =\u003e '/fluent-plugin-kafka'\n\n      # ruby-kafka consumer options\n      max_bytes     (integer) :default =\u003e nil (Use default of ruby-kafka)\n      max_wait_time (integer) :default =\u003e nil (Use default of ruby-kafka)\n      min_bytes     (integer) :default =\u003e nil (Use default of ruby-kafka)\n    \u003c/source\u003e\n\nStarting processing from an assigned offset for specific topics is also supported.\n\n    \u003csource\u003e\n      @type kafka\n\n      brokers \u003cbroker1_host\u003e:\u003cbroker1_port\u003e,\u003cbroker2_host\u003e:\u003cbroker2_port\u003e,..\n      format \u003cinput text type (text|json|ltsv|msgpack)\u003e\n      \u003ctopic\u003e\n        topic     \u003clistening topic\u003e\n        partition \u003clistening partition: default=0\u003e\n        offset    \u003clistening start offset: default=-1\u003e\n      \u003c/topic\u003e\n      \u003ctopic\u003e\n        topic     \u003clistening topic\u003e\n        partition \u003clistening partition: default=0\u003e\n        offset    \u003clistening start offset: default=-1\u003e\n      \u003c/topic\u003e\n    \u003c/source\u003e\n\nSee also [ruby-kafka README](https://github.com/zendesk/ruby-kafka#consuming-messages-from-kafka) for more detailed documentation about ruby-kafka.\n\nThe consumed topic name is used as the event tag. 
So when the target topic name is `app_event`, the tag is `app_event`. If you want to modify the tag, use the `add_prefix` or `add_suffix` parameters. With `add_prefix kafka`, the tag is `kafka.app_event`.\n\n### Input plugin (@type 'kafka_group', supports kafka group)\n\nConsume events using Kafka consumer group features.\n\n    \u003csource\u003e\n      @type kafka_group\n\n      brokers \u003cbroker1_host\u003e:\u003cbroker1_port\u003e,\u003cbroker2_host\u003e:\u003cbroker2_port\u003e,..\n      consumer_group \u003cconsumer group name, must set\u003e\n      topics \u003clistening topics(separate with comma',')\u003e\n      format \u003cinput text type (text|json|ltsv|msgpack)\u003e :default =\u003e json\n      message_key \u003ckey (Optional, for text format only, default is message)\u003e\n      kafka_message_key \u003ckey (Optional, If specified, set kafka's message key to this key)\u003e\n      add_headers \u003cIf true, add kafka's message headers to record\u003e\n      add_prefix \u003ctag prefix (Optional)\u003e\n      add_suffix \u003ctag suffix (Optional)\u003e\n      retry_emit_limit \u003cWait retry_emit_limit x 1s when BufferQueueLimitError happens. The default is nil and it means waiting until BufferQueueLimitError is resolved\u003e\n      use_record_time (Deprecated. Use 'time_source record' instead.) \u003cIf true, replace event time with contents of 'time' field of fetched record\u003e\n      time_source \u003csource for message timestamp (now|kafka|record)\u003e :default =\u003e now\n      time_format \u003cstring (Optional when use_record_time is used)\u003e\n\n      # ruby-kafka consumer options\n      max_bytes               (integer) :default =\u003e 1048576\n      max_wait_time           (integer) :default =\u003e nil (Use default of ruby-kafka)\n      min_bytes               (integer) :default =\u003e nil (Use default of ruby-kafka)\n      offset_commit_interval  (integer) :default =\u003e nil (Use default of ruby-kafka)\n      offset_commit_threshold (integer) :default =\u003e nil (Use default of ruby-kafka)\n      fetcher_max_queue_size  (integer) :default =\u003e nil (Use default of ruby-kafka)\n      refresh_topic_interval  (integer) :default =\u003e nil (Use default of ruby-kafka)\n      start_from_beginning    (bool)    :default =\u003e true\n    \u003c/source\u003e\n\nSee also [ruby-kafka README](https://github.com/zendesk/ruby-kafka#consuming-messages-from-kafka) for more detailed documentation about ruby-kafka options.\n\n`topics` supports regex patterns since v0.13.1. If you want to use a regex pattern, use `/pattern/`, e.g. `/foo.*/`.\n\nThe consumed topic name is used as the event tag. So when the target topic name is `app_event`, the tag is `app_event`. If you want to modify the tag, use the `add_prefix` or `add_suffix` parameter. With `add_prefix kafka`, the tag is `kafka.app_event`.\n\n### Input plugin (@type 'rdkafka_group', supports kafka consumer groups, uses rdkafka-ruby)\n\n:warning: **The in_rdkafka_group consumer has not yet been tested under heavy production load. Use it at your own risk!**\n\nWith the introduction of the rdkafka-ruby based input plugin, we hope to support Kafka brokers above version 2.1, where we saw [compatibility issues](https://github.com/fluent/fluent-plugin-kafka/issues/315) when using the ruby-kafka based @kafka_group input type. 
The rdkafka-ruby lib wraps the highly performant and production-ready librdkafka C lib.\n\n    \u003csource\u003e\n      @type rdkafka_group\n      topics \u003clistening topics(separate with comma',')\u003e\n      format \u003cinput text type (text|json|ltsv|msgpack)\u003e :default =\u003e json\n      message_key \u003ckey (Optional, for text format only, default is message)\u003e\n      kafka_message_key \u003ckey (Optional, If specified, set kafka's message key to this key)\u003e\n      add_headers \u003cIf true, add kafka's message headers to record\u003e\n      add_prefix \u003ctag prefix (Optional)\u003e\n      add_suffix \u003ctag suffix (Optional)\u003e\n      retry_emit_limit \u003cWait retry_emit_limit x 1s when BufferQueueLimitError happens. The default is nil and it means waiting until BufferQueueLimitError is resolved\u003e\n      use_record_time (Deprecated. Use 'time_source record' instead.) \u003cIf true, replace event time with contents of 'time' field of fetched record\u003e\n      time_source \u003csource for message timestamp (now|kafka|record)\u003e :default =\u003e now\n      time_format \u003cstring (Optional when use_record_time is used)\u003e\n\n      # kafka consumer options\n      max_wait_time_ms 500\n      max_batch_size 10000\n      kafka_configs {\n        \"bootstrap.servers\": \"brokers \u003cbroker1_host\u003e:\u003cbroker1_port\u003e,\u003cbroker2_host\u003e:\u003cbroker2_port\u003e\",\n        \"group.id\": \"\u003cconsumer group name\u003e\"\n      }\n    \u003c/source\u003e\n\nSee also [rdkafka-ruby](https://github.com/appsignal/rdkafka-ruby) and [librdkafka](https://github.com/edenhill/librdkafka/blob/master/CONFIGURATION.md) for more detailed documentation about Kafka consumer options.\n\nThe consumed topic name is used as the event tag. So when the target topic name is `app_event`, the tag is `app_event`. If you want to modify the tag, use the `add_prefix` or `add_suffix` parameter. With `add_prefix kafka`, the tag is `kafka.app_event`.\n\n### Output plugin\n\nThe `kafka2` plugin is for fluentd v1 or later. This plugin uses the `ruby-kafka` producer for writing data.\nIf `ruby-kafka` doesn't fit your kafka environment, check the `rdkafka2` plugin instead. This will become the `out_kafka` plugin in the future.\n\n    \u003cmatch app.**\u003e\n      @type kafka2\n\n      brokers               \u003cbroker1_host\u003e:\u003cbroker1_port\u003e,\u003cbroker2_host\u003e:\u003cbroker2_port\u003e,.. # Set brokers directly\n\n      # Kafka topic, placeholders are supported. 
Chunk keys are required in the Buffer section in order for placeholders\n      # to work.\n      topic                 (string) :default =\u003e nil\n      topic_key             (string) :default =\u003e 'topic'\n      partition_key         (string) :default =\u003e 'partition'\n      partition_key_key     (string) :default =\u003e 'partition_key'\n      message_key_key       (string) :default =\u003e 'message_key'\n      default_topic         (string) :default =\u003e nil\n      default_partition_key (string) :default =\u003e nil\n      record_key            (string) :default =\u003e nil\n      default_message_key   (string) :default =\u003e nil\n      exclude_topic_key     (bool)   :default =\u003e false\n      exclude_partition_key (bool)   :default =\u003e false\n      exclude_partition     (bool)   :default =\u003e false\n      exclude_message_key   (bool)   :default =\u003e false\n      get_kafka_client_log  (bool)   :default =\u003e false\n      headers               (hash)   :default =\u003e {}\n      headers_from_record   (hash)   :default =\u003e {}\n      use_event_time        (bool)   :default =\u003e false\n      use_default_for_unknown_topic (bool) :default =\u003e false\n      discard_kafka_delivery_failed (bool) :default =\u003e false (No discard)\n      partitioner_hash_function (enum) (crc32|murmur2) :default =\u003e 'crc32'\n      share_producer        (bool)   :default =\u003e false\n      idempotent            (bool)   :default =\u003e false\n\n      # If you intend to rely on AWS IAM auth to MSK with long-lived credentials\n      # https://docs.aws.amazon.com/msk/latest/developerguide/iam-access-control.html\n      #\n      # For AWS STS support, see status in\n      # - https://github.com/zendesk/ruby-kafka/issues/944\n      # - https://github.com/zendesk/ruby-kafka/pull/951\n      sasl_aws_msk_iam_access_key_id (string) :default =\u003e nil\n      sasl_aws_msk_iam_secret_key_id (string) :default =\u003e nil\n      sasl_aws_msk_iam_aws_region    (string) :default =\u003e nil\n\n      \u003cformat\u003e\n        @type (json|ltsv|msgpack|attr:\u003crecord name\u003e|\u003cformatter name\u003e) :default =\u003e json\n      \u003c/format\u003e\n\n      # Optional. See https://docs.fluentd.org/v/1.0/configuration/inject-section\n      \u003cinject\u003e\n        tag_key tag\n        time_key time\n      \u003c/inject\u003e\n\n      # See fluentd document for buffer related parameters: https://docs.fluentd.org/v/1.0/configuration/buffer-section\n      # Buffer chunk key should be the same as topic_key. If the value is not found in the record, default_topic is used.\n      \u003cbuffer topic\u003e\n        flush_interval 10s\n      \u003c/buffer\u003e\n\n      # ruby-kafka producer options\n      idempotent        (bool)    :default =\u003e false\n      sasl_over_ssl     (bool)    :default =\u003e true\n      max_send_retries  (integer) :default =\u003e 1\n      required_acks     (integer) :default =\u003e -1\n      ack_timeout       (integer) :default =\u003e nil (Use default of ruby-kafka)\n      compression_codec (string)  :default =\u003e nil (No compression. Depends on ruby-kafka: https://github.com/zendesk/ruby-kafka#compression)\n    \u003c/match\u003e\n\nThe `\u003cformatter name\u003e` in `\u003cformat\u003e` uses fluentd's formatter plugins. See [formatter article](https://docs.fluentd.org/v/1.0/formatter).
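\n\nFor reference, a minimal working `kafka2` configuration derived from the parameter list above might look like the following sketch. The broker addresses and the `app_events` topic are placeholders; adjust them for your environment.\n\n    \u003cmatch app.**\u003e\n      @type kafka2\n\n      # illustrative values only\n      brokers broker1:9092,broker2:9092\n      default_topic app_events\n\n      \u003cformat\u003e\n        @type json\n      \u003c/format\u003e\n\n      \u003cinject\u003e\n        tag_key tag\n        time_key time\n      \u003c/inject\u003e\n\n      # chunk by topic so the default_topic fallback applies\n      \u003cbuffer topic\u003e\n        flush_interval 10s\n      \u003c/buffer\u003e\n    \u003c/match\u003e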
\n\n**Note:** The Java-based Kafka client uses `murmur2` as its partitioner function by default. If you want the same partitioning behavior with fluent-plugin-kafka, change it to `murmur2` instead of `crc32`. Note that to use the `murmur2` hash partitioner function, you must install the `digest-murmurhash` gem.\n\nruby-kafka sometimes returns a `Kafka::DeliveryFailed` error without useful information.\nIn this case, `get_kafka_client_log` is useful for identifying the error cause.\nruby-kafka's log is routed to the fluentd log, so you can see ruby-kafka's messages in the fluentd logs.\n\nThe following ruby-kafka producer options are supported.\n\n- max_send_retries - default: 2 - Number of times to retry sending messages to a leader.\n- required_acks - default: -1 - The number of acks required per request. If you need flush performance, set a lower value, e.g. 1 or 2.\n- ack_timeout - default: nil - How long the producer waits for acks. The unit is seconds.\n- compression_codec - default: nil - The codec the producer uses to compress messages.\n- max_send_limit_bytes - default: nil - Maximum byte size of a message to send, to avoid MessageSizeTooLarge. For example, if you set 1000000 (message.max.bytes in kafka), messages larger than 1000000 bytes will be dropped.\n- discard_kafka_delivery_failed - default: false - discard the record where [Kafka::DeliveryFailed](http://www.rubydoc.info/gems/ruby-kafka/Kafka/DeliveryFailed) occurred\n\nFor details about monitoring, see also https://github.com/zendesk/ruby-kafka#monitoring\n\nSee also [Kafka::Client](http://www.rubydoc.info/gems/ruby-kafka/Kafka/Client) for more detailed documentation about ruby-kafka.\n\nThis plugin also supports the \"snappy\" compression codec.\nInstall the snappy gem before you use snappy compression.\n\n    $ gem install snappy --no-document\n\nThe snappy gem uses a native extension, so you need to install several packages beforehand.\nOn Ubuntu, you need the development packages and the snappy library.\n\n    $ sudo apt-get install build-essential autoconf automake libtool libsnappy-dev\n\nOn CentOS 7, a similar installation is also necessary.\n\n    $ sudo yum install gcc autoconf automake libtool snappy-devel\n\nThis plugin also supports the \"lz4\" compression codec.\nInstall the extlz4 gem before you use lz4 compression.\n\n    $ gem install extlz4 --no-document\n\nThis plugin also supports the \"zstd\" compression codec.\nInstall the zstd-ruby gem before you use zstd compression.\n\n    $ gem install zstd-ruby --no-document\n\n#### Load balancing\n\nBy default, ruby-kafka assigns each message to a partition at random, but messages with the same partition key will always be assigned to the same partition if you set `default_partition_key` in the config file.\nIf the key `partition_key_key` exists in a message, this plugin uses the value of partition_key_key as the partition key.\n\n|default_partition_key|partition_key_key| behavior |\n| --- | --- | --- |\n|Not set|Not exists| All messages are assigned a partition at random |\n|Set| Not exists| All messages are assigned to the specific partition |\n|Not set| Exists | Messages which have a partition_key_key record are assigned to the specific partition, others are assigned a partition at random |\n|Set| Exists | Messages which have a partition_key_key record are assigned to the specific partition with partition_key_key, others are assigned to the specific partition with default_partition_key |\n\nIf the key `message_key_key` exists in a message, this plugin publishes the value of message_key_key to kafka, where it can be read by consumers. The same message key will be assigned to all messages by setting `default_message_key` in the config file. If message_key_key exists and partition_key_key is not set explicitly, message_key_key will be used for partitioning.
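\n\nFor example, with the default `partition_key_key` (`partition_key`) and `message_key_key` (`message_key`) settings, a record like the following (the field values are purely illustrative) would be published with `user-123` as its partition key and `order-42` as its Kafka message key, while records without those fields fall back to the behavior described above:\n\n    {\"partition_key\": \"user-123\", \"message_key\": \"order-42\", \"payload\": \"hello world\"}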
\n\n#### Headers\nIt is possible to set headers on Kafka messages. This only works for the kafka2 and rdkafka2 output plugins.\n\nThe format is like key1:value1,key2:value2. For example:\n\n    \u003cmatch app.**\u003e\n      @type kafka2\n      [...]\n      headers some_header_name:some_header_value\n    \u003c/match\u003e\n\nYou may set header values based on the value of a fluentd record field. For example, imagine a fluentd record like:\n\n    {\"source\": { \"ip\": \"127.0.0.1\" }, \"payload\": \"hello world\" }\n\nAnd the following fluentd config:\n\n    \u003cmatch app.**\u003e\n      @type kafka2\n      [...]\n      headers_from_record source_ip:$.source.ip\n    \u003c/match\u003e\n\nThe Kafka message will have a header of source_ip=127.0.0.1.\n\nThe configuration format is jsonpath. It is described in https://docs.fluentd.org/plugin-helper-overview/api-plugin-helper-record_accessor\n\n#### Excluding fields\nFields can be excluded from output data. This only works for the kafka2 and rdkafka2 output plugins.\n\nFields must be specified using an array of dot notation `$.`, for example:\n\n    \u003cmatch app.**\u003e\n      @type kafka2\n      [...]\n      exclude_fields $.source.ip,$.HTTP_FOO\n    \u003c/match\u003e\n\nThis config can be used to remove fields that are only used by other configs.\n\nFor example, `$.source.ip` can be extracted with the `headers_from_record` config and excluded from the message payload.\n\n\u003e Using this config to remove unused fields is discouraged. A [filter plugin](https://docs.fluentd.org/v/0.12/filter) can be used for this purpose.\n\n#### Send only a sub field as a message payload\n\nIf `record_key` is provided, the plugin sends only the sub field given by that key.\nThe configuration format is jsonpath.\n\nFor example, when the following configuration and incoming record are given:\n\nconfiguration:\n\n    \u003cmatch **\u003e\n      @type kafka2\n      [...]\n      record_key '$.data'\n    \u003c/match\u003e\n\nrecord:\n\n    {\n        \"specversion\" : \"1.0\",\n        \"type\" : \"com.example.someevent\",\n        \"id\" : \"C234-1234-1234\",\n        \"time\" : \"2018-04-05T17:31:00Z\",\n        \"datacontenttype\" : \"application/json\",\n        \"data\" : {\n            \"appinfoA\" : \"abc\",\n            \"appinfoB\" : 123,\n            \"appinfoC\" : true\n        },\n        ...\n    }\n\nonly the `data` field will be serialized by the formatter and sent to Kafka.\nThe top-level `data` key will be removed.\n\n### Buffered output plugin\n\nThis plugin uses the ruby-kafka producer for writing data. This plugin is for v0.12. If you use v1, see `kafka2`.\nSupport for fluentd v0.12 has ended. `kafka_buffered` will be an alias of `kafka2` and will be removed in the future.\n\n    \u003cmatch app.**\u003e\n      @type kafka_buffered\n\n      # Brokers: you can choose either brokers or zookeeper. If you are not familiar with zookeeper, use the brokers parameter.\n      brokers             \u003cbroker1_host\u003e:\u003cbroker1_port\u003e,\u003cbroker2_host\u003e:\u003cbroker2_port\u003e,.. 
# Set brokers directly\n      zookeeper           \u003czookeeper_host\u003e:\u003czookeeper_port\u003e # Set brokers via Zookeeper\n      zookeeper_path      \u003cbroker path in zookeeper\u003e :default =\u003e /brokers/ids # Set path in zookeeper for kafka\n\n      topic_key             (string) :default =\u003e 'topic'\n      partition_key         (string) :default =\u003e 'partition'\n      partition_key_key     (string) :default =\u003e 'partition_key'\n      message_key_key       (string) :default =\u003e 'message_key'\n      default_topic         (string) :default =\u003e nil\n      default_partition_key (string) :default =\u003e nil\n      default_message_key   (string) :default =\u003e nil\n      exclude_topic_key     (bool)   :default =\u003e false\n      exclude_partition_key (bool)   :default =\u003e false\n      exclude_partition     (bool)   :default =\u003e false\n      exclude_message_key   (bool)   :default =\u003e false\n      output_data_type      (json|ltsv|msgpack|attr:\u003crecord name\u003e|\u003cformatter name\u003e) :default =\u003e json\n      output_include_tag    (bool) :default =\u003e false\n      output_include_time   (bool) :default =\u003e false\n      get_kafka_client_log  (bool) :default =\u003e false\n      use_event_time        (bool) :default =\u003e false\n      partitioner_hash_function (enum) (crc32|murmur2) :default =\u003e 'crc32'\n\n      # See fluentd document for buffer related parameters: https://docs.fluentd.org/v/0.12/buffer\n\n      # ruby-kafka producer options\n      idempotent                   (bool)    :default =\u003e false\n      sasl_over_ssl                (bool)    :default =\u003e true\n      max_send_retries             (integer) :default =\u003e 1\n      required_acks                (integer) :default =\u003e -1\n      ack_timeout                  (integer) :default =\u003e nil (Use default of ruby-kafka)\n      compression_codec            (string)  :default =\u003e nil (No compression. Depends on ruby-kafka: https://github.com/zendesk/ruby-kafka#compression)\n      kafka_agg_max_bytes          (integer) :default =\u003e 4096\n      kafka_agg_max_messages       (integer) :default =\u003e nil (No limit)\n      max_send_limit_bytes         (integer) :default =\u003e nil (No drop)\n      discard_kafka_delivery_failed   (bool) :default =\u003e false (No discard)\n      monitoring_list              (array)   :default =\u003e []\n    \u003c/match\u003e\n\n`kafka_buffered` supports the following `ruby-kafka` parameters:\n\n- max_send_retries - default: 2 - Number of times to retry sending messages to a leader.\n- required_acks - default: -1 - The number of acks required per request. If you need flush performance, set a lower value, e.g. 1 or 2.\n- ack_timeout - default: nil - How long the producer waits for acks. The unit is seconds.\n- compression_codec - default: nil - The codec the producer uses to compress messages.\n- max_send_limit_bytes - default: nil - Maximum byte size of a message to send, to avoid MessageSizeTooLarge. For example, if you set 1000000 (message.max.bytes in kafka), messages larger than 1000000 bytes will be dropped.\n- discard_kafka_delivery_failed - default: false - discard the record where [Kafka::DeliveryFailed](http://www.rubydoc.info/gems/ruby-kafka/Kafka/DeliveryFailed) occurred\n- monitoring_list - default: [] - the library to be used for monitoring. 
statsd and datadog are supported.\n\n`kafka_buffered` has two additional parameters:\n\n- kafka_agg_max_bytes - default: 4096 - Maximum value of total message size to be included in one batch transmission.\n- kafka_agg_max_messages - default: nil - Maximum number of messages to include in one batch transmission.\n\n**Note:** The Java-based Kafka client uses `murmur2` as its partitioner function by default. If you want the same partitioning behavior with fluent-plugin-kafka, change it to `murmur2` instead of `crc32`. Note that to use the `murmur2` hash partitioner function, you must install the `digest-murmurhash` gem.\n\n### Non-buffered output plugin\n\nThis plugin uses the ruby-kafka producer for writing data. For performance and reliability concerns, use the `kafka_buffered` output instead. This is mainly for testing.\n\n    \u003cmatch app.**\u003e\n      @type kafka\n\n      # Brokers: you can choose either brokers or zookeeper.\n      brokers        \u003cbroker1_host\u003e:\u003cbroker1_port\u003e,\u003cbroker2_host\u003e:\u003cbroker2_port\u003e,.. # Set brokers directly\n      zookeeper      \u003czookeeper_host\u003e:\u003czookeeper_port\u003e # Set brokers via Zookeeper\n      zookeeper_path \u003cbroker path in zookeeper\u003e :default =\u003e /brokers/ids # Set path in zookeeper for kafka\n\n      default_topic         (string) :default =\u003e nil\n      default_partition_key (string) :default =\u003e nil\n      default_message_key   (string) :default =\u003e nil\n      output_data_type      (json|ltsv|msgpack|attr:\u003crecord name\u003e|\u003cformatter name\u003e) :default =\u003e json\n      output_include_tag    (bool) :default =\u003e false\n      output_include_time   (bool) :default =\u003e false\n      exclude_topic_key     (bool) :default =\u003e false\n      exclude_partition_key (bool) :default =\u003e false\n      partitioner_hash_function (enum) (crc32|murmur2) :default =\u003e 'crc32'\n\n      # ruby-kafka producer options\n      max_send_retries    (integer) :default =\u003e 1\n      required_acks       (integer) :default =\u003e -1\n      ack_timeout         (integer) :default =\u003e nil (Use default of ruby-kafka)\n      compression_codec   (string)  :default =\u003e nil (No compression. Depends on ruby-kafka: https://github.com/zendesk/ruby-kafka#compression)\n      max_buffer_size     (integer) :default =\u003e nil (Use default of ruby-kafka)\n      max_buffer_bytesize (integer) :default =\u003e nil (Use default of ruby-kafka)\n    \u003c/match\u003e\n\nThis plugin also supports ruby-kafka related parameters. See the Buffered output plugin section.\n\n**Note:** The Java-based Kafka client uses `murmur2` as its partitioner function by default. If you want the same partitioning behavior with fluent-plugin-kafka, change it to `murmur2` instead of `crc32`. Note that to use the `murmur2` hash partitioner function, you must install the `digest-murmurhash` gem.\n\n### rdkafka based output plugin\n\nThis plugin uses `rdkafka` instead of `ruby-kafka` as the kafka client.\nYou need to install the rdkafka gem.\n\n    # rdkafka is a C extension library. You need to install development tools like ruby-devel, gcc, etc.\n    # for v0.12 or later\n    $ gem install rdkafka --no-document\n    # for v0.11 or earlier\n    $ gem install rdkafka -v 0.6.0 --no-document\n\n`rdkafka2` is for fluentd v1.0 or later.\n\n    \u003cmatch app.**\u003e\n      @type rdkafka2\n\n      brokers \u003cbroker1_host\u003e:\u003cbroker1_port\u003e,\u003cbroker2_host\u003e:\u003cbroker2_port\u003e,.. 
# Set brokers directly\n\n      topic_key             (string) :default =\u003e 'topic'\n      default_topic         (string) :default =\u003e nil\n      partition_key         (string) :default =\u003e 'partition'\n      partition_key_key     (string) :default =\u003e 'partition_key'\n      message_key_key       (string) :default =\u003e 'message_key'\n      use_default_for_unknown_topic           (bool) :default =\u003e false\n      use_default_for_unknown_partition_error (bool) :default =\u003e false\n      default_partition_key (string) :default =\u003e nil\n      default_message_key   (string) :default =\u003e nil\n      exclude_topic_key     (bool) :default =\u003e false\n      exclude_partition_key (bool) :default =\u003e false\n      discard_kafka_delivery_failed (bool) :default =\u003e false (No discard)\n      discard_kafka_delivery_failed_regex (regexp) :default =\u003e nil (No discard)\n      use_event_time        (bool) :default =\u003e false\n\n      # same as kafka2\n      headers               (hash) :default =\u003e {}\n      headers_from_record   (hash) :default =\u003e {}\n      record_key            (string) :default =\u003e nil\n\n      \u003cformat\u003e\n        @type (json|ltsv|msgpack|attr:\u003crecord name\u003e|\u003cformatter name\u003e) :default =\u003e json\n      \u003c/format\u003e\n\n      # Optional. See https://docs.fluentd.org/v/1.0/configuration/inject-section\n      \u003cinject\u003e\n        tag_key tag\n        time_key time\n      \u003c/inject\u003e\n\n      # See fluentd document for buffer section parameters: https://docs.fluentd.org/v/1.0/configuration/buffer-section\n      # Buffer chunk key should be the same as topic_key. If the value is not found in the record, default_topic is used.\n      \u003cbuffer topic\u003e\n        flush_interval 10s\n      \u003c/buffer\u003e\n\n      # You can set any rdkafka configuration via this parameter: https://github.com/edenhill/librdkafka/blob/master/CONFIGURATION.md\n      rdkafka_options {\n        \"log_level\" : 7\n      }\n\n      # rdkafka2 specific parameters\n\n      # Share the kafka producer between flush threads. This is mainly for reducing kafka operations like Kerberos authentication.\n      share_producer (bool) :default =\u003e false\n      # Timeout for polling message wait. If 0, no wait.\n      rdkafka_delivery_handle_poll_timeout (integer) :default =\u003e 30\n      # If the record size is larger than this value, such records are ignored. Default is no limit\n      max_send_limit_bytes (integer) :default =\u003e nil\n      # The maximum number of bytes enqueued per second. It can reduce the\n      # load on both Fluentd and Kafka when too many messages would\n      # otherwise be sent. Default is no limit.\n      max_enqueue_bytes_per_second (integer) :default =\u003e nil\n      unrecoverable_error_codes (array) :default =\u003e [\"topic_authorization_failed\", \"msg_size_too_large\"]\n\n    \u003c/match\u003e\n\n`rdkafka2` supports the `discard_kafka_delivery_failed_regex` parameter:\n- `discard_kafka_delivery_failed_regex` - default: nil - discard records for which Kafka::DeliveryFailed occurred and the emitted message matches the given regex pattern, such as `/unknown_topic/`.
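\n\nFor example, a sketch of an `rdkafka2` block using this parameter might look like the following. The broker address and topic are placeholders and this only illustrates the syntax; combine it with whatever other options your setup needs.\n\n    \u003cmatch app.**\u003e\n      @type rdkafka2\n\n      # illustrative values only\n      brokers broker1:9092\n      default_topic app_events\n\n      # drop records whose delivery failure message mentions unknown_topic\n      discard_kafka_delivery_failed_regex /unknown_topic/\n\n      \u003cformat\u003e\n        @type json\n      \u003c/format\u003e\n\n      \u003cbuffer topic\u003e\n        flush_interval 10s\n      \u003c/buffer\u003e\n    \u003c/match\u003e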
\n\nIf you use v0.12, use `rdkafka` instead.\n\n    \u003cmatch kafka.**\u003e\n      @type rdkafka\n\n      default_topic kafka\n      flush_interval 1s\n      output_data_type json\n\n      rdkafka_options {\n        \"log_level\" : 7\n      }\n    \u003c/match\u003e\n\n## FAQ\n\n### Why can't fluent-plugin-kafka send data to our kafka cluster?\n\nWe get lots of similar questions. In almost all cases, this problem is caused by a version mismatch between ruby-kafka and the kafka cluster.\nSee the ruby-kafka README for more details: https://github.com/zendesk/ruby-kafka#compatibility\n\nTo avoid the problem, there are two approaches:\n\n- Upgrade your kafka cluster to the latest version. This is better because recent versions are faster and more robust.\n- Downgrade ruby-kafka/fluent-plugin-kafka to work with your older kafka.\n\n## Contributing\n\n1. Fork it\n2. Create your feature branch (`git checkout -b my-new-feature`)\n3. Commit your changes (`git commit -am 'Added some feature'`)\n4. Push to the branch (`git push origin my-new-feature`)\n5. Create a new Pull Request\n","funding_links":[],"categories":["Ruby","Operations"],"sub_categories":["Monitoring"],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffluent%2Ffluent-plugin-kafka","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ffluent%2Ffluent-plugin-kafka","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffluent%2Ffluent-plugin-kafka/lists"}