https://github.com/buoyant-data/hotdog

Hotdog is a syslog-to-Kafka forwarder which aims to get log entries into Apache Kafka as quickly as possible.
Host: GitHub
URL: https://github.com/buoyant-data/hotdog
Owner: buoyant-data
License: agpl-3.0
Created: 2020-04-15T20:25:25.000Z (about 6 years ago)
Default Branch: main
Last Pushed: 2025-05-02T17:15:01.000Z (12 months ago)
Last Synced: 2025-10-14T11:17:26.778Z (6 months ago)
Topics: async, kafka, rust, syslog
Language: Rust
Homepage:
Size: 511 KB
Stars: 47
Watchers: 2
Forks: 8
Open Issues: 5
Metadata Files:
- Readme: README.adoc
- License: LICENSE.txt
- Code of conduct: CODE_OF_CONDUCT.md

README

ifdef::env-github[]

:tip-caption: :bulb:

:note-caption: :information_source:

:important-caption: :heavy_exclamation_mark:

:caution-caption: :fire:

:warning-caption: :warning:

endif::[]

:toc: macro

= 🌭 Hotdog!

Hotdog is a syslog-to-Kafka forwarder which aims to get log entries into

link:https://kafka.apache.org[Apache Kafka]

as quickly as possible.

It listens for syslog messages over plaintext or TLS-encrypted TCP connections
and, depending on the defined <<rules,rules>>, it will route and even modify messages
on their way into a <<yml-kafka-topic,Kafka topic>>.

toc::[]

== Features

* syslog over plaintext or TLS-encrypted TCP connections.
* <<rules,Rules>> and <<actions,actions>> for matching, modifying, and routing syslog
  messages based on the message content.
* Rich integration with Kafka with <<yml-kafka-conf,configuration pass-through>> for
  link:https://github.com/edenhill/librdkafka[librdkafka].
* Built-in <<metrics,metrics>> for daemon health reporting.

[source,bash]

----

Hotdog 1.0.0

R Tyler Croy 

Forward syslog with ease

USAGE:

    hotdog [OPTIONS]

FLAGS:

    -h, --help       Prints help information

    -V, --version    Prints version information

OPTIONS:

    -c, --config        Sets a custom config file [default: hotdog.yml]

    -t, --test     Test a log file against the configured rules

----

[[install]]

== Installation

Hotdog can be installed by grabbing a

link:https://github.com/reiseburo/hotdog/releases[released binary].

The system which will run `hotdog` *must* have `libsasl2` installed, e.g.:

.Ubuntu

[source,bash]

----

sudo apt-get install libsasl2-2

----

.openSUSE

[source,bash]

----

sudo zypper install cyrus-sasl-devel

----

[[performance]]

=== Performance

By default `hotdog` will run with a single background thread for processing

incoming messages. It is recommended to set `SMOL_THREADS` to the number of

CPUs which should be utilized on the machine.
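
As a minimal sketch, a launcher script could set `SMOL_THREADS` to the machine's CPU count before starting the daemon. The helper name `hotdog_env` and the `./hotdog` binary path are assumptions made for illustration only.

```python
import os

def hotdog_env(threads=None):
    """Return a copy of the current environment with SMOL_THREADS set."""
    env = dict(os.environ)
    # Fall back to 1 when the CPU count cannot be determined.
    env["SMOL_THREADS"] = str(threads or os.cpu_count() or 1)
    return env

# Example launch (hypothetical binary path), left commented out on purpose:
# import subprocess
# subprocess.run(["./hotdog", "-c", "hotdog.yml"], env=hotdog_env())
```

Passing an explicit `threads` value is useful when only some of the CPUs should be dedicated to `hotdog`.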

[[configuration]]

== Configuration

Hotdog is configured by the `hotdog.yml` file, which has a very fluid syntax at

the moment. The two main sections are the `global` and `rules` blocks.

Rules defined in the configuration can be tested against an example log file in

order to verify that the right rules are matching the expected log inputs, for

example:

[source,bash]

----

❯ RUST_LOG=info ./target/debug/hotdog -t example.log

Line 1 matches on:

         - Regex: ^hello\s+(?P<name>\w+)?

         - Regex: .*

Line 2 matches on:

         - Regex: .*

Line 3 matches on:

         - Regex: .*

Line 4 matches on:

         - JMESPath: meta.topic

         - Regex: .*

----

[[global]]

=== Global

The `global` configuration configures `hotdog` itself. The <<yml-listen,listen>>, <<yml-kafka,kafka>>, and <<yml-metrics,metrics>> keys are all
required by default in order for `hotdog` to start properly.

[[yml-listen]]

==== Listen

The `global.listen` configuration is required and will determine on which

address and port `hotdog` will listen. The <<yml-listen-tls,tls>>
configuration key must be present as well. When `tls` is left blank,

`hotdog` will listen for syslog messages in plaintext on the specified `port`.

.hotdog.yml

[source,yaml]

----

global:

  listen:

    address: '127.0.0.1'

    port: 1514

    tls:

----

[[yml-listen-tls]]

===== TLS

The `global.listen.tls` configuration section can be used to enable
syslog-over-TLS support in `hotdog`. The `cert` and `key` keys should both be
absolute or relative paths to PEM-encoded files on disk. An optional `ca` key,
when provided, enables certificate validation.

Certificate and key files can be created with GnuTLS `certtool`; for example,
`certtool --generate-privkey --outfile ca-key.pem` generates a private key.

.hotdog.yml

[source,yaml]

----

global:

  listen:

    tls:

      cert: './a/path.crt'

      key: './a/path.key'

      # ca is optional and when provided will ensure certificate validation

      # happens

      ca: './a/ca.crt'

----

[[yml-status]]

==== Status

The `global.status` section is optional. When it is present, `hotdog` launches
an HTTP status server on the specified `address` and `port`, which provides
real-time information about the daemon's runtime to HTTP clients.
JSON-formatted statistics can be retrieved from `/stats`.

.hotdog.yml

[source,yaml]

----

global:

  status:

    address: '127.0.0.1'

    port: 8585

----
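
A minimal sketch of polling the status endpoint from a monitoring script, assuming the server speaks plain HTTP on the configured address and port. The function names are hypothetical, and the shape of the returned JSON is not documented here, so the sketch returns it undecorated.

```python
import json
from urllib.request import urlopen

def stats_url(address="127.0.0.1", port=8585):
    """Build the URL of the status endpoint from the global.status values."""
    return f"http://{address}:{port}/stats"

def fetch_stats(address="127.0.0.1", port=8585, timeout=5.0):
    """Fetch and decode the JSON statistics document."""
    with urlopen(stats_url(address, port), timeout=timeout) as response:
        return json.load(response)
```

Calling `fetch_stats()` against a running daemon returns whatever JSON document `/stats` serves. A connection error is raised if the status server is not enabled.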

[[yml-kafka]]

==== Kafka

A `global.kafka` configuration is required in order for `hotdog` to function

properly. The two main configuration values are <<yml-kafka-conf,conf>> and <<yml-kafka-topic,topic>>.

.hotdog.yml

[source,yaml]

----

global:

  kafka:

    conf:

      bootstrap.servers: 'localhost:9092'

      client.id: 'hotdog'

    topic: 'logs'

----

[[yml-kafka-buffer]]

===== Buffer

**Default:** `1024`

`global.kafka.buffer` may contain a number indicating the size of the internal

queue for sending messages to Kafka. This queue represents the number of

internal messages `hotdog` will buffer during Kafka availability issues.

This value is *not* the same as the librdkafka `queue.buffering.max.messages`

configuration, which governs the number of in-flight messages which can be sent

at any given time to the Kafka broker(s). To set that variable, include it in
the <<yml-kafka-conf,conf>> section documented below.

[CAUTION]

====

If the internal Kafka queue has been filled up, new log lines received by

`hotdog` will be discarded.

====
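
The drop-on-full behavior can be illustrated with a bounded queue. This is a sketch of the semantics only, not `hotdog`'s implementation. Nothing drains the queue here, which stands in for Kafka being unavailable.

```python
import queue

def enqueue_all(lines, capacity=1024):
    """Push lines into a bounded queue; count those discarded when it is full."""
    buffer = queue.Queue(maxsize=capacity)
    dropped = 0
    for line in lines:
        try:
            buffer.put_nowait(line)  # never blocks; raises queue.Full instead
        except queue.Full:
            dropped += 1
    return buffer.qsize(), dropped
```

With a capacity of 3 and five incoming lines, three are buffered and two are discarded, which is why `buffer` should be sized for the longest Kafka outage the deployment needs to tolerate.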

[[yml-kafka-conf]]

===== Conf

`global.kafka.conf` should contain a map of

link:https://github.com/edenhill/librdkafka/blob/master/CONFIGURATION.md[librdkafka configuration values].

`hotdog` will expect every key _and_ value to be a String. These configuration

values are passed right on to the underlying librdkafka client connection, so

whatever librdkafka supports, `hotdog` supports!
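
Because every key and value must be a String, numeric librdkafka settings need to be quoted in YAML, otherwise the YAML parser yields an integer. The following sketch shows a pre-flight check on an already-parsed `conf` map. The helper name is hypothetical, and how `hotdog` itself reacts to a non-string value is not covered here.

```python
def non_string_entries(conf):
    """Return the keys whose key or value is not a str."""
    return [key for key, value in conf.items()
            if not isinstance(key, str) or not isinstance(value, str)]

conf = {
    "bootstrap.servers": "localhost:9092",
    "client.id": "hotdog",
    "queue.buffering.max.messages": 100000,  # unquoted in YAML: parsed as an integer
}
```

Here `non_string_entries(conf)` flags `queue.buffering.max.messages`. Writing the value as `'100000'` in `hotdog.yml` resolves it.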

[[yml-kafka-timeout_ms]]

===== timeout_ms

**Default:** `30_000`

`global.kafka.timeout_ms` is an optional configuration which defines the

timeout in milliseconds for `hotdog` to make an initial connection to the

configured Kafka brokers.

[[yml-kafka-topic]]

===== Topic

`global.kafka.topic` may contain a string value which is to be considered the
"default topic" for the <<action-forward,forward action>>.

[[yml-parquet]]

==== Parquet

The link:https://parquet.apache.org[Apache Parquet] sink allows for writing
directly to any `url` supported by
link:https://docs.rs/object_store/latest/object_store/index.html[object_store].

[source,yaml]

----

global:

  parquet:

    url: 's3://hotdog/streams/'

    # Bytes to buffer

    buffer: 1024000

    flush_ms: 60000

----

[TIP]

====

The `url` can be omitted from the configuration and specified in the environment via `S3_OUTPUT_URL`.

====

[[yml-metrics]]

==== Metrics

The `global.metrics` configuration tells `hotdog` where to send its own

internal metrics. The only _currently_ supported metrics format is

link:https://github.com/statsd/statsd[statsd].

If your environment doesn't use statsd or you do not wish to report metrics,

set the `statsd` value to an invalid host and port.

.hotdog.yml

[source,yaml]

----

global:

  metrics:

    statsd: 'localhost:8125'

----


[[rules]]

=== Rules

Hotdog's rules define how it should handle and route the syslog messages it

receives. In the `hotdog.yml`, the rules must be defined as an array of maps.

Each rule is expected to have a "matcher" (either <<rules-regex,regex>> or
<<rules-jmespath,jmespath>>), the `field` upon which the matcher should
apply, and the <<actions,actions>> defining how the message should be
handled.

.hotdog.yml

[source,yaml]

----

rules:

  - jmespath: 'meta.topic'

    field: msg

    actions:

      - type: forward

        topic: '{{value}}'

  # Catch-all, send everything else to a "logs-unknown" topic

  - regex: '.*'

    field: msg

    actions:

      - type: forward

        topic: 'logs-unknown'

----

.Supported Fields

|===

| Name | Notes

| `msg`

| The actual message sent along from the syslog server

| `hostname`

| The sender's hostname, if available.

| `appname`

| The logging application, if available, which created the syslog entry

| `facility`

| The syslog logging facility, if available, which was used to create the syslog message. For example `kern`, `user`, `auth`, etc.

| `severity`

| The severity of the syslog message, if available. For example: `notice`, `err`, `crit`, etc.

|===

[[rules-regex]]

==== Matching with regular expressions

The `regex` matcher instructs `hotdog` to match the `field` against the defined

regular expression, which must follow the syntax of the

link:https://docs.rs/regex/1.3.7/regex/#syntax[regex crate].

The matcher supports named groups in the regular expression, which are then exposed to actions such as
<<action-merge,merge>> and <<action-replace,replace>>.

[CAUTION]

====

Named groups will **override** any built-in variables at the time of

substitution, so be careful you are not naming your groups anything which might

overlap with the built-in variable names.

====
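
The override described in the caution can be sketched as a dictionary update in which named groups are applied last. Python's `re` module stands in for the Rust regex crate here, and the helper name, the default version string, and the choice to skip groups that did not participate in the match are all assumptions of this sketch.

```python
import re
from datetime import datetime, timezone

def build_variables(pattern, msg, version="1.0.0"):
    """Combine built-in variables with named groups; groups win on a name clash."""
    match = re.search(pattern, msg)
    if match is None:
        return None
    variables = {
        "msg": msg,
        "version": version,
        "iso8601": datetime.now(timezone.utc).isoformat(),
    }
    # Only groups that actually participated in the match are copied over.
    variables.update({k: v for k, v in match.groupdict().items() if v is not None})
    return variables
```

A group named `name` simply adds a variable, whereas a group named `msg` would replace the original log line in any later substitution.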

[[rules-jmespath]]

==== Matching with JMESPath

`hotdog` also supports matching on JSON based messages with

link:https://jmespath.org/[JMESPath] via the `jmespath` matcher. In order for a

match, the log message must be a valid JSON object or array. The value of the

match is also then exposed as a <<variables,variable>> named `value`, which
can be used in actions such as <<action-merge,merge>> or <<action-replace,replace>>.
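
To show the idea without a JMESPath library, the sketch below resolves only a dotted chain of identifiers such as `meta.topic`. This is a small subset of JMESPath swapped in for illustration, it is not the matcher `hotdog` uses, and the function name is hypothetical.

```python
import json

def lookup(line, path):
    """Resolve a dotted key path such as 'meta.topic' against a JSON log line."""
    try:
        node = json.loads(line)
    except json.JSONDecodeError:
        return None  # not JSON, so the matcher cannot apply
    for key in path.split("."):
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node
```

For a log line carrying `{"meta": {"topic": "logs-app"}}`, the lookup yields `logs-app`, which is the kind of result a rule would then see as `value`.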

[[variables]]

==== Variables

Some actions, such as <<action-replace,replace>>, can perform variable substitutions on
the log line. The variables available are a combination of the built-in variables
listed below, and whatever named groups exist in the `regex` field of the
<<rules,rule>>.

[[builtin-vars]]

.Built-in Variables

|===

| Name | Description

| `msg`

| The original log line message sent along from the syslog sender.

| `version`

| The version of `hotdog` which is processing the message.

| `iso8601`

| The ISO-8601 timestamp of when the message was processed.

|===

[[actions]]

==== Actions

Actions determine what `hotdog` should do with the given log line when it

receives it.

[[action-forward]]

===== Forward

The forward action implies the <<action-stop,stop action>> when used, since

the internally tracked `output` buffer is flushed when it is sent to Kafka.

[[action-merge]]

===== Merge

The `merge` action will only work when the log line is a JSON **object**. JSON

arrays, or other arbitrary strings will not merge properly, and cause **all**

subsequent actions for the given rule to be aborted.

.Parameters

|===

| Key | Value

| `json`

| A YAML map which will be merged with the JSON object deserialized from the matched log line.

|===

.hotdog.yml

[source,yaml]

----

    actions:

      - type: merge

        json:

          meta:

            hotdog:

              version: '{{version}}'

              timestamp: '{{iso8601}}'

----
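
A sketch of what merging could look like, assuming a recursive (deep) merge in which the configured map wins on conflicts. Whether `hotdog` merges deeply or shallowly is not stated in this document, so treat that as an assumption, along with the function names.

```python
import json

def deep_merge(base, overlay):
    """Recursively merge overlay into a copy of base; overlay wins on conflicts."""
    merged = dict(base)
    for key, value in overlay.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged

def merge_action(line, overlay):
    """Apply the overlay to a JSON-object log line, or return None to abort."""
    try:
        parsed = json.loads(line)
    except json.JSONDecodeError:
        return None
    if not isinstance(parsed, dict):
        return None  # arrays and scalars cannot be merged
    return json.dumps(deep_merge(parsed, overlay), sort_keys=True)
```

Returning `None` for arrays and non-JSON input mirrors the documented behavior of aborting the remaining actions for the rule.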

[[action-replace]]

===== Replace

The `template` may utilize the <<variables,variables>> in

order to generate a modified message. The output is only available to

subsequent actions defined _after_ the `replace` action. Subsequent rules in

the chain **will not** utilize this generated message.

.Parameters

|===

| Key | Value

| `template`

| A link:https://handlebarsjs.com/[Handlebars]-style template which can be used to output a modified message.

|===

.hotdog.yml

[source,yaml]

----

  - regex: '^hello\s+(?P<name>\w+)?'

    actions:

      - type: replace

        template: |

          Why hello there {{name}}!

----
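
The substitution performed by a template like the one above can be approximated with a few lines of Python. This sketch handles only simple double-brace placeholders, renders unknown names as empty strings, and is not a Handlebars implementation. The `render` helper is hypothetical.

```python
import re

def render(template, variables):
    """Substitute {{name}} placeholders; unknown names become empty strings."""
    return re.sub(
        r"\{\{\s*(\w+)\s*\}\}",
        lambda m: str(variables.get(m.group(1), "")),
        template,
    )

match = re.search(r"^hello\s+(?P<name>\w+)?", "hello world")
output = render("Why hello there {{name}}!", match.groupdict())
```

For the input line `hello world`, `output` is `Why hello there world!`.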

[[action-stop]]

===== Stop

The `stop` action does nothing more than stop processing on the message. It is

not particularly useful except in cases where `hotdog` should match on a

message and then effectively discard it.

[[metrics]]

== Metrics

`hotdog` is designed to emit statsd metrics to the statsd endpoint configured
in the <<yml-metrics,metrics>> section. Each metric will be prefixed under `hotdog.*`.

|===

| Key | Description

| `hotdog.connections`

| Gauge tracking the number of connections

| `hotdog.lines`

| Counter tracking the number of lines received by `hotdog`

| `hotdog.kafka.submitted`

| Counter tracking the number of messages submitted to Kafka

| `hotdog.kafka.submitted.<topic>`

| Counter tracking the number of messages submitted to each Kafka topic

| `hotdog.kafka.producer.sent`

| Timer which tracks the amount of time it takes to actually write messages to Kafka

| `hotdog.kafka.producer.error.*`

| Counters which count the number of different errors encountered while sending messages to Kafka. The possible metric names depend on the link:https://docs.rs/rdkafka/0.23.1/rdkafka/error/enum.RDKafkaError.html[RDKafkaError] enumeration from the underlying library.

| `hotdog.error.log_parse`

| Number of log lines received which could not be parsed as link:https://tools.ietf.org/html/rfc5424[RFC 5424] syslog lines.

| `hotdog.error.full_internal_queue`

| Count tracking the number of log lines which were *dropped* due to a full internal queue. Typically indicates an issue between `hotdog` and the Kafka brokers.

| `hotdog.error.internal_push_failed`

| Number of lines dropped because they could not be sent into the internal queue.

| `hotdog.error.topic_parse_failed`

| Number of lines dropped because the configured dynamic topic could not be parsed properly (typically indicates a configuration error).

| `hotdog.error.merge_of_invalid_json`

| Count of lines which could not have a merge action applied as configured due to a configuration error

| `hotdog.error.merge_target_not_json`

| Count of lines received for a merge action which were not JSON, and therefore could not be merged.

|===
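
For anyone writing a test listener or dashboard, the metric kinds in the table map onto the standard statsd line format `<name>:<value>|<type>`, with `c` for counters, `g` for gauges, and `ms` for timers. The sketch below formats such lines and can send one over UDP. The values are made up, and the helper names are hypothetical.

```python
import socket

def statsd_line(name, value, kind):
    """Format one statsd datagram: <name>:<value>|<type>."""
    return f"{name}:{value}|{kind}"

def send_metric(line, host="localhost", port=8125):
    """Send a single metric over UDP; statsd is fire-and-forget."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(line.encode("ascii"), (host, port))

counter = statsd_line("hotdog.lines", 1, "c")                # counter increment
gauge = statsd_line("hotdog.connections", 3, "g")            # gauge value
timer = statsd_line("hotdog.kafka.producer.sent", 12, "ms")  # timer in milliseconds
```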

[[development]]

== Development

Hotdog is tested against the latest Rust stable. A simple `cargo build` should

compile a working `hotdog` binary for your platform.

On Linux systems it is easy to test with:

[source,bash]

----

logger --server 127.0.0.1  -T -P 1514 "hello world"

logger --server 127.0.0.1  -T -P 1514 -f example.log

----
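
Where `logger` is not available, a short script can serve the same purpose. This sketch builds an RFC 5424 line and sends it over plaintext TCP, assuming newline-terminated framing and the listen address from the examples above. The default hostname and application name are placeholders, and the function names are hypothetical.

```python
import socket
from datetime import datetime, timezone

def format_rfc5424(msg, hostname="coconut", appname="tyler", facility=1, severity=5):
    """Build an RFC 5424 line; PRI is facility * 8 + severity (user.notice = 13)."""
    pri = facility * 8 + severity
    timestamp = datetime.now(timezone.utc).isoformat()
    # Fields after APP-NAME: PROCID, MSGID, STRUCTURED-DATA ("-" means absent).
    return f"<{pri}>1 {timestamp} {hostname} {appname} - - - {msg}"

def send_lines(lines, host="127.0.0.1", port=1514):
    """Send newline-terminated syslog lines over a plaintext TCP connection."""
    with socket.create_connection((host, port), timeout=5) as sock:
        for line in lines:
            sock.sendall(line.encode("utf-8") + b"\n")
```

With `hotdog` listening locally, `send_lines([format_rfc5424("hello world")])` delivers one message.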

For TLS connections, you can use the `openssl` `s_client` command:

[source,bash]

----

echo  '<13>1 2020-04-18T15:16:09.956153-07:00 coconut tyler - - [timeQuality tzKnown="1" isSynced="1" syncAccuracy="505061"] hello world' | openssl s_client -connect localhost:6514

----

=== Profiling

Profiling `hotdog` is best done on a Linux host with the `perf` tool, e.g.

[source,bash]

----

RUST_LOG=info perf record --call-graph dwarf -- ./target/debug/hotdog -c ./hotdog.yml

perf report -ng --no-inline

----

By default this may run with a single thread. To increase the parallelism of
`hotdog` while profiling, be sure to use the `SMOL_THREADS` environment
variable.

The link:https://github.com/KDAB/hotspot[hotspot] profiler visualizer tool works
well with the generated reports.

== Similar Projects

`hotdog` was originally motivated by challenges with

link:https://github.com/rsyslog/rsyslog[rsyslog], a desire for a simple

configuration, and the need for built-in metrics.

Some other similar projects which can be used to get logs into Kafka:

* link:https://github.com/elastic/logstash[logstash]

* link:https://github.com/syslog-ng/syslog-ng[syslog-ng]

* link:https://github.com/timberio/vector[vector]

* link:https://github.com/uswitch/syslogger[syslogger], which doesn't process

  messages itself, but rather integrates with `rsyslog`.