# klogs

**klogs** can be used to write the logs collected by [Fluent Bit](https://fluentbit.io) to [ClickHouse](https://clickhouse.tech). You can use klogs with or without Kafka:

- **[Fluent Bit Plugin](cmd/plugin):** The klogs Fluent Bit plugin writes the logs collected by Fluent Bit directly into ClickHouse (see the configuration sketch after this list).
- **[ClickHouse Ingester](cmd/ingester):** With the klogs ClickHouse ingester, Fluent Bit writes the collected logs into Kafka, and the ingester consumes them from Kafka and writes them into ClickHouse.
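
For the plugin mode, the Fluent Bit output section might look roughly like the sketch below. The plugin name and option keys shown here are assumptions for illustration only; the actual options are documented in [cmd/plugin](cmd/plugin).

```
[OUTPUT]
    # Illustrative sketch: the plugin name and all keys below are
    # assumptions, see cmd/plugin for the real configuration options.
    Name     clickhouse
    Match    *
    Address  clickhouse.logging.svc.cluster.local:9000
    Database logs
    Username admin
    Password admin
```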

You can use [kobs](https://kobs.io) as an interface to view the logs stored in ClickHouse. More information about the klogs plugin for kobs can be found in the kobs [klogs](https://kobs.io/plugins/klogs/) documentation.

![kobs](assets/kobs.png)

## Configuration

The configuration for the **Fluent Bit Plugin** and **ClickHouse Ingester** can be found in the corresponding directories in the `cmd` folder.

The following SQL schema must exist on every ClickHouse node; with the `ON CLUSTER` clause, the statements only need to be submitted once and are distributed to all nodes:

```sql
CREATE DATABASE IF NOT EXISTS logs ON CLUSTER '{cluster}' ENGINE = Atomic;

CREATE TABLE IF NOT EXISTS logs.logs_local ON CLUSTER '{cluster}'
(
    `timestamp` DateTime64(3) CODEC(Delta, LZ4),
    `cluster` LowCardinality(String),
    `namespace` LowCardinality(String),
    `app` LowCardinality(String),
    `pod_name` LowCardinality(String),
    `container_name` LowCardinality(String),
    `host` LowCardinality(String),
    `fields_string` Map(LowCardinality(String), String),
    `fields_number` Map(LowCardinality(String), Float64),
    `log` String CODEC(ZSTD(1))
)
ENGINE = ReplicatedMergeTree
PARTITION BY toDate(timestamp)
ORDER BY (cluster, namespace, app, pod_name, container_name, host, timestamp)
TTL toDateTime(timestamp) + INTERVAL 30 DAY;

CREATE TABLE IF NOT EXISTS logs.logs ON CLUSTER '{cluster}' AS logs.logs_local
    ENGINE = Distributed('{cluster}', logs, logs_local, rand());
```
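
Once the tables exist, logs can be queried through the distributed `logs.logs` table. A sketch of such a query, with placeholder filter values; note that `Map` columns are accessed with the `[]` operator:

```sql
-- fetch the most recent error logs for one namespace
SELECT timestamp, pod_name, log
FROM logs.logs
WHERE namespace = 'default'
  AND timestamp >= now() - INTERVAL 1 HOUR
  AND fields_string['content.level'] = 'error'
ORDER BY timestamp DESC
LIMIT 100;
```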

To speed up queries for the most frequently queried fields, we can create dedicated columns for specific fields:

```sql
ALTER TABLE logs.logs_local ON CLUSTER '{cluster}' ADD COLUMN content_level String DEFAULT fields_string['content.level'];
ALTER TABLE logs.logs ON CLUSTER '{cluster}' ADD COLUMN content_level String DEFAULT fields_string['content.level'];

ALTER TABLE logs.logs_local ON CLUSTER '{cluster}' ADD COLUMN content_response_code Float64 DEFAULT fields_number['content.response_code'];
ALTER TABLE logs.logs ON CLUSTER '{cluster}' ADD COLUMN content_response_code Float64 DEFAULT fields_number['content.response_code'];
```
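
With the dedicated columns in place, frequent filters read a plain column instead of doing a `Map` lookup per row; the query sketch from above becomes:

```sql
SELECT timestamp, pod_name, log
FROM logs.logs
WHERE namespace = 'default'
  AND timestamp >= now() - INTERVAL 1 HOUR
  AND content_level = 'error'
ORDER BY timestamp DESC
LIMIT 100;
```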

Note, however, that these columns are populated only for newly inserted data and for parts rewritten during merges. To materialize them for existing data:

- On ClickHouse 21.10 or newer, you can use `ALTER TABLE ... MATERIALIZE COLUMN`:

```sql
ALTER TABLE logs.logs_local ON CLUSTER '{cluster}' MATERIALIZE COLUMN content_level;
ALTER TABLE logs.logs_local ON CLUSTER '{cluster}' MATERIALIZE COLUMN content_response_code;
```

- On older ClickHouse versions, use `ALTER TABLE ... UPDATE` instead:

```sql
ALTER TABLE logs.logs_local ON CLUSTER '{cluster}' UPDATE content_level = content_level WHERE 1;
ALTER TABLE logs.logs_local ON CLUSTER '{cluster}' UPDATE content_response_code = content_response_code WHERE 1;
```