https://github.com/kobsio/klogs
Fast, scalable and reliable logging using Fluent Bit and ClickHouse
- Host: GitHub
- URL: https://github.com/kobsio/klogs
- Owner: kobsio
- License: mit
- Created: 2021-08-27T18:26:24.000Z (over 3 years ago)
- Default Branch: main
- Last Pushed: 2024-11-04T15:57:43.000Z (6 months ago)
- Last Synced: 2024-11-04T16:36:46.215Z (6 months ago)
- Topics: clickhouse, fluent-bit, kobs, kobsio, kubernetes, logging
- Language: Go
- Homepage: https://kobs.io/main/plugins/klogs/
- Size: 1.33 MB
- Stars: 60
- Watchers: 3
- Forks: 8
- Open Issues: 7
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
- awesome-clickhouse - kobsio/klogs - klogs is a fast and reliable tool for logging that integrates Fluent Bit with ClickHouse. (Integrations / Data Transfer and Synchronization)
README
# klogs
**klogs** can be used to write the logs collected by [Fluent Bit](https://fluentbit.io) to [ClickHouse](https://clickhouse.tech). You can use klogs with or without Kafka:
- **[Fluent Bit Plugin](cmd/plugin):** The klogs Fluent Bit plugin allows you to directly write the collected logs from Fluent Bit into ClickHouse.
- **[ClickHouse Ingester](cmd/ingester):** The klogs ClickHouse ingester allows you to write your logs from Fluent Bit into Kafka, so that the ingester can write them from Kafka into ClickHouse.

You can use [kobs](https://kobs.io) as an interface to view the logs from ClickHouse. More information about the klogs plugin for kobs can be found in the [klogs](https://kobs.io/plugins/klogs/) documentation of kobs.
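For the plugin mode, Fluent Bit loads klogs as an output plugin and forwards records to ClickHouse. The sketch below only illustrates the general shape of such an output section; the plugin name, parameter names, and address are hypothetical placeholders, and the authoritative configuration reference is in [cmd/plugin](cmd/plugin):

```ini
# Hypothetical sketch – plugin name, parameters, and address are
# placeholders; see cmd/plugin for the real configuration options.
[OUTPUT]
    Name     clickhouse
    Match    kube.*
    Address  clickhouse.clickhouse.svc.cluster.local:9000
    Database logs
```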

## Configuration
The configuration for the **Fluent Bit Plugin** and **ClickHouse Ingester** can be found in the corresponding directories in the `cmd` folder.
The SQL schema for ClickHouse must be created on each ClickHouse node and looks as follows:
```sql
CREATE DATABASE IF NOT EXISTS logs ON CLUSTER `{cluster}` ENGINE = Atomic;

CREATE TABLE IF NOT EXISTS logs.logs_local ON CLUSTER `{cluster}`
(
`timestamp` DateTime64(3) CODEC(Delta, LZ4),
`cluster` LowCardinality(String),
`namespace` LowCardinality(String),
`app` LowCardinality(String),
`pod_name` LowCardinality(String),
`container_name` LowCardinality(String),
`host` LowCardinality(String),
`fields_string` Map(LowCardinality(String), String),
`fields_number` Map(LowCardinality(String), Float64),
`log` String CODEC(ZSTD(1))
)
ENGINE = ReplicatedMergeTree
PARTITION BY toDate(timestamp)
ORDER BY (cluster, namespace, app, pod_name, container_name, host, timestamp)
TTL toDateTime(timestamp) + INTERVAL 30 DAY;

CREATE TABLE IF NOT EXISTS logs.logs ON CLUSTER '{cluster}' AS logs.logs_local
ENGINE = Distributed('{cluster}', logs, logs_local, rand());
```

To speed up queries for the most frequently queried fields, you can create dedicated columns for specific fields:
```sql
ALTER TABLE logs.logs_local ON CLUSTER '{cluster}' ADD COLUMN content_level String DEFAULT fields_string['content.level'];
ALTER TABLE logs.logs ON CLUSTER '{cluster}' ADD COLUMN content_level String DEFAULT fields_string['content.level'];

ALTER TABLE logs.logs_local ON CLUSTER '{cluster}' ADD COLUMN content_response_code Float64 DEFAULT fields_number['content.response_code'];
ALTER TABLE logs.logs ON CLUSTER '{cluster}' ADD COLUMN content_response_code Float64 DEFAULT fields_number['content.response_code'];
```

These columns will only be materialized for new data and after merges. To materialize them for existing data:
- You can use `ALTER TABLE ... MATERIALIZE COLUMN` on ClickHouse version 21.10 or later:
```sql
ALTER TABLE logs.logs_local ON CLUSTER '{cluster}' MATERIALIZE COLUMN content_level;
ALTER TABLE logs.logs_local ON CLUSTER '{cluster}' MATERIALIZE COLUMN content_response_code;
```

- Or, for older ClickHouse versions, use `ALTER TABLE ... UPDATE`:
```sql
ALTER TABLE logs.logs_local ON CLUSTER '{cluster}' UPDATE content_level = content_level WHERE 1;
ALTER TABLE logs.logs_local ON CLUSTER '{cluster}' UPDATE content_response_code = content_response_code WHERE 1;
```
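Once logs are flowing, the dedicated columns pay off in queries that filter on frequently used fields. A typical query against the schema above might look like the following sketch; the namespace and level values are placeholders:

```sql
-- Count error-level logs per app over the last hour, using the
-- materialized content_level column created above.
-- 'default' and 'error' are placeholder filter values.
SELECT app, count(*) AS errors
FROM logs.logs
WHERE timestamp > now() - INTERVAL 1 HOUR
  AND namespace = 'default'
  AND content_level = 'error'
GROUP BY app
ORDER BY errors DESC;
```

Because `content_level` is a plain column rather than a `Map` lookup, ClickHouse can skip the map access entirely on such queries.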