Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/childe/gohangout
使用 golang 模仿的 Logstash。用于消费 Kafka 数据,处理后写入 ES、Clickhouse 等。
https://github.com/childe/gohangout
elasticsearch golang kafka logstash
Last synced: 28 days ago
JSON representation
使用 golang 模仿的 Logstash。用于消费 Kafka 数据,处理后写入 ES、Clickhouse 等。
- Host: GitHub
- URL: https://github.com/childe/gohangout
- Owner: childe
- License: mit
- Created: 2018-01-25T07:04:02.000Z (almost 7 years ago)
- Default Branch: master
- Last Pushed: 2024-09-26T06:45:09.000Z (about 2 months ago)
- Last Synced: 2024-09-30T23:40:56.106Z (about 1 month ago)
- Topics: elasticsearch, golang, kafka, logstash
- Language: Go
- Homepage:
- Size: 853 KB
- Stars: 1,016
- Watchers: 32
- Forks: 238
- Open Issues: 5
-
Metadata Files:
- Readme: README-EN.md
- License: LICENSE
Awesome Lists containing this project
- my-awesome - childe/gohangout - 10 star:1.0k fork:0.2k 使用 golang 模仿的 Logstash。用于消费 Kafka 数据,处理后写入 ES、Clickhouse 等。 (Go)
README
Gohangout is an application to do data transport. It consumes data from `input plugin` such as kafka or tcp/udp , and do data transforms using `filter plugin`, and then emit data to `output plugin`, such as Elasticsearch or Clickhouse.
## Install
We could build it from source code , or download binary app.
### build from source code
just clone code and run make
```
make
```It is recommended to compile it with CGO disabled if you want to run it in docker.
```
GOOS=linux GOARCH=amd64 CGO_ENABLED=0 make
```### download binary
[https://github.com/childe/gohangout/releases](https://github.com/childe/gohangout/releases)
### go get
```
go get github.com/childe/gohangout
```### Third party plugin
- Exmaples for developing 3th party plugin [gohangout-plugin-examples](https://github.com/childe/gohangout-plugin-examples)
- [Kafka Input using Saramp](https://github.com/DukeAnn/gohangout-input-kafka_sarama)
- [Kafka Input using kafka-go](https://github.com/huangjacky/gohangout-input-kafkago)
- [Redis Input](https://github.com/childe/gohangout-input-redis)
- [Split Filter](https://github.com/childe/gohangout-plugin-examples/tree/master/gohangout-filter-split) Split one message to multi
- [File Output](https://github.com/childe/gohangout-plugin-examples/tree/master/gohangout-file-output) file output## Run
```
gohangout --config config.yml
```### log
Gohangout use klog.
use `-v n` to set log level.
I usually set n to 5. You can set it to 10 or 20 to see more detailed log.
### multi threads
--worker 4 (default 1)
above args make gohangout use 4 goroutines to process data. Notice: one thread to consume from input, and then have 4 goroutines to do filter and output.
### reload
--reload
above args enables reload. Gohangout will relaod config file when it changes.
`kill -USER $pid` also triggers reload.
## Simple Config Example
```
inputs:
- Kafka:
topic:
weblog: 1 # One Kafka consumer thread
codec: json
consumer_settings:
bootstrap.servers: "10.0.0.100:9092"
group.id: gohangout.weblog
filters:
- Grok:
src: message
match:
- '^(?P\S+) (?P\w+) (?P\d+)$'
- '^(?P\S+) (?P\d+) (?P\w+)$'
remove_fields: ['message']
- Date:
location: 'UTC'
src: logtime
formats:
- 'RFC3339'
remove_fields: ["logtime"]
outputs:
- Elasticsearch:
hosts:
- 'http://admin:[email protected]:9200'
index: 'web-%{appid}-%{+2006-01-02}'
index_type: "logs"
bulk_actions: 5000
bulk_size: 20
flush_interval: 60
```## Value Render Protocol
some exmaples and explanation below
```
fields:
logtime: '%{date} {%time}'
type: 'weblog'
hostname: '[host]'
name: '{{.firstname}}.{{.lastname}}'
name2: '$.name'
city: '[geo][cityname]'
'[a][b]': '[stored][message]'
```### format 1 JSONPATH
Gohangout use JsonPath to render value if it begins witch `$.`
```
$.store.book[0].title$['store']['book'][0]['title']
$.store.book[(@.length-1)].title
$.store.book[?(@.price < 10)].title
```More usage and examples: [https://goessner.net/articles/JsonPath/](https://goessner.net/articles/JsonPath/)
### format2 [X][Y]
**Not recommended, please use format 1**
`city: '[geo][cityname]'` equals to `$.geo.cityname` . It must be strictly `[X][Y]`, in other words, there can not be any other words in front of `[X][Y]` or after it.
### format 3 {{XXX}}
Gohangout will render value using [Golang Template]((https://golang.org/pkg/text/template/)). It could contains other words before or after {{XXX}}, such as `name: 'my name is {{.firstname}}.{{.lastname}}'`
One example you may use: We get a time-type field with `Date` filter, and then render a string with customed format.
```
Add:
fields:
ts: '{{ .ts.Format "2006.01.02" }}'
```### format 4 %{XXX}
it itherits from [logstash](https://www.elastic.co/logstash)
for example, render index name in Elasticsearch output: `web-%{appid}-%{+2006-01-02}` .
## Input
All settings in below plugins could be checked in [Chinese doc](https://github.com/childe/gohangout/blob/master/README.md#input).
Setting and explanation in English doc will be added later.
- Stdin
- TCP
- Kafka## Output
- Stdout
- TCP
- Elasticsearch
- Kafka
- ClickHouse## Filter
### general settings
#### if
syntax example
```
Drop:
if:
- 'EQ(name,"childe")'
- 'Before(-24h) || After(24h)'
```Relationship between conditions is **AND**, if passes only if all conditions pass.
more complicated example using bool operator: `Exist(a) && (!Exist(b) || !Exist(c))`
All functions supported for now:
**NOtice**: value in EQ/IN functions must be quoted by " , because the value could be a number or a string. User must tell Gohangout whether it is a string or a number.
value in other functions could be quoted by " or not , gohangout will treat it as string.- `Exist(user,name)` if [user][name] exists
- `EQ(user,age,20)` `EQ($.user.age,20)` if [user][age] exists and equal to 20
- `EQ(user,age,"20")` `EQ($.user.age,"20")` if [user][age] exists and equal to "20" (string)
- `IN(tags,"app")` `IN($.tags,"app")` tags is a list or do not pass. if "app" contained in the list
- `HasPrefix(user,name,liu)` `HasPrefix($.user.name,"liu")`
- `HasSuffix(user,name,jia)` `HasSuffix($.user.name,"jia")`
- `Contains(user,name,jia)` `Contains($.user.name,"jia")`
- `Match(user,name,^liu.*a$)` `Match($.user.name,"^liu.*a$")`
- `Random(20)` return true with 5% probability
- `Before(24h)` *@timestamp* field exists and it must be a Time type
- `After(-24h)` *@timestamp* field exists and it must be a Time type#### add_fields
example:
```
Grok:
src: message
match:
- '^(?P\S+) (?P\w+) (?P\d+)$'
- '^(?P\S+) (?P\d+) (?P\w+)$'
remove_fields: ['message']
add_fields:
grok_result: 'ok'
```Fields could be added if the filter process the event successfully. And it is ignored if filter failed to process the event.
#### remove_fields
```
Grok:
src: message
match:
- '^(?P\S+) (?P\w+) (?P\d+)$'
- '^(?P\S+) (?P\d+) (?P\w+)$'
remove_fields: ['message']
add_fields:
grok_result: 'ok'
```remove some fields if the filter process the event successfully. And it is ignored if filter failed to process the event.
### Filter Plugins
- Add
- Convert
- Date # it convert one string-type field to Time-type field
- Drop
- Filters
- Grok
- IPIP
- KV
- Lowercase
- Remove
- Rename
- Split
- Translate
- Uppercase
- Replace
- URLDecode