{"id":18339784,"url":"https://github.com/relex/slog-agent","last_synced_at":"2025-04-06T05:32:17.938Z","repository":{"id":43233421,"uuid":"358157283","full_name":"relex/slog-agent","owner":"relex","description":"High performance log agent/processor to be used with fluentd","archived":false,"fork":false,"pushed_at":"2024-07-01T06:41:21.000Z","size":579,"stargazers_count":8,"open_issues_count":12,"forks_count":1,"subscribers_count":4,"default_branch":"master","last_synced_at":"2025-03-21T18:09:11.201Z","etag":null,"topics":["fluentd","log-parser","logging"],"latest_commit_sha":null,"homepage":"","language":"Go","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/relex.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2021-04-15T06:50:57.000Z","updated_at":"2024-07-01T06:41:22.000Z","dependencies_parsed_at":"2023-01-28T13:32:42.871Z","dependency_job_id":"7502b1e5-c2de-4e19-a4ec-d3e6c4c0c09a","html_url":"https://github.com/relex/slog-agent","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/relex%2Fslog-agent","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/relex%2Fslog-agent/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/relex%2Fslog-agent/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/relex%2Fslog-agent/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/relex","download_url":"https://codeload.githu
b.com/relex/slog-agent/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247440591,"owners_count":20939221,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["fluentd","log-parser","logging"],"created_at":"2024-11-05T20:19:22.449Z","updated_at":"2025-04-06T05:32:16.689Z","avatar_url":"https://github.com/relex.png","language":"Go","readme":"# slog-agent\n\nA log agent designed to process and filter massive amounts of logs in real-time and forward them upstream (fluentd)\n\n\n## What we built this for\n\nWe have hundreds of thousands of application logs *per second* that need to be processed or filtered as quickly as\npossible, *for each server*.\n\nAt the target rate of one million logs per second, every step can become a bottleneck, and conventional log processors\nare not designed to handle that sort of traffic. This agent is built to be extremely efficient, in both memory and CPU\nusage, and also to scale up to multiple CPU cores *efficiently*, at the cost of everything else.\n\nA possibly biased and unfair comparison of this agent vs a Lua transform in fluent-bit: roughly 0.5M log/s from network\ninput, processed and gzipped at a 1:20-50 ratio (2 cores), vs 50K log/s from file and uncompressed (one core) for the\nsame processing steps. 
We also tested [Vector](https://vector.dev/) with similar but worse results.\n\n\n## What you need to adopt this\n\nYou need a basic understanding of [Go](https://golang.org/), to be ready to write new transforms and dig into profiling\nreports.\n\nThings are slow on generic log processors for very good reasons - for example, simple matching by regular expression\ncan be 50 times slower than a [special glob pattern](https://github.com/gobwas/glob), and allocates tons of buffers\nin the memory heap, which then need more CPU time to be GC'ed. The boundary-crossing scripting interface is another\nbottleneck, with the marshalling and unmarshalling of each record potentially costing more than the script execution itself.\n\nWithout such generic and flexible transforms and parsers, everything needs to be done in manually written code,\nor blocks of code that can be assembled together - which is essentially what this log agent provides: a base and blocks\nof code for you to build high-performance log processors - but only if you need that kind of performance. The design is\npluggable and the program is largely configurable, but you're going to run into situations that can only be solved by\nwriting new code.\n\n\n## Features\n\n- Input: RFC 5424 Syslog protocol via TCP, with experimental multiline support\n- Transforms: field extraction and creation, drop, truncate, if/switch, email redaction\n- Buffering: hybrid disk+memory buffering - compressed and only persisted when necessary\n- Output: Fluentd Forward protocol, both compressed and uncompressed. Single output only.\n- Metrics: Prometheus metrics to count logs and log size by key fields (e.g. vhost + log level + filename)\n\nDynamic fields are not supported - all fields must be known in the configuration because they're packed in arrays that can\nbe accessed without hashmap lookups.\n\nA \"tags\" or similar concept doesn't exist here. 
Instead, there are \"if\" and \"switch-case\" constructs matching field values.\n\nSee the [sample configurations](testdata/config_sample.yml) for full features.\n\n#### Performance and Backpressure\n\nLogs are compressed and saved in chunk files if the output cannot clear the logs fast enough. The maximum number of\npending chunks for each pipeline (key field set) is limited and defined in [defs/params.go](defs/params.go).\n\nInput is paused if logs cannot be processed fast enough - since RFC 5424 doesn't support any pause mechanism,\nthis would likely cause internal errors on both the agent and the logging application, but would not affect other\napplications' logging if pipelines are properly set up / isolated (e.g. by app-name and vhost).\n\nFor a typical server CPU (e.g. Xeon, 2GHz), a single pipeline / core should be able to handle at least:\n\n- 300-500K log/s for small logs, around 100-200 bytes each including syslog headers\n- 200K log/s or 400MB/s for larger logs\n\nNote that on servers with more than a few dozen CPU cores, an optimal `GOMAXPROCS` has to be measured and set for the\nproduction workload, until https://github.com/golang/go/issues/28808 is resolved.\n\n## Build\n\nRequires [gotils](https://github.com/relex/gotils), which provides the build tools:\n\n```bash\nmake\nmake test\n```\n\n## Operation manual\n\n#### Configuration\n\nSee the [sample configurations](testdata/config_sample.yml).\n\nExperimental configuration reloading is supported by starting with `--allow_reload` and sending `SIGHUP`; see\n[testdata/config_sample.yml](testdata/config_sample.yml) for details on which sections may be reconfigured. In general, everything after inputs\nis re-configurable. 
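\n\nFor example, a running agent could be reloaded from a shell like this (a minimal sketch; the binary location, config path, and `pidof` lookup are illustrative assumptions, not from this README):\n\n```bash\n# start with reloading enabled (--allow_reload is the flag described above)\nslog-agent --config testdata/config_sample.yml --allow_reload &\n\n# after editing the configuration, ask the running agent to reload it:\nkill -HUP \"$(pidof slog-agent)\"\n```\n\n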
If reconfiguration fails, errors are logged and the agent continues to run with the old\nconfiguration, without any side effects.\n\nNote that after successful reloading, some previous logs may be sent upstream again if they hadn't been acknowledged\nin time.\n\nThe metric family `slogagent_reloads_total` counts successes and failures of reconfigurations.\n\nCurrently, it is not possible to recover previously queued logs if `orchestration/keys` have been changed.\n\n#### Runtime diagnosis\n\n- `SIGHUP` aborts and recreates all pipelines with the new config loaded from the same file. Incoming connections are unaffected.\n- `SIGUSR1` recreates all outgoing connections or sessions gracefully.\n- http://localhost:METRICS_PORT/ provides Go's builtin debug functions in addition to metrics, such as stack dumps and profiling.\n\n## Development\n\n#### Mark inlinable code\n\nAdd an `xx:inline` comment on the same line as the function declaration:\n\n```go\nfunc (s *xLogSchema) GetFieldName(index int) string { // xx:inline\n```\n\nIf the function is too complex to be inlined, the build fails with a warning.\n\n#### Re-generate templated source (.tpl.go)\n\n```bash\nmake gen\n```\n\n#### Re-generate expected output in integration tests\n\n```bash\nmake test-gen\n```\n\n## Runtime Diagnosis\n\nThe Prometheus listener address (default *:9335*) exposes Go's `debug/pprof` in addition to metrics, which can dump\ngoroutine stacks.\n\nOptions:\n\n- `--cpuprofile FILE_PATH`: enable Go CPU profiling, with some overhead\n- `--memprofile FILE_PATH`: enable Go memory profiling\n- `--trace FILE_PATH`: enable Go tracing\n\n## Benchmark \u0026 Profiling\n\nExample:\n\n```bash\nLOG_LEVEL=warn time BUILD/slog-agent benchmark agent --input 'testdata/development/*.log' --repeat 250000 --config testdata/config_sample.yml --output null --cpuprofile /tmp/agent.cpu --memprofile /tmp/agent.mem\ngo tool pprof -http=:8080 BUILD/slog-agent /tmp/agent.cpu\n```\n\n`--output` supports several formats:\n\n- `` (empty): 
defaults to forwarding to fluentd as defined in the config.\n  Chunks may not be fully sent at shutdown; unsent chunks are saved for the next run.\n- `null`: no output. Results are compressed as in the normal routine, counted and then dropped.\n- `.../%s`: create one fluentd forward message file per chunk at the path (`%s` as the chunk ID). The dir must exist first.\n- `.../%s.json`: create one JSON file per chunk at the path (`%s` as the chunk ID). The dir must exist first.\n\nfluentd forward message files can be examined with [fluentlibtool](https://github.com/relex/fluentlib)\n\n## Internals\n\nSee [DESIGN](DESIGN.md)\n\n#### Key dependencies\n\n- [fluentlib](https://github.com/relex/fluentlib) for the fluentd forward protocol, plus a fake server and dump tool for testing.\n- [klauspost's compress library](https://github.com/klauspost/compress) for fast gzip compression, which is absolutely critical\n  to the agent: always benchmark before upgrading. *Compression takes 1/2 to 1/3 of CPU time in our environments.*\n- [YAML v3](https://gopkg.in/yaml.v3) required for custom tags in configuration. `KnownFields` is still not working, so it\n  cannot check for non-existent or misspelled YAML properties.\n\n## Authors\n\nSpecial thanks to _Henrik Sjöström_ for his guidance on Go optimization and integration testing, and for invaluable suggestions\non performant design.\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frelex%2Fslog-agent","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Frelex%2Fslog-agent","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frelex%2Fslog-agent/lists"}