https://github.com/y-scope/log-surgeon
A performant log parsing library
- Host: GitHub
- URL: https://github.com/y-scope/log-surgeon
- Owner: y-scope
- License: apache-2.0
- Created: 2023-02-28T15:47:26.000Z (over 3 years ago)
- Default Branch: main
- Last Pushed: 2026-01-12T07:08:30.000Z (5 months ago)
- Last Synced: 2026-01-12T17:14:49.453Z (5 months ago)
- Language: C++
- Homepage:
- Size: 618 KB
- Stars: 9
- Watchers: 4
- Forks: 10
- Open Issues: 47
Metadata Files:
- Readme: README.md
- License: LICENSE
- Codeowners: .github/CODEOWNERS
# log-surgeon: A performant log parsing library
`log-surgeon` is a library for high-performance parsing of unstructured text
logs. It allows users to parse and extract information from the vast amount of
unstructured logs generated by today's open-source software.
Some of the library's features include:
* Parsing and extracting variable values like the log event's log-level and any
  other user-specified variables, no matter where they appear in each log event.
* Parsing by using regular expressions for each variable type rather than
  regular expressions for an entire log event.
* Improved latency and memory efficiency compared to popular regex engines.
* Parsing multi-line log events (delimited by timestamps).
Note that `log-surgeon` is *not* a generic regex engine and does impose [some
constraints](docs/parsing-constraints.md) on how log events can be parsed.
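To illustrate the idea of matching one regular expression per variable type, rather than one
expression for the whole log event, here is a minimal sketch using `std::regex` from the C++
standard library. It is not how `log-surgeon` is implemented and doesn't share its performance
characteristics; `VariableRule`, `extract_variables`, and any rule names passed to it are
hypothetical names chosen for illustration.

```cpp
#include <map>
#include <regex>
#include <string>
#include <vector>

// Hypothetical rule type for illustration: a variable name plus the regex for its values.
struct VariableRule {
    std::string name;
    std::regex pattern;
};

// Scans `line` once per rule and collects every match, wherever it appears in the line.
// NOTE: This is a std::regex-based sketch of the concept, not log-surgeon's API.
auto extract_variables(std::string const& line, std::vector<VariableRule> const& rules)
        -> std::map<std::string, std::vector<std::string>> {
    std::map<std::string, std::vector<std::string>> matches;
    for (auto const& rule : rules) {
        std::sregex_iterator it{line.begin(), line.end(), rule.pattern};
        std::sregex_iterator const end;
        for (; it != end; ++it) {
            matches[rule.name].push_back(it->str());
        }
    }
    return matches;
}
```

For instance, given a `loglevel` rule (`INFO|DEBUG|WARN|ERROR`) and a hypothetical `task_id` rule
(`task_\d+`), calling `extract_variables` on a line containing `DEBUG task_123` should collect
`DEBUG` under `loglevel` and `task_123` under `task_id`, regardless of their position in the line.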
## Motivating example
Let's say we want to parse and inspect multi-line log events like this:
```
2023-02-23T18:10:14-0500 DEBUG task_123 crashed. Dumping stacktrace:
#0 0x000000000040110e in bar () at example.cpp:6
#1 0x000000000040111d in bar () at example.cpp:10
#2 0x0000000000401129 in main () at example.cpp:15
```
Using the [example schema file](examples/schema.txt) which includes these rules:
```
timestamp:\d{4}\-\d{2}\-\d{2}T\d{2}:\d{2}:\d{2}\-\d{4}
...
loglevel:INFO|DEBUG|WARN|ERROR
```
We can parse and inspect the events as follows:
```cpp
// Define a reader to read from your data source
Reader reader{/* */};

// Instantiate the parser
ReaderParser parser{"examples/schema.txt"};
parser.reset_and_set_reader(reader);

// Get the loglevel variable's ID
optional loglevel_id{parser.get_variable_id("loglevel")};
// NOTE: Validate that loglevel_id has a value before dereferencing it below (omitted here).

while (false == parser.done()) {
    if (ErrorCode err{parser.parse_next_event()}; ErrorCode::Success != err) {
        throw runtime_error("Parsing Failed");
    }
    LogEventView const& event{parser.get_log_parser().get_log_event_view()};

    // Get and print the timestamp
    Token* timestamp{event.get_timestamp()};
    if (nullptr != timestamp) {
        cout << "timestamp: " << timestamp->to_string_view() << endl;
    }

    // Get and print the log-level
    auto const& loglevels{event.get_variables(*loglevel_id)};
    if (false == loglevels.empty()) {
        // In case there are multiple matches, just get the first one
        cout << "loglevel: " << loglevels[0]->to_string_view() << endl;
    }

    // Other analysis...

    // Print the entire event
    cout << event.to_string() << endl;
}
```
For advanced uses, `log-surgeon` also has a
[BufferParser](examples/buffer-parser.cpp) that reads directly from a buffer.
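To show why the `timestamp` rule matters for multi-line events like the stack trace above, here is
a rough sketch of timestamp-delimited grouping written with `std::regex`. It only illustrates the
concept and is not `log-surgeon`'s implementation; `split_into_events` is a hypothetical helper,
and the pattern is copied from the `timestamp` rule shown earlier.

```cpp
#include <regex>
#include <sstream>
#include <string>
#include <vector>

// Groups the lines of `text` into log events. A line that begins with a timestamp starts a new
// event; any other line is treated as a continuation of the current event.
// NOTE: A std::regex-based sketch of the concept, not log-surgeon's implementation.
auto split_into_events(std::string const& text) -> std::vector<std::string> {
    // Same pattern as the `timestamp` rule in the example schema.
    static std::regex const timestamp{R"(\d{4}\-\d{2}\-\d{2}T\d{2}:\d{2}:\d{2}\-\d{4})"};

    std::vector<std::string> events;
    std::istringstream stream{text};
    std::string line;
    while (std::getline(stream, line)) {
        // match_continuous only accepts a match that starts at the beginning of the line.
        bool const starts_with_timestamp{
                std::regex_search(line, timestamp, std::regex_constants::match_continuous)
        };
        if (starts_with_timestamp || events.empty()) {
            events.push_back(line);
        } else {
            events.back() += '\n' + line;
        }
    }
    return events;
}
```

With this grouping, the `#0`, `#1`, and `#2` stack-trace lines stay attached to the event that
begins with `2023-02-23T18:10:14-0500`, since none of them starts with a timestamp.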
## Building and installing
Requirements:
* CMake >= 3.22.1
* GCC >= 10 or Clang >= 7
* [Catch2] >= 3.8.1
* [fmt] >= 11.2.0
* [GSL] >= 4.0.0
* [Task] >= 3.38
* [uv] >= 0.7.10
* [ystdlib-cpp] >= 0.1.0
To build and install the project to `$HOME/.local`:
```shell
task log-surgeon:install-release INSTALL_PREFIX="$HOME/.local"
```
Or to only build the project:
```shell
task log-surgeon:build-release
```
To build the debug version:
```shell
task log-surgeon:build-debug
```
## Examples
[examples](examples) contains programs demonstrating usage of the library. See
[examples/README.md](examples/README.md) for information on building and running the examples.
## Documentation
* [docs](docs) contains more detailed documentation including:
  * The [schema specification](docs/schema.md), which describes the syntax for
    writing your own schema
  * `log-surgeon`'s [design objectives](docs/design-objectives.md)
### Documentation site
The project includes a documentation site that's useful for exploring functionality and test
coverage. In particular, it documents all unit tests, with additional detail for API-level tests.
To generate and view the files:
* Run `task docs:site`.
* Open `build/docs/html/index.html` in your preferred browser.
To host the site locally and view it:
* Run `task docs:serve`.
* Open the URL output by the task in your preferred browser.
## Testing
To build and run all unit tests:
```shell
task test:run-debug
```
When generating targets, the CMake variable `BUILD_TESTING` is respected (unless overridden by
setting `log_surgeon_BUILD_TESTING` to false). By default, if built as a top-level project,
`BUILD_TESTING` is set to true and unit tests are built.
## Linting
Before submitting a PR, ensure you've run our linting tools and either fixed any violations or
suppressed the corresponding warnings.
### Running the linters
To report all errors, run:
```shell
task lint:check
```
To automatically fix any supported format or linting errors, run:
```shell
task lint:fix
```
## Providing feedback
You can use GitHub issues to [report a bug][bug-report] or [request a feature][feature-req].
Join us on [Zulip](https://yscope-clp.zulipchat.com/) to chat with developers
and other community members.
## Known issues
The following are issues we're aware of and working on:
* Schema rules must use ASCII characters. UTF-8 support is planned for a future
  release.
* Timestamps must appear at the start of the message to be handled differently
  from other variable values and to support multi-line log events.
* A variable pattern can't match text around a variable without that text also
  becoming part of the variable.
  * Support for submatch extraction will be coming in a future release.
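
To make the variable-pattern limitation above concrete: in a general-purpose regex engine, a
capture group (submatch) lets a pattern match surrounding text while extracting only part of it.
The sketch below uses `std::regex` purely to illustrate that concept. It isn't `log-surgeon`'s
API, and `extract_task_id` and the `task_` prefix are hypothetical choices for illustration. Given
a line containing `task_123`, it returns only `123`, whereas a variable pattern without submatch
support would have to include `task_` in the variable's value.

```cpp
#include <optional>
#include <regex>
#include <string>

// Returns the digits following "task_" in `line`, if any. The literal "task_" is matched but
// excluded from the result because only the capture group (submatch 1) is returned.
// NOTE: Uses std::regex to illustrate submatch extraction; this isn't log-surgeon's API.
auto extract_task_id(std::string const& line) -> std::optional<std::string> {
    static std::regex const pattern{R"(task_(\d+))"};
    std::smatch match;
    if (false == std::regex_search(line, match, pattern)) {
        return std::nullopt;
    }
    return match[1].str();
}
```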
[bug-report]: https://github.com/y-scope/log-surgeon/issues/new?assignees=&labels=bug&template=bug-report.yaml
[Catch2]: https://github.com/catchorg/Catch2/tree/devel
[feature-req]: https://github.com/y-scope/log-surgeon/issues/new?assignees=&labels=enhancement&template=feature-request.yaml
[fmt]: https://github.com/fmtlib/fmt
[GSL]: https://github.com/microsoft/GSL
[Task]: https://taskfile.dev/
[uv]: https://docs.astral.sh/uv
[ystdlib-cpp]: https://github.com/y-scope/ystdlib-cpp