https://github.com/teragrep/rsm_01
Teragrep record schema mapper library for Java
https://github.com/teragrep/rsm_01
data data-mining data-science datascience java-library liblognorm log-analysis log-management schema-mapper structured-data structured-logging teragrep unstructured-data
Last synced: 28 days ago
JSON representation
Teragrep record schema mapper library for Java
- Host: GitHub
- URL: https://github.com/teragrep/rsm_01
- Owner: teragrep
- License: agpl-3.0
- Created: 2024-11-12T09:52:27.000Z (6 months ago)
- Default Branch: main
- Last Pushed: 2025-04-08T07:58:25.000Z (about 1 month ago)
- Last Synced: 2025-04-08T08:45:38.654Z (about 1 month ago)
- Topics: data, data-mining, data-science, datascience, java-library, liblognorm, log-analysis, log-management, schema-mapper, structured-data, structured-logging, teragrep, unstructured-data
- Language: Java
- Homepage: https://teragrep.com
- Size: 44.9 KB
- Stars: 0
- Watchers: 2
- Forks: 3
- Open Issues: 5
-
Metadata Files:
- Readme: README.adoc
- License: LICENSE
Awesome Lists containing this project
README
= Teragrep Record Schema Mapper Library for Java
Rsm_01 is a library that provides record schema mapping (log normalization) functionality to Java using Liblognorm library that is written in C-code. All the functionalities of the Liblognorm library are provided to Java through the use of Java Native Access (JNA) library.
== Features
- Provides full access to log normalization functionalities of Liblognorm C-library in Java, which in short allows:
. Matching a message line to a rule from provided rulebase configuration.
. Picking out variable fields from the line according to the configuration.
. Returning the picked variable fields as a JSON string object.
- Capability to handle errors occurring in the C-library through Java exceptions.== How to compile
https://maven.apache.org/install.html[Maven] and liblognorm-devel must be installed to build the project.
Project building has been tested on Fedora 41.
[,bash]
----
dnf install liblognorm-devel
----[,bash]
----
mvn clean package
----== How to use
=== Configuration
To use rsm_01 for log normalization, two things are required:
. Log messages
. A rulebase, which configures how the messages are normalized by liblognormThe rulebase consists of lines of text where each line reflects the structure of your log messages. Liblognorm will generate a parse-tree from the rulebase and then use it to parse the log messages during normalization. The provided rulebase must follow the version two configuration guidelines described in the https://www.liblognorm.com/files/manual/configuration.html[liblognorm documentations].
A simple rulebase file with two different rules is structured as follows:
[,txt]
----
version=2
rule=:%all:rest%
rule=tag:Amount: %N:number%
----=== Normalization
Normalization is done by first initializing `LognormFactory` object with the rulebase given as an input argument for the constructor. The `lognorm()` method of the factory object is then used to provide a configured version of `JavaLognormImpl` object that is capable of normalizing messages through the `normalize()` method.
The rulebase can be provided to the `LognormFactory` constructor either as a `String` or as a `File` object. Rulebases in `String` format do not require the `version=2` indicator mentioned in liblognorm documentations.
Because of C memory management requirements, the `JavaLognormImpl` object must be closed when it is no longer needed. The typical use-case for normalization using try-with-resources is as follows:
[,java]
----
LognormFactory lognormFactory = new LognormFactory("rule=:%all:rest%");
try (JavaLognormImpl javaLognormImpl = lognormFactory.lognorm()) {
String normalizedMessage = javaLognormImpl.normalize("message to normalize");
}
----If normalization was a success, `normalizedMessage` will hold the normalization result in JSON string format.
=== Additional configuration options
Liblognorm API has additional configuration options available for normalization:
. `LN_CTXOPT_ADD_ORIGINALMSG` — always adds original message to the normalization result.
. `LN_CTXOPT_ADD_RULE` — adds mockup rule to the normalization result.
. `LN_CTXOPT_ADD_RULE_LOCATION` — adds rule location (file, lineno) to the normalization result.The configuration for the additional options is done by providing `OptionsStruct` object as an additional input argument for `LognormFactory` constructor.
[,java]
----
LibJavaLognorm.OptionsStruct opts = new LibJavaLognorm.OptionsStruct();
opts.CTXOPT_ADD_ORIGINALMSG = true;
LognormFactory lognormFactory = new LognormFactory(opts, "rule=:%all:rest%");
try (JavaLognormImpl javaLognormImpl = lognormFactory.lognorm()) {
String normalizedMessage = javaLognormImpl.normalize("message to normalize");
}
----== Contributing
You can involve yourself with our project by https://github.com/teragrep/rsm_01/issues/new/choose[opening an issue] or submitting a pull request.
Contribution requirements:
. *All changes must be accompanied by a new or changed test.* If you think testing is not required in your pull request, include a sufficient explanation as why you think so.
. Security checks must pass
. Pull requests must align with the principles and http://www.extremeprogramming.org/values.html[values] of extreme programming.
. Pull requests must follow the principles of Object Thinking and Elegant Objects (EO).Read more in our https://github.com/teragrep/teragrep/blob/main/contributing.adoc[Contributing Guideline].
=== Contributor License Agreement
Contributors must sign https://github.com/teragrep/teragrep/blob/main/cla.adoc[Teragrep Contributor License Agreement] before a pull request is accepted to organization's repositories.
You need to submit the CLA only once. After submitting the CLA you can contribute to all Teragrep's repositories.