https://github.com/teragrep/rsm_01

Teragrep record schema mapper library for Java
https://github.com/teragrep/rsm_01

data data-mining data-science datascience java-library liblognorm log-analysis log-management schema-mapper structured-data structured-logging teragrep unstructured-data

Last synced: 28 days ago
JSON representation

Teragrep record schema mapper library for Java

Host: GitHub
URL: https://github.com/teragrep/rsm_01
Owner: teragrep
License: agpl-3.0
Created: 2024-11-12T09:52:27.000Z (6 months ago)
Default Branch: main
Last Pushed: 2025-04-08T07:58:25.000Z (about 1 month ago)
Last Synced: 2025-04-08T08:45:38.654Z (about 1 month ago)
Topics: data, data-mining, data-science, datascience, java-library, liblognorm, log-analysis, log-management, schema-mapper, structured-data, structured-logging, teragrep, unstructured-data
Language: Java
Homepage: https://teragrep.com
Size: 44.9 KB
Stars: 0
Watchers: 2
Forks: 3
Open Issues: 5
Metadata Files:
- Readme: README.adoc
- License: LICENSE

Awesome Lists containing this project

README

= Teragrep Record Schema Mapper Library for Java

Rsm_01 is a library that provides record schema mapping (log normalization) functionality to Java using Liblognorm library that is written in C-code. All the functionalities of the Liblognorm library are provided to Java through the use of Java Native Access (JNA) library.

== Features

- Provides full access to log normalization functionalities of Liblognorm C-library in Java, which in short allows:
. Matching a message line to a rule from provided rulebase configuration.
. Picking out variable fields from the line according to the configuration.
. Returning the picked variable fields as a JSON string object.
- Capability to handle errors occurring in the C-library through Java exceptions.

== How to compile

https://maven.apache.org/install.html[Maven] and liblognorm-devel must be installed to build the project.

Project building has been tested on Fedora 41.

[,bash]
----
dnf install liblognorm-devel
----

[,bash]
----
mvn clean package
----

== How to use

=== Configuration

To use rsm_01 for log normalization, two things are required:

. Log messages
. A rulebase, which configures how the messages are normalized by liblognorm

The rulebase consists of lines of text where each line reflects the structure of your log messages. Liblognorm will generate a parse-tree from the rulebase and then use it to parse the log messages during normalization. The provided rulebase must follow the version two configuration guidelines described in the https://www.liblognorm.com/files/manual/configuration.html[liblognorm documentations].

A simple rulebase file with two different rules is structured as follows:
[,txt]
----
version=2
rule=:%all:rest%
rule=tag:Amount: %N:number%
----

=== Normalization

Normalization is done by first initializing `LognormFactory` object with the rulebase given as an input argument for the constructor. The `lognorm()` method of the factory object is then used to provide a configured version of `JavaLognormImpl` object that is capable of normalizing messages through the `normalize()` method.

The rulebase can be provided to the `LognormFactory` constructor either as a `String` or as a `File` object. Rulebases in `String` format do not require the `version=2` indicator mentioned in liblognorm documentations.

Because of C memory management requirements, the `JavaLognormImpl` object must be closed when it is no longer needed. The typical use-case for normalization using try-with-resources is as follows:

[,java]
----
LognormFactory lognormFactory = new LognormFactory("rule=:%all:rest%");
try (JavaLognormImpl javaLognormImpl = lognormFactory.lognorm()) {
String normalizedMessage = javaLognormImpl.normalize("message to normalize");
}
----

If normalization was a success, `normalizedMessage` will hold the normalization result in JSON string format.

=== Additional configuration options

Liblognorm API has additional configuration options available for normalization:

. `LN_CTXOPT_ADD_ORIGINALMSG` — always adds original message to the normalization result.
. `LN_CTXOPT_ADD_RULE` — adds mockup rule to the normalization result.
. `LN_CTXOPT_ADD_RULE_LOCATION` — adds rule location (file, lineno) to the normalization result.

The configuration for the additional options is done by providing `OptionsStruct` object as an additional input argument for `LognormFactory` constructor.

[,java]
----
LibJavaLognorm.OptionsStruct opts = new LibJavaLognorm.OptionsStruct();
opts.CTXOPT_ADD_ORIGINALMSG = true;
LognormFactory lognormFactory = new LognormFactory(opts, "rule=:%all:rest%");
try (JavaLognormImpl javaLognormImpl = lognormFactory.lognorm()) {
String normalizedMessage = javaLognormImpl.normalize("message to normalize");
}
----

== Contributing

You can involve yourself with our project by https://github.com/teragrep/rsm_01/issues/new/choose[opening an issue] or submitting a pull request.

Contribution requirements:

. *All changes must be accompanied by a new or changed test.* If you think testing is not required in your pull request, include a sufficient explanation as why you think so.
. Security checks must pass
. Pull requests must align with the principles and http://www.extremeprogramming.org/values.html[values] of extreme programming.
. Pull requests must follow the principles of Object Thinking and Elegant Objects (EO).

Read more in our https://github.com/teragrep/teragrep/blob/main/contributing.adoc[Contributing Guideline].

=== Contributor License Agreement

Contributors must sign https://github.com/teragrep/teragrep/blob/main/cla.adoc[Teragrep Contributor License Agreement] before a pull request is accepted to organization's repositories.

You need to submit the CLA only once. After submitting the CLA you can contribute to all Teragrep's repositories.

ecosyste.ms

Data

Tools

Indexes

Applications

Experiments

Awesome

https://github.com/teragrep/rsm_01

Awesome Lists containing this project

README