Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Awesome Lists | Featured Topics | Projects

https://github.com/reyjrar/parse-syslog-line

Flexible library for parsing syslog messages in Perl
https://github.com/reyjrar/parse-syslog-line

logging perl

Last synced: about 1 month ago
JSON representation

Flexible library for parsing syslog messages in Perl

Awesome Lists containing this project

README

        

# NAME

Parse::Syslog::Line - Simple syslog line parser

# VERSION

version 5.3

# SYNOPSIS

I wanted a very simple log parser for network based syslog input.
Nothing existed that simply took a line and returned a hash ref all
parsed out.

use Parse::Syslog::Line qw(parse_syslog_line);

$Parse::Syslog::Line::DateTimeCreate = 1;
$Parse::Syslog::Line::AutoDetectJSON = 1;

my $href = parse_syslog_line( $msg );
#
# $href = {
# preamble => '13',
# priority => 'notice',
# priority_int => 5,
# facility => 'user',
# facility_int => 8,
# date => 'YYYY-MM-DD',
# time => 'HH::MM:SS',
# epoch => 1361095933,
# datetime_str => ISO 8601 datetime, $NormalizeToUTC = 1 then UTC, else local
# datetime_obj => undef, # If $DateTimeCreate = 1, else undef
# datetime_raw => 'Feb 17 11:12:13'
# date_raw => 'Feb 17 11:12:13'
# host_raw => 'hostname', # Hostname as it appeared in the message
# host => 'hostname', # Hostname without domain
# domain => 'blah.com', # if provided
# program_raw => 'sshd(blah)[pid]',
# program_name => 'sshd',
# program_sub => 'pam_unix',
# program_pid => 20345,
# content => 'the rest of the message'
# message => 'program[pid]: the rest of the message',
# message_raw => 'The message as it was passed',
# ntp => 'ok', # Only set for Cisco messages
# version => 1,
# SDATA => { ... }, # RFC Structured data, decoded JSON, or K/V Pairs in the message
# };
...

# EXPORT

Exported by default:
parse\_syslog\_line( $one\_line\_of\_syslog\_message );

Optional Exports:
:preamble
preamble\_priority
preamble\_facility

:constants
%LOG_FACILITY
%LOG_PRIORITY

:with_timezones
set_syslog_timezone
get_syslog_timezone
use_utc_syslog

# VARIABLES

## ExtractProgram

If this variable is set to 1 (the default), parse\_syslog\_line() will try it's
best to extract a "program" field from the input. This is the most expensive
set of regex in the module, so if you don't need that pre-parsed, you can speed
the module up significantly by setting this variable.

Vendors who do proprietary non-sense with their syslog formats are to blame for
this setting.

Usage:

$Parse::Syslog::Line::ExtractProgram = 0;

## DateParsing

If this variable is set to 0 raw date will not be parsed further into
components (datetime\_str date time epoch). Default is 1 (parsing enabled).

Usage:

$Parse::Syslog::Line::DateParsing = 0;

## DateTimeCreate

If this variable is set to 1 (the default), a DateTime object will be returned in the
$m->{datetime\_obj} field. Otherwise, this will be skipped.

NOTE: DateTime timezone calculation is fairly slow. Unless you really need to
take timezones into account, you're better off using other modes (below).

Usage:

$Parse::Syslog::Line::DateTimeCreate = 0;

## EpochCreate

If this variable is set to 1, the default, the number of seconds from UNIX
epoch will be returned in the $m->{epoch} field. Setting this to false will
only delete the epoch before returning the hash reference.

## NormalizeToUTC

When set, the datetime\_str will be ISO8601 UTC.

## OutputTimeZones

Default is false, but is enabled if you call set\_syslog\_timezone() or
use\_utc\_syslog(). If enabled, this will append the timezone offset to the
datetime\_str.

## FmtDate

You can pass your own formatter/parser here. Given a raw datetime string it
should output a list containing date, time, epoch, datetime\_str,
in your wanted format.

use Parse::Syslog::Line;

local $Parse::Syslog::Line::FmtDate = sub {
my ($raw_datestr) = @_;
my @elements = (
#date
#time
#epoch
#datetime_str
);
return @elements;
};

**NOTE**: No further date processing will be done, you're on your own here.

## HiResFmt

Default is `%0.6f`, or microsecond resolution. This variable only comes into
play when the syslog date string contains a high resolution timestamp. It
defaults to using microsecond resolution.

## AutoDetectJSON

Default is false. If true, we'll autodetect the presence of JSON in the syslog
message and use [JSON::MaybeXS](https://metacpan.org/pod/JSON%3A%3AMaybeXS) to decode it. The detection/decoding is
simple. If a '{' is detected, everything until the end of the message is
assumed to be JSON. The decoded JSON will be added to the `SDATA` field.

$Parse::Syslog::Line::AutoDetectJSON = 1;

## AutoDetectKeyValues

Default is false. If true, we'll autodetect the presence of Splunk style
key/value pairds in the message stream. That format is `k1=v1, k2=v2`.
Resulting K/V pairs will be added to the `SDATA` field.

$Parse::Syslog::Line::AutoDetectKeyValues = 1;

## RFC5424StructuredData

Default is true. When enabled, this will extract the RFC standard structured data
from the message content. That content will be stripped from the message
`content` field.

Some examples:

# Input
[foo x=1] some words [bar x=2]

# To (YAML for brevity)
---
SDATA:
bar:
x: 2
foo:
x: 1
content: some words

# Input
[x=1] some words

# To (YAML for brevity)
---
SDATA:
x: 1
content: some words

To disable:

$Parse::Syslog::Line::RFC5424StructuredData = 0;

## RFC5424StructuredDataStrict

Require the format:

[namespace@id property="value"][namespace@id property="value"]

Defaults to 0, set to 1 to only parse the RFC5424 formatted structured data.

## PruneRaw

This variable defaults to 0, set to 1 to delete all keys in the return hash
ending in "\_raw"

Usage:

$Parse::Syslog::Line::PruneRaw = 1;

## PruneEmpty

This variable defaults to 0, set to 1 to delete all keys in the return hash
which are undefined.

Usage:

$Parse::Syslog::Line::PruneEmpty = 1;

## PruneFields

This should be an array of fields you'd like to be removed from the hash reference.

Usage:

@Parse::Syslog::Line::PruneFields = qw(date_raw facility_int priority_int);

# FUNCTIONS

## parse\_syslog\_line

Returns a hash reference of syslog message parsed data.

**NOTE**: Date/time parsing is hard. This module has been optimized to balance
common sense and processing speed. Care is taken to ensure that any data input
into the system isn't lost, but with the varieties of vendor and admin crafted
date formats, we don't always get it right. Feel free to override date
processing using by setting the $FmtDate variable or completely disable it with
$DateParsing set to 0.

## set\_syslog\_timezone($timezone\_name)

Sets a timezone $timezone\_name for parsed messages. This timezone will be used
to calculate offset from UTC if a timezone designation is not present in the
message being parsed. This timezone will also serve as the source timezone for
the datetime\_str field.

## get\_syslog\_timezone

Returns the name of the timezone currently set by set\_syslog\_timezone.

## use\_utc\_syslog

A convenient function which sets the syslog timezone to UTC and sets the config
variables accordingly. Automatically sets $NormaizeToUTC and datetime\_str will
be set to the UTC equivalent.

## parse\_syslog\_lines

Returns a list of hashes of the lines interpretted.

When passed one or more line of text, attempts to parse that text as syslog data. This function
varies from `parse_syslog_line` in that it handles multi-line messages. The caveat to this, is
after the last iteration of the loop, you to call the function by itself to get the last message.

use strict;
use warnings;
use DDP;
use Parse::Syslog::Line qw(parse_syslog_lines);

while(<>) {
foreach my $log ( parse_syslog_lines($_) ) {
p($log);
}
}
p($_) for parse_syslog_lines();

This function holds a parsing buffer which it flushes any time it encounters a
line in the stream that starts with non-whitespace. Any lines beginning with
whitespace will be assumed to be a continuation of the previous line.

It is not exported by default.

## preamble\_priority

Takes the Integer portion of the syslog messsage and returns
a hash reference as such:

$prioRef = {
'preamble' => 13
'as_text' => 'notice',
'as_int' => 5,
};

## preamble\_facility

Takes the Integer portion of the syslog messsage and returns
a hash reference as such:

$facRef = {
'preamble' => 13
'as_text' => 'user',
'as_int' => 8,
};

# ENVIRONMENT VARIABLES

There are environment variables that affect how we operate. They are not
options as they are not intended to be used by our users. Use at your own risk.

## PARSE\_SYSLOG\_LINE\_DEBUG

Outputs debugging information about the parser, not really intended for end-users.

## PARSE\_SYSLOG\_LINE\_QUIET

Disables warnings in the parse\_syslog\_line() function

## TEST\_ACTIVE / TEST2\_ACTIVE

Disables warnings in the parse\_syslog\_line() function

# DEVELOPMENT

This module is developed with Dist::Zilla. To build from the repository, use Dist::Zilla:

dzil authordeps --missing |cpanm
dzil listdeps --missing |cpanm
dzil build
dzil test

# AUTHOR

Brad Lhotsky

# COPYRIGHT AND LICENSE

This software is Copyright (c) 2017 by Brad Lhotsky.

This is free software, licensed under:

The (three-clause) BSD License

# CONTRIBUTORS

- Bartłomiej Fulanty
- Csillag Tamas
- Keedi Kim
- Mateu X Hunter
- Neil Bowers
- Shawn Wilson
- Tomohiro Hosaka

# SUPPORT

## Websites

The following websites have more information about this module, and may be of help to you. As always,
in addition to those websites please use your favorite search engine to discover more resources.

- MetaCPAN

A modern, open-source CPAN search engine, useful to view POD in HTML format.

[https://metacpan.org/release/Parse-Syslog-Line](https://metacpan.org/release/Parse-Syslog-Line)

## Bugs / Feature Requests

This module uses the GitHub Issue Tracker: [https://github.com/reyjrar/Parse-Syslog-Line/issues](https://github.com/reyjrar/Parse-Syslog-Line/issues)

## Source Code

This module's source code is available by visiting:
[https://github.com/reyjrar/Parse-Syslog-Line](https://github.com/reyjrar/Parse-Syslog-Line)