https://github.com/perryflynn/thebestlogparserintheworld
The best webserver log parser in the world.
access-logs analytics apache log-parser nginx statistics
- Host: GitHub
- URL: https://github.com/perryflynn/thebestlogparserintheworld
- Owner: perryflynn
- License: mit
- Created: 2020-05-18T05:20:44.000Z (about 6 years ago)
- Default Branch: master
- Last Pushed: 2024-12-21T19:59:24.000Z (over 1 year ago)
- Last Synced: 2025-10-08T15:00:05.909Z (8 months ago)
- Topics: access-logs, analytics, apache, log-parser, nginx, statistics
- Language: C#
- Homepage:
- Size: 51.8 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# tHE bEST lOG pARSER iN tHE wORLD
An access log parser that creates basic statistics from huge log files.
## Analyze Benchmark
4 Cores, 16 GB RAM:
```md
## Single Thread
Took 4,810.503 seconds (80.167 minutes)
Processed 14 files
Processed 445,052,807 lines
Processed 92,516.897 lines per second
## 3 Threads
Took 2,439.198 seconds (40.653 minutes)
Processed 32 files
Processed 445,289,011 lines
Processed 182,555.498 lines per second
```
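The throughput figures can be rechecked from the totals above. The sketch below only redoes the arithmetic (lines divided by seconds, then the ratio of the two rates); the function name `throughput_summary` is made up for illustration. Since the two runs processed different file counts and slightly different line counts, the resulting speedup of roughly 1.97 is an approximation.

```sh
# Recompute lines per second and the speedup from the benchmark totals above.
# throughput_summary is a hypothetical helper name, not part of logsplit.
throughput_summary() {
    awk 'BEGIN {
        single = 445052807 / 4810.503   # lines per second, single thread
        multi  = 445289011 / 2439.198   # lines per second, 3 threads
        printf "single thread: %.0f lines/s\n", single
        printf "3 threads:     %.0f lines/s\n", multi
        printf "speedup:       %.2fx\n", multi / single
    }'
}

throughput_summary
```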
## Run
This tool is developed with .NET Core.
To run it from source, the .NET Core SDK is required:
```sh
dotnet run -- --help
```
Alternatively, build it in self-contained mode, which bundles the runtime:
```sh
mkdir dist
dotnet publish -c Release -r linux-x64 --self-contained -o dist/
cd dist
./logsplit --help
```
## State of development
Early alpha. This tool is just for nerds at the moment.
## Quickstart
### Init repository
```sh
mkdir /home/christian/weblogs
logsplit init -d /home/christian/weblogs -f examplewebsite
cd /home/christian/weblogs
```
(See `logsplit init --help` for more info)
### Put logs into import folder
Now put all log files of the website `examplewebsite` into
`/home/christian/weblogs/input/examplewebsite`.
The folder also contains a `loginfo.json`, which holds the configuration
for parsing the access logs. The defaults should fit NGINX logs whose file names
look like `example-access.log.1.gz`.
Edit the JSON file if something does not work.
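Copying the logs into place could look like the sketch below. The source directory `/var/log/nginx`, the `example-access.log*` glob, and the helper name `copy_logs` are assumptions for illustration; only the target path comes from the steps above.

```sh
# Assumed source location of the rotated NGINX logs (hypothetical; adjust to your server).
SRC_DIR="${SRC_DIR:-/var/log/nginx}"
# Import folder created by `logsplit init` for the website `examplewebsite`.
DEST_DIR="${DEST_DIR:-/home/christian/weblogs/input/examplewebsite}"

# Copy every file whose name starts with example-access.log,
# e.g. example-access.log.1.gz.
copy_logs() {
    for f in "$SRC_DIR"/example-access.log*; do
        # Skip the literal pattern when nothing matches.
        [ -e "$f" ] || continue
        cp -- "$f" "$DEST_DIR"/
    done
}

copy_logs
```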
### Import
The import splits the log files into one access log per
host, per hostgroup, per month.
```sh
cd /home/christian/weblogs
logsplit import
```
If something goes wrong during the import, delete all
`*.new` files in `/home/christian/weblogs/repository` and try again.
(See `logsplit import --help` for more info)
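The cleanup of leftover `*.new` files after a failed import can be scripted. This is a minimal sketch, not something the tool ships: the helper name `clean_new_files` is hypothetical, and it assumes that searching subfolders of the repository is harmless.

```sh
# Repository folder from the quickstart.
REPO_DIR="${REPO_DIR:-/home/christian/weblogs/repository}"

# Print and delete leftover *.new files from an aborted import.
# Assumption: find may descend into subfolders of the repository.
clean_new_files() {
    # Nothing to do if the repository folder does not exist.
    [ -d "$REPO_DIR" ] || return 0
    find "$REPO_DIR" -type f -name '*.new' -print -exec rm -f -- {} +
}

clean_new_files
```

Afterwards, `logsplit import` can be run again.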
### Analyze
The analyze process parses the log files and generates a summary JSON
file, which is then used to generate the actual statistics.
After this process, the raw access logs are no longer needed.
```sh
cd /home/christian/weblogs
logsplit analyze
```
(See `logsplit analyze --help` for more info)
### Create statistics
This module is at a very early stage.
C# skills may be required to get the information you want.
```sh
logsplit statistic -p "-access_log_examplewebsite-"
```
The `-p` parameter is a regular expression which must match a `.gz.json`
file in the repository folder.
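To preview which summary files a pattern would select, something like the following may help. It is only a sketch: `preview_matches` is a hypothetical helper, it assumes the pattern is applied to the file name, and it assumes `grep -E` agrees with the tool's regular expression engine for simple patterns like this one.

```sh
# Repository folder from the quickstart.
REPO_DIR="${REPO_DIR:-/home/christian/weblogs/repository}"

# Print the names of .gz.json files that match the given pattern.
# Assumption: grep -E behaves like the tool's regex engine for simple patterns.
preview_matches() {
    # Nothing to list if the repository folder does not exist.
    [ -d "$REPO_DIR" ] || return 0
    find "$REPO_DIR" -type f -name '*.gz.json' -exec basename {} \; \
        | grep -E -e "$1" || true
}

preview_matches "-access_log_examplewebsite-"
```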