{"id":19482524,"url":"https://github.com/git-ogawa/pylogcounter","last_synced_at":"2025-08-03T23:38:43.485Z","repository":{"id":103898425,"uuid":"586462140","full_name":"git-ogawa/pylogcounter","owner":"git-ogawa","description":" CLI for checking timely lines and size of a log file. ","archived":false,"fork":false,"pushed_at":"2023-01-14T08:51:03.000Z","size":61,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-02-08T18:47:09.956Z","etag":null,"topics":["log","python3"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/git-ogawa.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-01-08T08:14:09.000Z","updated_at":"2023-01-09T10:18:26.000Z","dependencies_parsed_at":null,"dependency_job_id":"345b5bf3-f2b9-4165-8ac8-c191f2b50b0f","html_url":"https://github.com/git-ogawa/pylogcounter","commit_stats":null,"previous_names":[],"tags_count":1,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/git-ogawa%2Fpylogcounter","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/git-ogawa%2Fpylogcounter/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/git-ogawa%2Fpylogcounter/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/git-ogawa%2Fpylogcounter/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/git-ogawa","download_url":"https://codeload.github.com/git-ogawa/pylogcounter/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":240718428,"owners_count":19846478,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["log","python3"],"created_at":"2024-11-10T20:10:56.183Z","updated_at":"2025-02-25T17:41:38.204Z","avatar_url":"https://github.com/git-ogawa.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# pylogcounter\n[![LICENSE](https://img.shields.io/github/license/git-ogawa/pylogcounter?style=plastic)](https://github.com/git-ogawa/pylogcounter/blob/main/LICENSE)\n[![Version](https://img.shields.io/pypi/v/pylogcounter?style=plastic)](https://pypi.python.org/pypi/pylogcounter/)\n![Python versions](https://img.shields.io/pypi/pyversions/pylogcounter?style=plastic)\n[![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/ambv/black)\n[![Downloads](https://static.pepy.tech/badge/pylogcounter)](https://pepy.tech/project/pylogcounter)\n\npylogcounter is a simple, compact CLI to check lines and size of a log file. It is also possible to resample the contents based on timestamps and show the mean of lines and size per minute, hour and etc.\n\nThe tool will be useful when you want to check log size and loglevel trends before monitoring and visualizing logs on large-scale platforms such as elasticsearch.\n\n\n# Table of Contents\n- [pylogcounter](#pylogcounter)\n- [Table of Contents](#table-of-contents)\n- [Install](#install)\n- [Usage](#usage)\n  - [Simple usage](#simple-usage)\n  - [Run on docker](#run-on-docker)\n  - [Custom timestamp format](#custom-timestamp-format)\n  - [Show details](#show-details)\n  - [Output to csv](#output-to-csv)\n  - [Custom time interval](#custom-time-interval)\n  - [Loglevel Classification](#loglevel-classification)\n\n\n# Install\nInstall with pip.\n```\n$ pip install pylogcounter\n```\n\nAlternatively, you can pull a docker image from [dockerhub](https://hub.docker.com/r/docogawa/pylogcounter).\n```\n# docker pull docogawa/pylogcounter\n```\n\n# Usage\nYou need a log file to be parsed that meets the following requirements.\n\n- A file must include a timestamp per line.\n- multiline log is not supported.\n- The format of timestamp must be one of the following (See [Custom timestamp format](#custom-timestamp-format) if you use your own format).\n\n| Type | Format | Example |\n| - | - | - |\n| ISO 8601 | %Y-%m-%dT%H:%M:%S.%fZ | 2022-05-18T11:40:22.519222Z |\n| RFC 3164 | %b %m %H:%M:%S | Jan  7 16:55:01  |\n| General format | %Y-%m-%d %H:%M:%S | 2023-01-08 11:15:24 |\n\n\n## Simple usage\nRun `pylogcounter` with `-f` option to pass the path to the log file. The results are shown on the stdout.\n\n```\n$ pylogcounter -f syslog\nKind : Total\n--------------------------------------------------------------------------------\nItem                        | Value                       | Unit\n--------------------------------------------------------------------------------\nStart time                  | Jul 07 00:00:01             |\nEnd time                    | Jul 07 16:55:07             |\nElapsed time                | 60906.0                     |\nTotal line                  | 2683                        | Line\nTotal bytes                 | 2353790.0                   | Byte\nMean line                   | 1.0                         | Line\nMean bytes                  | 877.298                     | Byte\n--------------------------------------------------------------------------------\n\nKind : Second\n--------------------------------------------------------------------------------\nItem                        | Value                       | Unit\n--------------------------------------------------------------------------------\nStart time                  | Jul 07 00:00:01             |\nEnd time                    | Jul 07 16:55:07             |\nMean line                   | 0.044                       | Line/sec\nMean bytes                  | 38.646                      | Byte/sec\n--------------------------------------------------------------------------------\n\nKind : Minutely\n--------------------------------------------------------------------------------\nItem                        | Value                       | Unit\n--------------------------------------------------------------------------------\nStart time                  | Jul 07 00:00:01             |\nEnd time                    | Jul 07 16:55:01             |\nMean line                   | 2.641                       | Line/min\nMean bytes                  | 2316.722                    | Byte/min\n--------------------------------------------------------------------------------\n\nKind : Hourly\n--------------------------------------------------------------------------------\nItem                        | Value                       | Unit\n--------------------------------------------------------------------------------\nStart time                  | Jul 07 00:00:01             |\nEnd time                    | Jul 07 16:00:01             |\nMean line                   | 157.824                     | Line/hour\nMean bytes                  | 138458.235                  | Byte/hour\n--------------------------------------------------------------------------------\n\nKind : Daily\n--------------------------------------------------------------------------------\nItem                        | Value                       | Unit\n--------------------------------------------------------------------------------\nStart time                  | Jul 07 00:00:01             |\nEnd time                    | Jul 07 00:00:01             |\nMean line                   | 2683.0                      | Line/day\nMean bytes                  | 2353790.0                   | Byte/day\n--------------------------------------------------------------------------------\n```\n\nIn the total section (Table below `kind: Total`), the overall result of the log file is displayed. See below for details on each item.\n\n| Item | Description |\n| - | - |\n| Start time | The start time of log. |\n| End time | The end time of log. |\n| Elapsed time | The difference between the start and the end [sec]. |\n| Total line | The number of lines in log. |\n| Total bytes | Total bytes in log. |\n| Mean line | The mean lines (equals to `total_line/elapsed_time`). |\n| Mean bytes | The mean number of bytes per line (equals to `total_bytes/total_lines`). |\n\n\nIn each time section, resampled statistics are output for each period. For example, in the minutely section, start time and end time means the start and end time of the log resampled in minutes.\n\n\nWith `--only` option, show only the result of the specified time range. To display multiple time range, separate them by `,` such as `--only s,m,h`.\n\n```\n$ pylogcounter -f syslog --only s,h\nKind : Total\n--------------------------------------------------------------------------------\nItem                        | Value                       | Unit\n--------------------------------------------------------------------------------\nStart time                  | Jul 07 00:00:01             |\nEnd time                    | Jul 07 16:55:07             |\nElapsed time                | 60906.0                     |\nTotal line                  | 2683                        | Line\nTotal bytes                 | 2353790.0                   | Byte\nMean line                   | 1.0                         | Line\nMean bytes                  | 877.298                     | Byte\n--------------------------------------------------------------------------------\n\nKind : Second\n--------------------------------------------------------------------------------\nItem                        | Value                       | Unit\n--------------------------------------------------------------------------------\nStart time                  | Jul 07 00:00:01             |\nEnd time                    | Jul 07 16:55:07             |\nMean line                   | 0.044                       | Line/sec\nMean bytes                  | 38.646                      | Byte/sec\n--------------------------------------------------------------------------------\n\nKind : Hourly\n--------------------------------------------------------------------------------\nItem                        | Value                       | Unit\n--------------------------------------------------------------------------------\nStart time                  | Jul 07 00:00:01             |\nEnd time                    | Jul 07 16:00:01             |\nMean line                   | 157.824                     | Line/hour\nMean bytes                  | 138458.235                  | Byte/hour\n--------------------------------------------------------------------------------\n```\n\nUse `-o yaml` in order to show the result in yaml syntax.\n\n```yaml\n$ pylogcounter -f tmp/syslog -o yaml\ntotal:\n  start_time: Jul 07 00:00:01\n  end_time: Jul 07 16:55:07\n  elapsed_time: 60906.0\n  total_lines: 2683\n  total_bytes: 2353790.0\n  mean_lines: 1.0\n  mean_bytes: 877.298\n  byte_unit: byte\nsecond:\n  start_time: Jul 07 00:00:01\n  end_time: Jul 07 16:55:07\n  mean_lines: 0.044\n  mean_bytes: 38.646\n  byte_unit: byte\nminutely:\n  start_time: Jul 07 00:00:01\n  end_time: Jul 07 16:55:01\n  mean_lines: 2.641\n  mean_bytes: 2316.722\n  byte_unit: byte\nhourly:\n  start_time: Jul 07 00:00:01\n  end_time: Jul 07 16:00:01\n  mean_lines: 157.824\n  mean_bytes: 138458.235\n  byte_unit: byte\ndaily:\n  start_time: Jul 07 00:00:01\n  end_time: Jul 07 00:00:01\n  mean_lines: 2683.0\n  mean_bytes: 2353790.0\n  byte_unit: byte\n```\n\nTo change the unit of bytes, specify the byte prefix in the `-b` option. All byte units in the result are replaced by the specified unit.\n\n```\n$ pylogcounter -f tmp/syslog --only m -b m\nind : Total\n--------------------------------------------------------------------------------\nItem                        | Value                       | Unit\n--------------------------------------------------------------------------------\nStart time                  | Jul 07 00:00:01             |\nEnd time                    | Jul 07 16:55:07             |\nElapsed time                | 60906.0                     |\nTotal line                  | 2683                        | Line\nTotal bytes                 | 2.245                       | MB\nMean line                   | 1.0                         | Line\nMean bytes                  | 0.001                       | MB\n--------------------------------------------------------------------------------\n\nKind : Minutely\n--------------------------------------------------------------------------------\nItem                        | Value                       | Unit\n--------------------------------------------------------------------------------\nStart time                  | Jul 07 00:00:01             |\nEnd time                    | Jul 07 16:55:01             |\nMean line                   | 2.641                       | Line/min\nMean bytes                  | 0.002                       | MB/min\n--------------------------------------------------------------------------------\n```\n\n## Run on docker\nIf you use pylogcounter using docker image, run the following commands.\n\n- Mount the directory including a log to be parsed to `/work`. the following example mounts the current directory to /work, so pass `/work/[logfile]` to `-f`.\n\n\n```\n# docker run -it --rm -v ${PWD}:/work docogawa/pylogcounter -f /work/syslog\n```\n\n## Custom timestamp format\nTo parse the timestamp that is not in the table in [Usage](#usage), specify your custom timestamp format with  `-t`. For example, timestamp `2023-01-08T09:59:10.197397+0000` can be parsed by the directive `%Y-%m-%dT%H:%M:%S.%f%z`, so you should use `pylogcounter -t \"%Y-%m-%dT%H:%M:%S.%f%z\"`.\n\n```\n# custom.log\n[2023-01-08T09:59:10.197397+0000] error test\n[2023-01-08T09:59:33.197397+0000] warn test\n[2023-01-08T09:59:35.197397+0000] info test\n[2023-01-08T09:59:55.197397+0000] info test\n[2023-01-08T10:00:22.197397+0000] debug test\n...\n\n$ pylogcounter -f custom.log -t \"%Y-%m-%dT%H:%M:%S.%f%z\" --only s\n----------------------------------------------------------------------------------------------------\nItem                                | Value                               | Unit\n----------------------------------------------------------------------------------------------------\nStart time                          | 2023-01-08T09:59:10.197397+0000     |\nEnd time                            | 2023-01-08T14:20:34.197397+0000     |\nElapsed time                        | 15684.0                             |\nTotal line                          | 1000                                | Line\nTotal bytes                         | 44500.0                             | Byte\nMean line                           | 1.0                                 | Line\nMean bytes                          | 44.5                                | Byte\n----------------------------------------------------------------------------------------------------\n\nKind : Second\n----------------------------------------------------------------------------------------------------\nItem                                | Value                               | Unit\n----------------------------------------------------------------------------------------------------\nStart time                          | 2023-01-08T09:59:10.197397+0000     |\nEnd time                            | 2023-01-08T14:20:34.197397+0000     |\nMean line                           | 0.064                               | Line/sec\nMean bytes                          | 2.837                               | Byte/sec\n----------------------------------------------------------------------------------------------------\n```\n\n## Show details\nThe `-v` option show the max, min, standard deviation and 50 % of the mean size and lines in addition to the result. This may be useful to see trends when the log output varies widely over time.\n\n```\nKind : Minutely\n----------------------------------------------------------------------------------------------------\nItem                                | Value                               | Unit\n----------------------------------------------------------------------------------------------------\nStart time                          | Jul 07 00:00:01                     |\nEnd time                            | Jul 07 16:55:01                     |\nMean line                           | 2.641                               | Line/min\nMean line max                       | 93.0                                | Line/min\nMean line min                       | 2.0                                 | Line/min\nMean line std                       | 4.72                                | Line/min\nMean line 50%                       | 2.0                                 | Line/min\nMean bytes                          | 2316.722                            | Byte/min\nMean bytes max                      | 19370.0                             | Byte/min\nMean bytes min                      | 291.0                               | Byte/min\nMean bytes std                      | 729.233                             | Byte/min\nMean bytes 50%                      | 2236.5                              | Byte/min\n----------------------------------------------------------------------------------------------------\n```\n\n\n## Output to csv\nRun pylogcounter with `-c` option to write resampled data in each time range to csv. The results are stored in `pylogcounter_csv`.\n\n```\npylogcounter_csv\n├── daily.csv\n├── hourly.csv\n├── minutely.csv\n├── second.csv\n└── total.csv\n```\n\nThe csv has the following columns. The data will be useful when you analyze the log in your way.\n\n| Column | Description |\n| - | - |\n| timestamp | Resampled timestamp |\n| bytes | The sum of bytes included in the resampled time range. |\n| lines | The sum of lines included in the resampled time range. |\n\n```\n# minutely.csv\ntimestamp,bytes,line\n2023-01-08 04:26:28.141265,235,6\n2023-01-08 04:27:28.141265,118,3\n2023-01-08 04:28:28.141265,159,4\n2023-01-08 04:29:28.141265,118,3\n2023-01-08 04:30:28.141265,118,3\n2023-01-08 04:31:28.141265,118,3\n2023-01-08 04:32:28.141265,79,2\n```\n\n## Custom time interval\nRun pylogcounter with `-i [interval]` to resample a log at the specified time interval. The following example resamples a log `syslog` every 15 minutes and output the mean of lines and bytes per 15 minutes as `Mean line`, `Mean bytes` respectively in the custom section.\n\n```\n$ pylogcounter -f syslog  -i 15m\nKind : Total\n----------------------------------------------------------------------------------------------------\nItem                                | Value                               |Unit\n----------------------------------------------------------------------------------------------------\nStart time                          | Jul 07 00:00:01                     |\nEnd time                            | Jul 07 16:55:07                     |\nElapsed time                        | 60906.0                             |\nTotal line                          | 2683                                | Line\nTotal bytes                         | 2353790.0                           | Byte\nMean line                           | 1.0                                 | Line\nMean bytes                          | 877.298                             | Byte\n----------------------------------------------------------------------------------------------------\n\nKind : Custom\n----------------------------------------------------------------------------------------------------\nItem                                | Value                               |Unit\n----------------------------------------------------------------------------------------------------\nStart time                          | Jul 07 00:00:01                     |\nEnd time                            | Jul 07 16:45:01                     |\nMean line                           | 39.456                              | Line\nMean bytes                          | 34614.559                           | Byte\n----------------------------------------------------------------------------------------------------\n```\n\nThe time interval must be `[digit]:[unit_prefix]`. The valid unit prefixes are the following.\n\n| Prefix | unit | Example |\n| - | - | - |\n| s, sec | second | 10s, 10sec |\n| m, min | minute | 5m, 5min |\n| h, hour | hour | 2h, 2hour |\n| d, day | day | 1d, 1day |\n| w, week | week | 1w, 1week |\n| M, month | month | 1M, 1month |\n\n\nIf you want to check that the data are correctly resampled at the specified time interval, run pylogcounter with `-c` to write the processed data to `pylogcounter_csv/custom.csv`. You find that the timestamp columns are separated by the time (per 15 minutes in the example below).\n\n```\n# pylogcounter_csv/custom.csv\ntimestamp,bytes,line\n2023-01-13 18:10:43.960345,4238,62\n2023-01-13 18:25:43.960345,3860,58\n2023-01-13 18:40:43.960345,3765,55\n2023-01-13 18:55:43.960345,4141,62\n2023-01-13 19:10:43.960345,4374,67\n...\n```\n\n## Loglevel Classification\nIf a log have messages with loglevel per line, you can count the mean by running pylogcounter with `--log`  and `-o yaml`. The pylogcounter detect and parse the message with loglevel per line automatically. The loglevel that can be parsed are the following.\n\n- alert\n- debug\n- notice\n- info\n- warn\n- error\n- critical\n- fatal\n\nThe results are shown under `log` field in the total and each time section.  In the total section, the fields show the mean of messages with the loglevel in a log. For example, debug `0.256` below means that the ratio of the messages with debug level to total lines `1000` is 0.256. Thus 256 of the 1000 lines are debug messages.\n\nSimilarly in the minutely section, the field show the mean of messages with log level per minute.\n\n```yaml\n$ pylogcounter -f test.log --log -o yaml --only m\ntotal:\n  start_time: '2023-01-08T03:25:08.897592Z'\n  end_time: '2023-01-08T03:41:47.897592Z'\n  elapsed_time: 999.0\n  total_lines: 1000\n  total_bytes: 39508.0\n  mean_lines: 1.0\n  mean_bytes: 39.508\n  byte_unit: byte\n  log:\n    alert: 0.0\n    debug: 0.256\n    notice: 0.0\n    info: 0.253\n    warn: 0.239\n    error: 0.252\n    critical: 0.0\n    fatal: 0.0\nminutely:\n  start_time: '2023-01-08T03:25:08.897592Z'\n  end_time: '2023-01-08T03:41:08.897592Z'\n  lines: 58.824\n  bytes: 2324.0\n  byte_unit: byte\n  log:\n    alert: 0.0\n    debug: 15.059\n    notice: 0.0\n    info: 14.882\n    warn: 14.059\n    error: 14.824\n    critical: 0.0\n    fatal: 0.0\n```\n\nRun with `-c` to write the number of loglevels in each time range after being resampled to csv for detailed analysis. The csv contains loglevel columns in addition to the standard columns (timestamp, bytes, line). The log_total_count columns is the sum of all loglevels (`log_total_count = alert + debug + ... + fatal`).\n\nNote: The each row in the csv show the **sum** of messages with the loglevels per time, not the **mean**.\n```\n# pylogcounter_csv/daily.csv\ntimestamp,bytes,line,alert,debug,notice,info,warn,error,critical,fatal,log_total_count\n2023-01-13 18:10:43.960345,378911,5627,663,686,681,729,742,677,754,695,5627\n2023-01-14 18:10:43.960345,377875,5634,720,732,694,670,701,717,723,677,5634\n2023-01-15 18:10:43.960345,370591,5514,729,691,708,699,665,667,667,688,5514\n2023-01-16 18:10:43.960345,376211,5580,658,688,677,713,702,748,722,672,5580\n2023-01-17 18:10:43.960345,375614,5587,687,657,699,710,728,708,702,696,5587\n2023-01-18 18:10:43.960345,372907,5526,674,727,662,735,689,691,711,637,5526\n2023-01-19 18:10:43.960345,374474,5567,698,711,699,689,689,721,665,695,5567\n```\n\nyou can use the data to visualize how many messages the log include per loglevel at the time, or to see how them change over time as shown in the figure below.\n\n![](docs/images/count_daily_log.png)","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgit-ogawa%2Fpylogcounter","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fgit-ogawa%2Fpylogcounter","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgit-ogawa%2Fpylogcounter/lists"}