{"id":20246221,"url":"https://github.com/rkoschmitzky/logmole","last_synced_at":"2025-10-30T16:03:06.896Z","repository":{"id":31872367,"uuid":"123640087","full_name":"rkoschmitzky/logmole","owner":"rkoschmitzky","description":"An Extendable and Versatile Logparsing System","archived":false,"fork":false,"pushed_at":"2022-12-08T09:30:54.000Z","size":94,"stargazers_count":3,"open_issues_count":1,"forks_count":0,"subscribers_count":2,"default_branch":"master","last_synced_at":"2025-06-12T20:17:52.817Z","etag":null,"topics":["conversions","json","log","logs","parser","parsing-library","pattern-matching","python","regex-pattern","regular-expressions"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/rkoschmitzky.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE.md","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2018-03-02T23:05:10.000Z","updated_at":"2022-01-31T23:32:43.000Z","dependencies_parsed_at":"2023-01-14T19:58:14.074Z","dependency_job_id":null,"html_url":"https://github.com/rkoschmitzky/logmole","commit_stats":null,"previous_names":[],"tags_count":3,"template":false,"template_full_name":null,"purl":"pkg:github/rkoschmitzky/logmole","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rkoschmitzky%2Flogmole","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rkoschmitzky%2Flogmole/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rkoschmitzky%2Flogmole/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rkoschmitzky%2Flogmole/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/rkosc
hmitzky","download_url":"https://codeload.github.com/rkoschmitzky/logmole/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rkoschmitzky%2Flogmole/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":264419453,"owners_count":23605198,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["conversions","json","log","logs","parser","parsing-library","pattern-matching","python","regex-pattern","regular-expressions"],"created_at":"2024-11-14T09:27:54.004Z","updated_at":"2025-10-30T16:03:06.833Z","avatar_url":"https://github.com/rkoschmitzky.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"[![Build Status](https://travis-ci.com/rkoschmitzky/logmole.svg?branch=master)](https://travis-ci.com/rkoschmitzky/logmole) [![Coverage Status](https://coveralls.io/repos/github/rkoschmitzky/logmole/badge.svg?branch=master)](https://coveralls.io/github/rkoschmitzky/logmole?branch=master) [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT) [![Downloads](https://pepy.tech/badge/logmole)](https://pepy.tech/project/logmole)\n# Logmole\n\n## An Extendable and Versatile Logparsing System\n\nLogmole lets you chain regex patterns in a simple way to create extensions for different types of\nlog files.\n\n### Table of Contents\n\n- [Project Goals](#what-can-it-do-for-you)\n- [Installation](#installation)\n- [How to use](#how-to-use)\n  - [LogContainer](#the-logcontainer)\n  - [Patterns](#patterns)\n  - [Grouping 
Containers](#grouping-containers)\n  - [Assumptions](#assumptions)\n    - [Native Type Assumptions](#native-type-assumptions)\n    - [Custom Type Assumptions](#custom-type-assumptions)\n    - [Custom Types](#custom-types)\n- [Extensions](#extensions)\n\n### What can it do for you?\n- provide a framework to create reusable and modular logparsers based on regular expressions\n- simplify the process of chaining multiple regex patterns\n- create objects and fields dynamically based on named capturing groups and representatives\n- help with automatic and robust type conversions\n- offer some pre-built extensions\n\n-----\n\n### Installation\n\nLogmole can be installed via `pip`.\n```bash\npip install logmole\n```\n\n### How to use\n\n##### The LogContainer\n\nThe LogContainer class is a component that represents the matches of its own regex pattern or of the patterns of its sub-containers.\n\n| Attribute               | Type     | Description\n|:------------------------|:---------|:------------\n| `pattern`               | `str`    | The regex pattern a container will use to parse the data. Be aware that you always have to provide a named capturing group. 
Each match on the named group will end up as its own attribute on the container or the declared container representative.\n| `representative`        | `str`    | A name that represents one or multiple containers and defines where to store a container's matched data.\n| `sub_containers`        | `list`   | Defines the association of a container with child containers.\n| `assumptions`           | subclass of `BaseAssumptions` | An assumptions object to declare actions on matched data.\n| `infer_type`            | `bool`   | If True (default) it will use the declared assumptions to convert the type of a match automatically.\n\n| Methods                             | Returns  | Description\n|:------------------------------------|:---------|:------------\n| `dump(filepath=str, **kwargs)`      | `None`   | Serialize the LogContainer representation as a JSON formatted stream to the given filepath. Uses the same signature as `json.dump()`.\n| `get_value(str)`                    | `str`    | Get the value of an attribute using a dot-separated path like `foo.bar.foobar`.\n\n----\n\n#### Understand By Example\n\nLet's have a look at some examples to demonstrate the main concepts.\n\n\nInput Log Content\n```bash\n19:22:40 | INFO     line 8 in \u003cmodule\u003e | Movie started\n19:22:40 | WARNING  line 10 in \u003cmodule\u003e | Found 10000 ghosts\n19:22:41 | DEBUG    line 12 in \u003cmodule\u003e | Scene contains 3 Monsters\n19:22:43 | DEBUG    line 13 in \u003cmodule\u003e | Scene contains 1 Girl\n19:22:46 | INFO     line 14 in \u003cmodule\u003e | Movie ends\n```\n\n\u003cbr\u003e\n\n###### Patterns\n\nAssume our extension only includes a pattern like this, which provides the start and end time.\n```python\n\nfrom logmole import LogContainer\n\nclass MovieLog(LogContainer):\n    pattern = r\"(?P\u003cstart_time\u003e.\\d+\\:\\d+:\\d+).*started|(?P\u003cend_time\u003e.\\d+\\:\\d+:\\d+).*ends\"\n```\n```python\n\u003e\u003e\u003e log = MovieLog(\"/tmp/some.log\")\n\u003e\u003e\u003e 
print(log)\n\n{\n    \"end_time\": \"19:22:46\",\n    \"start_time\": \"19:22:40\"\n}\n```\n\nThe LogContainer is represented as a prettified dictionary. You can also use it as an object that holds an attribute for each capturing group.\n```python\n\u003e\u003e\u003e print(log.start_time)\n\u003e\u003e\u003e print(log.end_time)\n\n19:22:40\n19:22:46\n```\n\n\u003cbr\u003e\n\n###### Grouping Containers\n\nInstead of relying on naming conventions to categorize your matches, you can define a representative for them.\nThis doesn't necessarily make sense if you are working with a small number of containers, but it helps when creating more complex nestings.\n```python\nclass TimesContainer(LogContainer):\n    pattern = r\"(?P\u003cstart\u003e.\\d+\\:\\d+:\\d+).*started|(?P\u003cend\u003e.\\d+\\:\\d+:\\d+).*ends\"\n    representative = \"times\"\n\n\nclass MovieLog(LogContainer):\n    sub_containers = [TimesContainer]\n```\n\n```python\n\u003e\u003e\u003e log = MovieLog(\"/tmp/some.log\")\n\u003e\u003e\u003e print(log)\n\u003e\u003e\u003e print(\"-\"*10)\n\u003e\u003e\u003e print(log.times.start)\n\u003e\u003e\u003e print(log.times.end)\n\n{\n    \"times\": {\n        \"start\": \"19:22:40\",\n        \"end\": \"19:22:46\"\n    }\n}\n----------\n19:22:40\n19:22:46\n```\nAs you can see, it creates a parent representative and attaches the matches to it.\n\n\u003cbr\u003e\n\nGrouping really pays off when several containers share the same representative:\n```python\nclass GhostsContainer(LogContainer):\n    pattern = r\"(?P\u003cspooky_ghosts\u003e\\d+)\\s+ghosts?\"\n    representative = \"scene\"\n\n\nclass EntitiesContainer(LogContainer):\n    pattern = r\"contains\\s(?P\u003centities\u003e\\d+\\s.*)\"\n    representative = \"scene\"\n\n\nclass TimesContainer(LogContainer):\n    pattern = r\"(?P\u003cstart\u003e.\\d+\\:\\d+:\\d+).*started|(?P\u003cend\u003e.\\d+\\:\\d+:\\d+).*ends\"\n    representative = \"times\"\n\n\nclass MovieLog(LogContainer):\n    
sub_containers = [\n        TimesContainer,\n        GhostsContainer,\n        EntitiesContainer\n    ]\n```\n\n```python\n\u003e\u003e\u003e log = MovieLog(\"/tmp/some.log\")\n\u003e\u003e\u003e print(log)\n\n{\n    \"scene\": {\n        \"entities\": [\n            \"3 Monsters\",\n            \"1 Girl\"\n        ],\n        \"spooky_ghosts\": 10000\n    },\n    \"times\": {\n        \"start\": \"19:22:40\",\n        \"end\": \"19:22:46\"\n    }\n}\n```\n\n\u003cbr\u003e\n\nBut this doesn't mean that a sub-container can't have its own sub-containers.\nRewriting the extension to look like this would give us the same result.\nYou are free to stack and layer your containers as you like.\n```python\nclass GhostsContainer(LogContainer):\n    pattern = r\"(?P\u003cspooky_ghosts\u003e\\d+)\\s+ghosts?\"\n\n\nclass EntitiesContainer(LogContainer):\n    pattern = r\"contains\\s(?P\u003centities\u003e\\d+\\s.*)\"\n\n\nclass SceneContainer(LogContainer):\n    sub_containers = [\n        GhostsContainer,\n        EntitiesContainer\n    ]\n    representative = \"scene\"\n\n\nclass TimesContainer(LogContainer):\n    pattern = r\"(?P\u003cstart\u003e.\\d+\\:\\d+:\\d+).*started|(?P\u003cend\u003e.\\d+\\:\\d+:\\d+).*ends\"\n    representative = \"times\"\n\n\nclass MovieLog(LogContainer):\n    sub_containers = [\n        TimesContainer,\n        SceneContainer\n    ]\n```\n\n\u003cbr\u003e\n\n###### Assumptions\n\nAn Assumptions object defines a set of regex patterns and associates them with actions that get\ncalled when there is a match.\n\nTake another look at the output created above:\n```\n{\n    \"scene\": {\n        \"entities\": [\n            \"3 Monsters\",\n            \"1 Girl\"\n        ],\n        \"spooky_ghosts\": 10000\n    },\n    \"times\": {\n        \"start\": \"19:22:40\",\n        \"end\": \"19:22:46\"\n    }\n}\n```\nNotice that the `scene.spooky_ghosts` entry is not a string anymore. 
This is because the\n`logmole.LogContainer.assumptions` attribute defaults to a `logmole.TypeAssumptions` object\nthat handles simple conversions automatically.\n\n---\n\n##### Native Type Assumptions\n\nAs long as `infer_type` is set to `True`, the LogContainer will always try to convert native\ntypes.\n\nThis includes support for:\n\n- `int`: `^(\\-?\\d+)$`\n- `float`: `^(\\-?\\d+\\.\\d+)$`\n- `None`: `((N|n)one)$|^NONE$|^((N|n)ull)$|^NULL$|^((N|n)il)$|^NIL$`\n\n\n\n----\n\n\nYou can control whether a container infers types, and disable inference by setting\n[`infer_type`](#the-logcontainer) to `False`. This setting only applies to the container itself and isn't inherited from\nparent containers.\nSee [native type assumptions](#native-type-assumptions) for the supported conversions.\n\n---\n\n##### Custom Type Assumptions\n\nYou can also extend existing assumptions or create an individual set of assumptions per container.\nLet's demonstrate this on our `TimesContainer` using the available custom [`TimeType`](#timetype) object.\n```python\nfrom logmole import (\n    TypeAssumptions,\n    TimeType\n)\n```\n\n```python\nclass TimesContainer(LogContainer):\n    assumptions = TypeAssumptions({\".*\": TimeType()})\n    pattern = r\"(?P\u003cstart\u003e.\\d+\\:\\d+:\\d+).*started|(?P\u003cend\u003e.\\d+\\:\\d+:\\d+).*ends\"\n    representative = \"times\"\n```\n\n```python\n\u003e\u003e\u003e log = MovieLog(\"/tmp/some.log\")\n\u003e\u003e\u003e print(type(log.times.start))\n\u003ctype 'datetime.time'\u003e\n```\n\nA `TypeAssumptions` object has to be initialized with a dictionary defining patterns and their corresponding types.\nIn our case we can expect that everything matched by our `TimesContainer.pattern` will be\na string of a valid `H:M:S` format. 
So we don't need a more precise pattern within our TypeAssumptions and can expect\nthose strings to always be convertible by our [`TimeType`](#timetype) object.\nThe `TypeAssumptions` class lets us inherit existing assumptions from parent containers, which is the default behavior.\nYou can ignore parent assumptions by initializing the `TypeAssumptions` class with `inherit=False`.\nThis way you can avoid potential match conflicts when using looser patterns.\n\nGenerally speaking, however, your patterns should be as precise as possible when using them on containers that hold a bunch\nof sub-containers.\n\n----\n\n\n#### Custom Types\n\nNative type conversions might not be sufficient for you. There might be cases where you want to convert\nyour extracted information to a more specific type. There are custom types that can help you do that, or you\ncan write your own.\n\n##### KeyValueType\n\n**TO BE CONTINUED**\n\n\n##### TimeType\n\nThis object doesn't need any extra information. 
It will check for a valid input string and return a `datetime.time`\ninstance.\n\n\n##### TwoDimensionalNumberArray\n\nAn object that converts a string into an evenly sized two-dimensional array with automatic float conversion for each item.\nIt always expects a named match group called `number` within the pattern.\n\nExample:\n```python\n\u003e\u003e\u003e array_type_1 = TwoDimensionalNumberArray(r\"(?P\u003cnumber\u003e-?\\d+)\", item_array_size=1)\n\u003e\u003e\u003e array_type_2 = TwoDimensionalNumberArray(r\"(?P\u003cnumber\u003e-?\\d+)\", item_array_size=2)\n\u003e\u003e\u003e array_type_3 = TwoDimensionalNumberArray(r\"(?P\u003cnumber\u003e-?\\d+)\", item_array_size=3)\n\n\u003e\u003e\u003e input = \"1, 2, 4 -4, -10, 1\"\n\u003e\u003e\u003e print(array_type_1(input))\n\u003e\u003e\u003e print(array_type_2(input))\n\u003e\u003e\u003e print(array_type_3(input))\n\n[[1.0], [2.0], [4.0], [-4.0], [-10.0], [1.0]]\n[[1.0, 2.0], [4.0, -4.0], [-10.0, 1.0]]\n[[1.0, 2.0, 4.0], [-4.0, -10.0, 1.0]]\n```\n\n----\n\n### Versioning\n\n`Logmole` follows [semantic versioning](https://semver.org/).\n\n----\n\n### Extensions\n\n[ArnoldMole](https://github.com/rkoschmitzky/arnoldmole) - An Extension for the lovely [Arnold Renderer](https://www.arnoldrenderer.com)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frkoschmitzky%2Flogmole","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Frkoschmitzky%2Flogmole","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frkoschmitzky%2Flogmole/lists"}