{"id":21660275,"url":"https://github.com/slub/ldjstructurestats","last_synced_at":"2026-05-21T14:11:40.544Z","repository":{"id":150196205,"uuid":"200838500","full_name":"slub/ldjstructurestats","owner":"slub","description":"a commandline command (Python3 program) that determines the structure of given line-delimited JSON records","archived":false,"fork":false,"pushed_at":"2019-08-30T11:44:04.000Z","size":15,"stargazers_count":2,"open_issues_count":0,"forks_count":2,"subscribers_count":7,"default_branch":"master","last_synced_at":"2025-03-20T05:28:48.536Z","etag":null,"topics":["command-line-tool","json","line-delimited-json","python","statistics"],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/slub.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE.md","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2019-08-06T11:32:33.000Z","updated_at":"2023-12-16T01:41:06.000Z","dependencies_parsed_at":"2023-04-20T03:02:51.787Z","dependency_job_id":null,"html_url":"https://github.com/slub/ldjstructurestats","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/slub/ldjstructurestats","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/slub%2Fldjstructurestats","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/slub%2Fldjstructurestats/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/slub%2Fldjstructurestats/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories
/slub%2Fldjstructurestats/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/slub","download_url":"https://codeload.github.com/slub/ldjstructurestats/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/slub%2Fldjstructurestats/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":33303239,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-21T12:23:38.849Z","status":"ssl_error","status_checked_at":"2026-05-21T12:22:11.673Z","response_time":62,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["command-line-tool","json","line-delimited-json","python","statistics"],"created_at":"2024-11-25T09:32:37.056Z","updated_at":"2026-05-21T14:11:40.528Z","avatar_url":"https://github.com/slub.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# ldjstructurestats - line-delimited JSON structure statistics\n\nldjstructurestats is a commandline command (Python3 program) that determines the structure of given line-delimited JSON records.\n\nThis tool can be utilised to discover the structure (+ schema) of given line-delimited JSON records and possible differences/variants at single field paths. 
It also shows whether a certain field path can contain multiple values or not.\n\n## Usage\n\nIt reads line-delimited JSON records from *stdin*.\n\nIt writes structure statistics calculated from the given line-delimited JSON records as pure CSV to *stdout*.\n\n```\nldjstructurestats\n\noptional arguments:\n  -h, --help                           show this help message and exit\n```\n\n* example:\n    ```\n    ldjstructurestats \u003c [INPUT LINE-DELIMITED JSON RECORDS] \u003e [PATH TO THE OUTPUT CSV FILE]\n    ```\n## Run\n\n* clone this git repo or just download the [ldjstructurestats.py](ldjstructurestats/ldjstructurestats.py) file\n* run ./ldjstructurestats.py\n* for a hackish way to use ldjstructurestats system-wide, copy to /usr/local/bin\n\n### Install system-wide via pip\n\n```\nsudo -H pip3 install --upgrade [ABSOLUTE PATH TO YOUR LOCAL GIT REPOSITORY OF LDJSTRUCTURESTATS]\n```\n(which provides ```ldjstructurestats``` as a system-wide commandline command)\n\n## Description\n\n#### Statistics Header\n\n#### path_number\n\n* a number/count for each (simple) path/field\n\n#### field_path\n\n* the field path\n* rows that have a number in the column 'path_number' contain the simple field path\n  * field names (/keys) in simple field paths are separated by a `.`  \n* rows below a line with a simple field path contain the structure variants/mutations of this simple field path\n  * field names (/keys) in structure variant field paths are separated by ` \u003e `\n  * field names (/keys) in structure variant field paths are enclosed in quotation marks (`\"`) \n  * indent = 1 tab + '↳' at the beginning of the field path\n* rows below structure variants/mutations that end with a *JSON array* contain the structure variant field paths incl. 
the object types that can occur in this *JSON array*\n  * indent = 2 tabs + '↳' at the beginning of the field path\n\n#### multiple_paths\n* indicates whether a simple field path occurs multiple times in a record or not (**only** at simple field paths)\n   * `True` = this simple field path occurs multiple times in at least one record of the input record set\n   * this value is simply determined by a comparison of **path_existing** and **path_occurrence**: if **path_occurrence** is larger than **path_existing**, then there must be at least one record that contains this simple field path multiple times\n\n#### multiple_values\n\n* **only** if the field path (i.e. structure variant field path) ends with a *JSON array*, this column is filled with either `True` or `False`\n   * `True` = this field path has *JSON arrays* with multiple values\n   * `False` = this field path only has *JSON arrays* that do not contain multiple values, i.e., they are single-valued\n\n#### path_existing\n* the number of records where this (simple) field path exists, i.e., records where this (simple) field path occurs at least once\n\n#### path_occurrence\n\n* the occurrence count of the simple field path or its structure variant in the line-delimited JSON records\n\nnote: if **multiple_paths** is `True` (at the simple field path) and/or **multiple_values** is `True` (at a related structure variant field path that ends with a *JSON array*), then it's an indicator that a mapping from such a field path could produce multiple values in the output\n\n### Structure Field Path Notation\n\n* after each field name in a structure field path is a notation of the object type of the value of this field (/key)\n* these object types include all possible JSON object types, which are\n\n   |Object Type Notation|JSON Object Type|\n   |--------------------|----------------|\n   | {} | JSON object |\n   | [] | JSON array |\n   | (Integer) | integer number |\n   | (String) | string value |\n   | (Decimal) | 
decimal number |\n   | (Boolean) | boolean value |\n   | (no object type) | `null` value, i.e., no value |\n\n* the notation of the *JSON array* object types is `[]` + `[ARRAY VALUE OBJECT TYPE]`, i.e. `[ARRAY VALUE OBJECT TYPE]` can be one of `{}`, `(Integer)`, `(String)`, `(Decimal)`, `(Boolean)` or `(no object type)`, e.g. `[] {}` or `[] (String)`\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fslub%2Fldjstructurestats","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fslub%2Fldjstructurestats","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fslub%2Fldjstructurestats/lists"}