{"id":15690683,"url":"https://github.com/cmungall/json-flattener","last_synced_at":"2025-07-20T16:03:26.385Z","repository":{"id":43497488,"uuid":"390561406","full_name":"cmungall/json-flattener","owner":"cmungall","description":"Python library for denormalizing nested dicts or json objects to tables and back","archived":false,"fork":false,"pushed_at":"2024-03-21T03:02:31.000Z","size":101,"stargazers_count":10,"open_issues_count":3,"forks_count":4,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-07-11T06:49:57.506Z","etag":null,"topics":["dataframes","denormalization","json","linkml","pandas","yaml"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"bsd-3-clause","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/cmungall.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2021-07-29T01:31:08.000Z","updated_at":"2025-05-28T08:30:37.000Z","dependencies_parsed_at":"2024-06-21T02:12:30.047Z","dependency_job_id":"46f43454-6586-49df-ae38-2f2ba9f54452","html_url":"https://github.com/cmungall/json-flattener","commit_stats":{"total_commits":36,"total_committers":2,"mean_commits":18.0,"dds":0.02777777777777779,"last_synced_commit":"2b4f40d742877c7668d98d6b750b70734c0ebf3e"},"previous_names":[],"tags_count":10,"template":false,"template_full_name":null,"purl":"pkg:github/cmungall/json-flattener","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cmungall%2Fjson-flattener","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cmungall%2Fjson-flattener/tags","releases_url":"https://rep
os.ecosyste.ms/api/v1/hosts/GitHub/repositories/cmungall%2Fjson-flattener/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cmungall%2Fjson-flattener/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/cmungall","download_url":"https://codeload.github.com/cmungall/json-flattener/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cmungall%2Fjson-flattener/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":266152253,"owners_count":23884473,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["dataframes","denormalization","json","linkml","pandas","yaml"],"created_at":"2024-10-03T18:13:55.591Z","updated_at":"2025-07-20T16:03:26.357Z","avatar_url":"https://github.com/cmungall.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# json-flattener\n\nPython library for denormalizing/flattening lists of complex objects to tables/data frames, with roundtripping\n\n## Notebook Example\n\n[EXAMPLE.ipynb](https://github.com/cmungall/json-flattener/blob/main/EXAMPLE.ipynb)\n\n## Description\n\nGiven YAML/JSON/JSON-Lines such as:\n\n```yaml\n- id: S001\n  name: Lord of the Rings\n  genres:\n    - fantasy\n  creator:\n    name: JRR Tolkein\n    from_country: England\n  books:\n    - id: S001.1\n      name: Fellowship of the Ring\n      price: 5.99\n      summary: Hobbits\n    - id: S001.2\n      name: The Two Towers\n      price: 5.99\n      summary: More hobbits\n    - id: S001.3\n      name: Return of the 
King\n      price: 6.99\n      summary: Yet more hobbits\n- id: S002\n  name: The Culture Series\n  genres:\n    - scifi\n  creator:\n    name: Ian M Banks\n    from_country: Scotland\n  books:\n    - id: S002.1\n      name: Consider Phlebas\n      price: 5.99\n    - id: S002.2\n      name: Player of Games\n      price: 5.99\n```\n\nDenormalize using the `jfl` command:\n\n```bash\njfl flatten -C creator=flat -C books=multivalued -i examples/books1.yaml -o examples/books1-flattened.tsv\n```\n\n|id|name|genres|creator_name|creator_from_country|books_name|books_summary|books_price|books_id|creator_genres|\n|---|---|---|---|---|---|---|---|---|---|\n|S001|Lord of the Rings|[fantasy]|JRR Tolkein|England|[Fellowship of the Ring\\|The Two Towers\\|Return of the King]|[Hobbits\\|More hobbits\\|Yet more hobbits]|[5.99\\|5.99\\|6.99]|[S001.1\\|S001.2\\|S001.3]||\n|S002|The Culture Series|[scifi]|Ian M Banks|Scotland|[Consider Phlebas\\|Player of Games]||[5.99\\|5.99]|[S002.1\\|S002.2]||\n\n\nTo convert back to JSON/YAML, first cache the generated mappings by passing `-O` when flattening:\n\n```bash\njfl flatten -C creator=flat -C books=multivalued -i examples/books1.yaml -O examples/conf.yaml -o examples/books1-flattened.tsv\n```\n\nThen pass the cached mappings to `unflatten` with `-c`:\n\n```bash\njfl unflatten -C creator=flat -C books=multivalued -i examples/books1.tsv -c examples/conf.yaml -o examples/books1.yaml\n```\n\nComplex fields can also be serialized directly as JSON or YAML (by default, `_json` is appended to the key). 
For example:\n\n```bash\njfl flatten -C creator=json -C books=json -i examples/books1.yaml -o examples/books1-jsonified.tsv\n```\n\n|id|name|genres|creator_json|books_json|\n|---|---|---|---|---|\n|S001|Lord of the Rings|[fantasy]|{\\\"name\\\": \\\"JRR Tolkein\\\", \\\"from_country\\\": \\\"England\\\"}|[{\\\"id\\\": \\\"S001.1\\\", \\\"name\\\": \\\"Fellowship of the Ring\\\", \\\"summary\\\": \\\"Hobbits\\\", \\\"price\\\": 5.99}, {\\\"id\\\": \\\"S001.2\\\", \\\"name\\\": \\\"The Two Towers\\\", \\\"summary\\\": \\\"More hobbits\\\", \\\"price\\\": 5.99}, {\\\"id\\\": \\\"S001.3\\\", \\\"name\\\": \\\"Return of the King\\\", \\\"summary\\\": \\\"Yet more hobbits\\\", \\\"price\\\": 6.99}]|\n|S002|The Culture Series|[scifi]|{\\\"name\\\": \\\"Ian M Banks\\\", \\\"from_country\\\": \\\"Scotland\\\"}|[{\\\"id\\\": \\\"S002.1\\\", \\\"name\\\": \\\"Consider Phlebas\\\", \\\"price\\\": 5.99}, {\\\"id\\\": \\\"S002.2\\\", \\\"name\\\": \\\"Player of Games\\\", \\\"price\\\": 5.99}]|\n|S003|Book of the New Sun|[scifi, fantasy]|{\\\"name\\\": \\\"Gene Wolfe\\\", \\\"genres\\\": [\\\"scifi\\\", \\\"fantasy\\\"], \\\"from_country\\\": \\\"USA\\\"}|[{\\\"id\\\": \\\"S003.1\\\", \\\"name\\\": \\\"Shadow of the Torturer\\\"}, {\\\"id\\\": \\\"S003.2\\\", \\\"name\\\": \\\"Claw of the Conciliator\\\", \\\"price\\\": 6.99}]|\n|S004|Example with single book||{\\\"name\\\": \\\"Ms Writer\\\", \\\"genres\\\": [\\\"romance\\\"], \\\"from_country\\\": \\\"USA\\\"}|[{\\\"id\\\": \\\"S004.1\\\", \\\"name\\\": \\\"Blah\\\"}]|\n|S005|Example with no books||{\\\"name\\\": \\\"Mr Unproductive\\\", \\\"genres\\\": [\\\"romance\\\", \\\"scifi\\\", \\\"fantasy\\\"], \\\"from_country\\\": \\\"USA\\\"}||\n\n\nSee\n\n\u003ciframe src=\"https://docs.google.com/presentation/d/e/2PACX-1vRyM06peU9BkrZbXJazuMlajw5s4Vbj5f0t0TE4hj_X9Ex_EASLSUZuaWUxYIhWbOC6CtPRtxrTGWQD/embed?start=false\u0026loop=false\u0026delayms=60000\" frameborder=\"0\" width=\"960\" height=\"569\" allowfullscreen=\"true\" 
mozallowfullscreen=\"true\" webkitallowfullscreen=\"true\"\u003e\u003c/iframe\u003e\n\nThe primary use case is to go from a rich *normalized* data model (as Python objects, JSON, or YAML) to a flatter representation that is amenable to processing with:\n\n * Solr/Lucene\n * Pandas/R data frames\n * Excel/Google Sheets\n * Unix cut/grep/cat/etc\n * Simple denormalized SQL database representations\n\nThe target denormalized format is a list of rows (a data matrix), where each cell is either an atom or a list of atoms.\n\n## Usage from Python\n\n```python\n# Import path assumed from the package name; adjust if your installed version differs.\nfrom json_flattener import flatten, KeyConfig, GlobalConfig\n\nobj = {\n    \"id\": \"A1\",\n    \"subject\": {\"id\": \"G1\", \"name\": \"gene1\", \"category\": \"gene\"},\n    \"object\": {\"id\": \"T1\", \"name\": \"term1\", \"category\": \"term\"},\n    \"publications\": [\"PMID1\", \"PMID2\"],\n    \"closure\": [\n        {\"id\": \"X1\", \"name\": \"x1\"},\n        {\"id\": \"X2\", \"name\": \"x2\"},\n        {\"id\": \"X3\", \"name\": \"x3\"},\n    ],\n}\n# Per-key settings: serialize `subject` as YAML, flatten `object`,\n# and flatten the list-valued `closure` key.\nkconfig = {\n    \"subject\": KeyConfig(delete=True, serializers=\"yaml\"),\n    \"object\": KeyConfig(delete=True, flatten=True),\n    \"closure\": KeyConfig(delete=True, is_list=True, flatten=True),\n}\nconfig = GlobalConfig(key_configs=kconfig)\nobjs = [obj]  # flatten is called on a list of objects\nflattened_objs = flatten(objs, config)\n```\n\n## Method\n\n * Each top-level key becomes a column\n * If the key value is a dict/object, then flatten it\n     * by default a '_' is used to separate the parent key from the inner key\n     * e.g. the composition of `creator` and `from_country` becomes `creator_from_country`\n     * currently one level of flattening is supported\n * If the key value is a list of atomic entities, then leave it as is\n * If the key value is a list of dicts/objects, then flatten each key of this inner dict into a list\n     * e.g. 
if `books` is a list of book objects, and `name` is a key on book, then `books_name` is a list of the names of each book\n     * order is significant: the first element of `books_name` is matched to the first element of `books_price`, etc.\n * Any key can be serialized as YAML/JSON/pickle if configured\n\n## Comparison\n\n### Pandas json_normalize\n\n - https://pandas.pydata.org/pandas-docs/version/0.25.0/reference/api/pandas.io.json.json_normalize.html\n\n### Java json-flattener\n\n - https://github.com/wnameless/json-flattener\n\n### Python\n\n### csvjson\n\n - https://csvjson.com/json2csv\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcmungall%2Fjson-flattener","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fcmungall%2Fjson-flattener","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcmungall%2Fjson-flattener/lists"}