{"id":16722201,"url":"https://github.com/tkluck/pandas-nesteddata","last_synced_at":"2025-03-15T13:21:20.839Z","repository":{"id":147863848,"uuid":"78058466","full_name":"tkluck/pandas-nesteddata","owner":"tkluck","description":"Transform hierarchical data (nested arrays/hashes) to a pandas DataFrame according to a compact, readable, user-specified pattern","archived":false,"fork":false,"pushed_at":"2017-01-07T21:32:37.000Z","size":26,"stargazers_count":4,"open_issues_count":0,"forks_count":0,"subscribers_count":3,"default_branch":"master","last_synced_at":"2025-01-22T03:32:58.944Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/tkluck.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2017-01-04T22:33:29.000Z","updated_at":"2017-01-17T10:05:08.000Z","dependencies_parsed_at":"2023-05-27T18:00:18.271Z","dependency_job_id":null,"html_url":"https://github.com/tkluck/pandas-nesteddata","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tkluck%2Fpandas-nesteddata","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tkluck%2Fpandas-nesteddata/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tkluck%2Fpandas-nesteddata/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tkluck%2Fpandas-nesteddata/manifests","owner_url":"https://repos.ecosyste.ms/api/v
1/hosts/GitHub/owners/tkluck","download_url":"https://codeload.github.com/tkluck/pandas-nesteddata/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":243733335,"owners_count":20339026,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-10-12T22:33:59.181Z","updated_at":"2025-03-15T13:21:20.834Z","avatar_url":"https://github.com/tkluck.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"pandas-nesteddata version 0.1\n=============================\n\nThis module transforms hierarchical data (nested arrays/hashes) to\na pandas DataFrame according to a compact, readable, user-specified pattern.\n\nFor example, the pattern `.\u003cindex\u003e.*` transforms a data structure\nof the form\n\n    \u003e\u003e\u003e data = [{ 'a': 1, 'b': 2 }, { 'a': 3, 'b': 4 }]\n\nto the DataFrame\n\n           a  b\n    index\n    0      1  2\n    1      3  4\n\nOr, in code:\n\n    \u003e\u003e\u003e from nesteddata import to_dataframe\n    \u003e\u003e\u003e to_dataframe('.\u003cindex\u003e.*', data)\n           a  b\n    index      \n    0      1  2\n    1      3  4\n\nThe pattern `.*.*` applied to the same data gives the output\n\n    \u003e\u003e\u003e to_dataframe('.*.*', data)\n       0_a  0_b  1_a  1_b\n    0    1    2    3    4\n\nThe pattern `.*.\u003ckey\u003e` gives the output\n\n    \u003e\u003e\u003e to_dataframe('.*.\u003ckey\u003e', data)\n         0  1\n    key      \n    a    1  3\n    b    2  4\n\nIt is hoped that the pattern specification is sufficiently powerful 
for this\nmodule to replace a lot of simple boiler-plate data transformations.\n\nPATTERN SPECIFICATION\n---------------------\n\nThe dot-separated components represent the following:\n\n- `\u003cname\u003e` represents that the keys at that position should be put in a column\n  named name in the csv output. The values belonging to those keys become rows;\n- `*` represents that the keys at that position in the pattern should be\n  interpreted as column names; their values should be the values for that\n  column, all belonging to the same row;\n- `{column_name}` or `{column_name_1,column_name_2,...}` is similar to `*`, but\n  instead of capturing all the keys at that level of the hierarchy, it only\n  captures the named columns.\n- `[\u003cnumber\u003e]` represents a numerical literal key, for indexing arrays or\n  dictionaries with keys of type `int`.\n- anything else represents a literal key name.\n- If your pattern does not contain `*` or `{...}`, you need to pass an\n  additional `column_name=` parameter to `to_dataframe` to specify the name\n  for the single column where the value will go.\n\nFor the purposes of this description, an array should be seen as a collection\nof index =\u003e value pairs.\n\nIt is possible to specify several dot-separated paths in a single pattern,\nseparated by spaces. In that case, all the paths need to have the same primary\nkey (that is, the same set of names in `\u003c...\u003e`). Rows will be formed by joining\nthe columns resulting from the different paths.\n\nESCAPING SPECIAL CHARACTERS\n---------------------------\n\nThe characters `\u003c\u003e{}*[].` have a special meaning and as such, cannot be part\nof a literal key. More precisely, if they are in such a position that they can\nbe interpreted with their special meaning, this takes precedence.\n\nAllowing a way to escape these special characters will be part of a future\nrelease. 
For now, look at 'Building the pattern from data structures' below.\n\nBUILDING THE PATTERN FROM DATA STRUCTURES\n-----------------------------------------\n\nAs an alternative to passing the pattern as a string that needs to be parsed,\nit is also possible to pass the pattern as a data structure. For example, the\npattern\n\n    .*.\u003ckey\u003e\n\ncan also be represented as\n\n    \u003e\u003e\u003e from nesteddata import Glob, Index\n    \u003e\u003e\u003e pattern = Glob() + Index('key')\n    \u003e\u003e\u003e pattern\n    Glob() + Index('key')\n    \u003e\u003e\u003e pattern.to_dataframe(data)\n         0  1\n    key      \n    a    1  3\n    b    2  4\n\nThe constructor functions are:\n\n- `Index(name)` (corresponds to `\u003cname\u003e`)\n- `Glob()` (corresponds to `*`)\n- `Columns(*column_names)` (corresponds to `{column_name_1,..,column_name_n}`)\n- `Literal(key)` (corresponds to a literal string key or a `[\u003cnumber\u003e]` integer key)\n- `Join(*chunks)` (corresponds to space-separated pattern chunks)\n\n\nINSTALLATION\n------------\n\nTo install this module type the following:\n\n    python setup.py build\n    sudo python setup.py install\n\nDEPENDENCIES\n------------\n\nThis module requires these other modules and libraries:\n\n    pandas\n\nCOPYRIGHT AND LICENCE\n---------------------\n\nCopyright (C) 2017 by Timo Kluck\n\nThis library is free software; you can redistribute it and/or modify\nit under the terms of the General Public License, version 3 or later.\nA copy of this license can be found in LICENSE.\n\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftkluck%2Fpandas-nesteddata","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ftkluck%2Fpandas-nesteddata","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftkluck%2Fpandas-nesteddata/lists"}