{"id":15046125,"url":"https://github.com/waveform80/structa","last_synced_at":"2025-09-06T13:31:56.209Z","repository":{"id":42537213,"uuid":"162118874","full_name":"waveform80/structa","owner":"waveform80","description":"A small utility for analyzing data structures (e.g. JSON files)","archived":false,"fork":false,"pushed_at":"2023-05-05T08:32:28.000Z","size":507,"stargazers_count":4,"open_issues_count":8,"forks_count":1,"subscribers_count":3,"default_branch":"master","last_synced_at":"2024-12-17T20:42:14.572Z","etag":null,"topics":["csv","data-analysis","data-visualization","datajournalism","datawrangling","json","yaml"],"latest_commit_sha":null,"homepage":"https://structa.readthedocs.io/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/waveform80.png","metadata":{"files":{"readme":"README.rst","changelog":null,"contributing":null,"funding":null,"license":"LICENSE.txt","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2018-12-17T11:15:49.000Z","updated_at":"2023-05-05T09:06:51.000Z","dependencies_parsed_at":"2022-09-10T04:33:35.344Z","dependency_job_id":null,"html_url":"https://github.com/waveform80/structa","commit_stats":null,"previous_names":[],"tags_count":3,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/waveform80%2Fstructa","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/waveform80%2Fstructa/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/waveform80%2Fstructa/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/waveform80%2Fstructa/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/waveform80","download_url":"https://codeload.gi
thub.com/waveform80/structa/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":232125992,"owners_count":18476189,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["csv","data-analysis","data-visualization","datajournalism","datawrangling","json","yaml"],"created_at":"2024-09-24T20:52:44.482Z","updated_at":"2025-01-01T20:39:51.358Z","avatar_url":"https://github.com/waveform80.png","language":"Python","readme":"=======\nstructa\n=======\n\nstructa is a small, semi-magical utility for discerning the \"overall structure\"\nof large data files. Typically this is something like a document-oriented\ndatabase in JSON format, or a CSV file of a database dump, or a YAML document.\n\n\nUsage\n=====\n\nUse from the command line::\n\n    structa \u003cfilename\u003e\n\nThe usual ``--help`` and ``--version`` switches are available for more\ninformation. The full `documentation`_ may also help in understanding the\nmyriad switches!\n\n\nExamples\n========\n\nThe `People in Space API`_ shows the number of people currently in space, along\nwith their names and the craft each is aboard::\n\n    curl -s http://api.open-notify.org/astros.json | structa\n\nOutput::\n\n    {\n        'message': str range=\"success\" pattern=\"success\",\n        'number': int range=10,\n        'people': [\n            {\n                'craft': str range=\"ISS\"..\"Tiangong\",\n                'name': str range=\"Akihiko Hoshide\"..\"Thomas Pesquet\"\n            }\n        ]\n    }\n\n\nThe `Python Package Index`_ (PyPI) provides a JSON API for packages. 
You can\nfeed the JSON of several packages to ``structa`` to get an idea of the overall\nstructure of these records (when structa is given multiple inputs on the same\ninvocation, it assumes all have a common source)::\n\n    for pkg in numpy scipy pandas matplotlib structa; do\n        curl -s https://pypi.org/pypi/$pkg/json \u003e $pkg.json\n    done\n    structa numpy.json scipy.json pandas.json matplotlib.json structa.json\n\nOutput::\n\n    {\n        'info': { str: value },\n        'last_serial': int range=11.9M..13.1M,\n        'releases': {\n            str range=\"0.1\"..\"3.5.1\": [\n                {\n                    'comment_text': str,\n                    'digests': {\n                        'md5': str pattern=\"xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx\",\n                        'sha256': str pattern=\"xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx\"\n                    },\n                    'downloads': int range=-1,\n                    'filename': str,\n                    'has_sig': bool,\n                    'md5_digest': str pattern=\"xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx\",\n                    'packagetype': str range=\"bdist_wheel\"..\"sdist\",\n                    'python_version': str range=\"2.4\"..\"source\",\n                    'requires_python': value,\n                    'size': int range=39.3K..118.4M,\n                    'upload_time': str of timestamp range=2006-01-09 14:02:01..2022-03-10 16:45:20 pattern=\"%Y-%m-%dT%H:%M:%S\",\n                    'upload_time_iso_8601': str of timestamp range=2009-04-06 06:19:25..2022-03-10 16:45:20 pattern=\"%Y-%m-%dT%H:%M:%S.%f%z\",\n                    'url': URL,\n                    'yanked': bool,\n                    'yanked_reason': value\n                }\n            ]\n        },\n        'urls': [\n            {\n                'comment_text': str range=\"\",\n                'digests': {\n                    'md5': str 
pattern=\"xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx\",\n                    'sha256': str pattern=\"xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx\"\n                },\n                'downloads': int range=-1,\n                'filename': str,\n                'has_sig': bool,\n                'md5_digest': str pattern=\"xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx\",\n                'packagetype': str range=\"bdist_wheel\"..\"sdist\",\n                'python_version': str range=\"cp310\"..\"source\",\n                'requires_python': value,\n                'size': int range=47.2K..55.6M,\n                'upload_time': str of timestamp range=2021-10-27 23:57:01..2022-03-10 16:45:20 pattern=\"%Y-%m-%dT%H:%M:%S\",\n                'upload_time_iso_8601': str of timestamp range=2021-10-27 23:57:01..2022-03-10 16:45:20 pattern=\"%Y-%m-%dT%H:%M:%S.%f%z\",\n                'url': URL,\n                'yanked': bool,\n                'yanked_reason': value\n            }\n        ],\n        'vulnerabilities': [ empty ]\n    }\n\n\nThe `Ubuntu Security Notices`_ database contains the list of all security\nissues in releases of Ubuntu (warning, this one takes some time to analyze and\neats about a gigabyte of RAM while doing so)::\n\n    curl -s https://usn.ubuntu.com/usn-db/database.json | structa\n\nOutput::\n\n    {\n        str range=\"1430-1\"..\"4630-1\" pattern=\"dddd-d\": {\n            'action'?: str,\n            'cves': [ str ],\n            'description': str,\n            'id': str range=\"1430-1\"..\"4630-1\" pattern=\"dddd-d\",\n            'isummary'?: str,\n            'releases': {\n                str range=\"artful\"..\"zesty\": {\n                    'allbinaries'?: {\n                        str: { 'version': str }\n                    },\n                    'archs'?: {\n                        str range=\"all\"..\"source\": {\n                            'urls': {\n                                URL: {\n                                    
'md5': str pattern=\"xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx\",\n                                    'size': int range=20..1.2G\n                                }\n                            }\n                        }\n                    },\n                    'binaries': {\n                        str: { 'version': str }\n                    },\n                    'sources': {\n                        str: {\n                            'description': str,\n                            'version': str\n                        }\n                    }\n                }\n            },\n            'summary': str,\n            'timestamp': float of timestamp range=2012-04-27 12:57:41..2020-11-11 18:01:48,\n            'title': str\n        }\n    }\n\n.. _documentation: https://structa.readthedocs.io/\n.. _People in Space API: http://open-notify.org/Open-Notify-API/People-In-Space/\n.. _Python Package Index: https://pypi.org/\n.. _Ubuntu Security Notices: https://usn.ubuntu.com/usn-db/database.json\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fwaveform80%2Fstructa","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fwaveform80%2Fstructa","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fwaveform80%2Fstructa/lists"}