{"id":20757820,"url":"https://github.com/amdmi3/jsonslicer","last_synced_at":"2025-10-11T11:32:17.128Z","repository":{"id":55064491,"uuid":"165882846","full_name":"AMDmi3/jsonslicer","owner":"AMDmi3","description":"Stream JSON parser for Python","archived":false,"fork":false,"pushed_at":"2022-10-25T16:59:09.000Z","size":146,"stargazers_count":48,"open_issues_count":13,"forks_count":8,"subscribers_count":6,"default_branch":"master","last_synced_at":"2025-09-11T01:53:35.744Z","etag":null,"topics":["json","parser","stream","yajl"],"latest_commit_sha":null,"homepage":"https://pypi.org/project/jsonslicer/","language":"C++","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/AMDmi3.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2019-01-15T16:14:51.000Z","updated_at":"2025-03-10T02:12:34.000Z","dependencies_parsed_at":"2022-08-14T10:50:19.186Z","dependency_job_id":null,"html_url":"https://github.com/AMDmi3/jsonslicer","commit_stats":null,"previous_names":[],"tags_count":9,"template":false,"template_full_name":null,"purl":"pkg:github/AMDmi3/jsonslicer","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/AMDmi3%2Fjsonslicer","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/AMDmi3%2Fjsonslicer/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/AMDmi3%2Fjsonslicer/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/AMDmi3%2Fjsonslicer/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/AMDmi3","download_url":"https://codeload.github.com/AMDmi3/jsonslicer/tar.gz/refs/heads/master","sbom_url":"h
ttps://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/AMDmi3%2Fjsonslicer/sbom","scorecard":{"id":7070,"data":{"date":"2025-08-11","repo":{"name":"github.com/AMDmi3/jsonslicer","commit":"2d72bf2fc52a210123e6145fed5a1bcf2ce6300f"},"scorecard":{"version":"v5.2.1-40-gf6ed084d","commit":"f6ed084d17c9236477efd66e5b258b9d4cc7b389"},"score":3.4,"checks":[{"name":"Maintained","score":0,"reason":"0 commit(s) and 0 issue activity found in the last 90 days -- score normalized to 0","details":null,"documentation":{"short":"Determines if the project is \"actively maintained\".","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#maintained"}},{"name":"Dangerous-Workflow","score":10,"reason":"no dangerous workflow patterns detected","details":null,"documentation":{"short":"Determines if the project's GitHub Action workflows avoid dangerous patterns.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#dangerous-workflow"}},{"name":"Code-Review","score":0,"reason":"Found 0/30 approved changesets -- score normalized to 0","details":null,"documentation":{"short":"Determines if the project requires human code review before pull requests (aka merge requests) are merged.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#code-review"}},{"name":"Token-Permissions","score":0,"reason":"detected GitHub workflow tokens with excessive permissions","details":["Warn: no topLevel permission defined: .github/workflows/ci.yml:1","Info: no jobLevel write permissions found"],"documentation":{"short":"Determines if the project's workflows follow the principle of least privilege.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#token-permissions"}},{"name":"Packaging","score":-1,"reason":"packaging workflow not detected","details":["Warn: no GitHub/GitLab publishing workflow 
detected."],"documentation":{"short":"Determines if the project is published as a package that others can easily download, install, easily update, and uninstall.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#packaging"}},{"name":"Binary-Artifacts","score":10,"reason":"no binaries found in the repo","details":null,"documentation":{"short":"Determines if the project has generated executable (binary) artifacts in the source repository.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#binary-artifacts"}},{"name":"SAST","score":0,"reason":"no SAST tool detected","details":["Warn: no pull requests merged into dev branch"],"documentation":{"short":"Determines if the project uses static code analysis.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#sast"}},{"name":"Pinned-Dependencies","score":0,"reason":"dependency not pinned by hash detected -- score normalized to 0","details":["Warn: GitHub-owned GitHubAction not pinned by hash: .github/workflows/ci.yml:29: update your workflow using https://app.stepsecurity.io/secureworkflow/AMDmi3/jsonslicer/ci.yml/master?enable=pin","Warn: GitHub-owned GitHubAction not pinned by hash: .github/workflows/ci.yml:34: update your workflow using https://app.stepsecurity.io/secureworkflow/AMDmi3/jsonslicer/ci.yml/master?enable=pin","Warn: third-party GitHubAction not pinned by hash: .github/workflows/ci.yml:122: update your workflow using https://app.stepsecurity.io/secureworkflow/AMDmi3/jsonslicer/ci.yml/master?enable=pin","Warn: pipCommand not pinned by hash: .github/workflows/ci.yml:56","Warn: pipCommand not pinned by hash: .github/workflows/ci.yml:62","Warn: pipCommand not pinned by hash: .github/workflows/ci.yml:63","Warn: pipCommand not pinned by hash: .github/workflows/ci.yml:64","Info:   0 out of   2 GitHub-owned GitHubAction dependencies pinned","Info:   0 out of   1 
third-party GitHubAction dependencies pinned","Info:   0 out of   4 pipCommand dependencies pinned"],"documentation":{"short":"Determines if the project has declared and pinned the dependencies of its build process.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#pinned-dependencies"}},{"name":"CII-Best-Practices","score":0,"reason":"no effort to earn an OpenSSF best practices badge detected","details":null,"documentation":{"short":"Determines if the project has an OpenSSF (formerly CII) Best Practices Badge.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#cii-best-practices"}},{"name":"Security-Policy","score":0,"reason":"security policy file not detected","details":["Warn: no security policy file detected","Warn: no security file to analyze","Warn: no security file to analyze","Warn: no security file to analyze"],"documentation":{"short":"Determines if the project has published a security policy.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#security-policy"}},{"name":"Fuzzing","score":0,"reason":"project is not fuzzed","details":["Warn: no fuzzer integrations found"],"documentation":{"short":"Determines if the project uses fuzzing.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#fuzzing"}},{"name":"Vulnerabilities","score":10,"reason":"0 existing vulnerabilities detected","details":null,"documentation":{"short":"Determines if the project has open, known unfixed vulnerabilities.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#vulnerabilities"}},{"name":"License","score":10,"reason":"license file detected","details":["Info: project has a license file: LICENSE:0","Info: FSF or OSI recognized license: MIT License: LICENSE:0"],"documentation":{"short":"Determines if the project has defined a 
license.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#license"}},{"name":"Branch-Protection","score":0,"reason":"branch protection not enabled on development/release branches","details":["Warn: branch protection not enabled for branch 'master'"],"documentation":{"short":"Determines if the default and release branches are protected with GitHub's branch protection settings.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#branch-protection"}},{"name":"Signed-Releases","score":-1,"reason":"no releases found","details":null,"documentation":{"short":"Determines if the project cryptographically signs release artifacts.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#signed-releases"}}]},"last_synced_at":"2025-08-14T13:47:30.685Z","repository_id":55064491,"created_at":"2025-08-14T13:47:30.685Z","updated_at":"2025-08-14T13:47:30.685Z"},"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":279007031,"owners_count":26084227,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-10-11T02:00:06.511Z","response_time":55,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["json","parser","stream","yajl"],"created_at":"2024-11-17T09:45:26.133Z","updated_at":"2025-10-11T11:32:17.069Z","avatar_url":"https://github.com/AMDmi3.png"
,"language":"C++","funding_links":[],"categories":[],"sub_categories":[],"readme":"# jsonslicer - stream JSON parser\n\n\u003ca href=\"https://repology.org/metapackage/python:jsonslicer/versions\"\u003e\n\t\u003cimg src=\"https://repology.org/badge/vertical-allrepos/python:jsonslicer.svg\" alt=\"jsonslicer packaging status\" align=\"right\"\u003e\n\u003c/a\u003e\n\n[![CI](https://github.com/AMDmi3/jsonslicer/actions/workflows/ci.yml/badge.svg)](https://github.com/AMDmi3/jsonslicer/actions/workflows/ci.yml)\n[![codecov](https://codecov.io/gh/AMDmi3/jsonslicer/branch/master/graph/badge.svg?token=LUBcpfIgCr)](https://codecov.io/gh/AMDmi3/jsonslicer)\n[![PyPI downloads](https://img.shields.io/pypi/dm/jsonslicer.svg)](https://pypi.org/project/jsonslicer/)\n[![PyPI version](https://img.shields.io/pypi/v/jsonslicer.svg)](https://pypi.org/project/jsonslicer/)\n[![PyPI pythons](https://img.shields.io/pypi/pyversions/jsonslicer.svg)](https://pypi.org/project/jsonslicer/)\n[![Github commits (since latest release)](https://img.shields.io/github/commits-since/AMDmi3/jsonslicer/latest.svg)](https://github.com/AMDmi3/jsonslicer)\n\n## Overview\n\nJsonSlicer performs a **stream** or **iterative**, **pull** JSON\nparsing, which means it **does not load** whole JSON into memory\nand is able to parse **very large** JSON files or streams.  
The\nmodule is written in C and uses [YAJL](https://lloyd.github.io/yajl/)\nJSON parsing library, so it's also quite **fast**.\n\nJsonSlicer takes a **path** of JSON map keys or array indexes, and\nprovides **iterator interface** which yields JSON data matching\ngiven path as complete Python objects.\n\n## Example\n\n```json\n{\n    \"friends\": [\n        {\"name\": \"John\", \"age\": 31},\n        {\"name\": \"Ivan\", \"age\": 26}\n    ],\n    \"colleagues\": {\n        \"manager\": {\"name\": \"Jack\", \"age\": 33},\n        \"subordinate\": {\"name\": \"Lucy\", \"age\": 21}\n    }\n}\n```\n\n```python\nfrom jsonslicer import JsonSlicer\n\n# Extract specific elements:\nwith open('people.json') as data:\n    ivans_age = next(JsonSlicer(data, ('friends', 1, 'age')))\n    # 26\n\nwith open('people.json') as data:\n    managers_name = next(JsonSlicer(data, ('colleagues', 'manager', 'name')))\n    # 'Jack'\n\n# Iterate over collection(s) by using wildcards in the path:\nwith open('people.json') as data:\n    for person in JsonSlicer(data, ('friends', None)):\n        print(person)\n        # {'name': 'John', 'age': 31}\n        # {'name': 'Ivan', 'age': 26}\n\n# Iteration over both arrays and dicts is possible, even at the same time\nwith open('people.json') as data:\n    for person in JsonSlicer(data, (None, None)):\n        print(person)\n        # {'name': 'John', 'age': 31}\n        # {'name': 'Ivan', 'age': 26}\n        # {'name': 'Jack', 'age': 33}\n        # {'name': 'Lucy', 'age': 21}\n\n# Map key of returned objects is available on demand...\nwith open('people.json') as data:\n    for position, person in JsonSlicer(data, ('colleagues', None), path_mode='map_keys'):\n        print(position, person)\n        # 'manager' {'name': 'Jack', 'age': 33}\n        # 'subordinate' {'name': 'Lucy', 'age': 21}\n\n# ...as well as complete path information\nwith open('people.json') as data:\n    for *path, person in JsonSlicer(data, (None, None), path_mode='full'):\n       
 print(path, person)\n        # ('friends', 0) {'name': 'John', 'age': 31}\n        # ('friends', 1) {'name': 'Ivan', 'age': 26}\n        # ('colleagues', 'manager') {'name': 'Jack', 'age': 33}\n        # ('colleagues', 'subordinate') {'name': 'Lucy', 'age': 21}\n\n# Extract all instances of a deeply nested field\nwith open('people.json') as data:\n    age_sum = sum(JsonSlicer(data, (None, None, 'age')))\n    # 111\n```\n\n## API\n\n```\njsonslicer.JsonSlicer(\n    file,\n    path_prefix,\n    read_size=1024,\n    path_mode=None,\n    yajl_allow_comments=False,\n    yajl_dont_validate_strings=False,\n    yajl_allow_trailing_garbage=False,\n    yajl_allow_multiple_values=False,\n    yajl_allow_partial_values=False,\n    yajl_verbose_errors=True,\n    encoding=None,\n    errors=None,\n    binary=False,\n)\n```\n\nConstructs an iterative JSON parser which reads JSON data from _file_.\n\n_file_ is a `.read()`-supporting [file-like\nobject](https://docs.python.org/3/glossary.html#term-file-like-object)\ncontaining a JSON document. Both binary and text files are supported,\nbut binary ones are preferred, because the parser has to operate on\nbinary data internally anyway, and using text input would require an\nunnecessary encoding/decoding step, which adds ~3% performance overhead.\nNote that JsonSlicer supports both unicode and binary output regardless\nof input format.\n\n_path_prefix_ is an iterable (usually a list or a tuple) specifying\na path or a path pattern of objects which the parser should extract\nfrom JSON.\n\nFor instance, in the example above a path `('friends', 0, 'name')`\nwill yield string `'John'`, by descending from the root element\ninto the dictionary element by key `'friends'`, then into the array\nelement by index `0`, then into the dictionary element by key\n`'name'`. 
Note that integers only match array indexes and strings\nonly match dictionary keys.\n\nThe path can be turned into a pattern by specifying `None` as a\nplaceholder in some path positions. For instance, `(None, None,\n'name')` will yield all four names from the example above, because\nit matches an item under the 'name' key on the second nesting level of\nany array or map structure.\n\nBoth strings and byte objects are allowed in path, regardless of\ninput and output encodings; they are automatically converted\nto the format used internally.\n\n_read_size_ is the size of the block read by the parser at a time.\n\n_path_mode_ is a string which specifies how the parser should\nreturn path information along with objects. The following modes are\nsupported:\n\n* _'ignore'_ (the default) - do not output any path information, just\nobjects as is.\n\n  ```python\n  {'name': 'John', 'age': 31}\n  {'name': 'Ivan', 'age': 26}\n  {'name': 'Jack', 'age': 33}\n  {'name': 'Lucy', 'age': 21}\n  ```\n\n  Common usage pattern for this mode is\n\n  ```python\n  for object in JsonSlicer(...)\n  ```\n\n* _'map_keys'_ - output objects as is when traversing arrays, and tuples\nconsisting of map key and object when traversing maps.\n\n  ```python\n  {'name': 'John', 'age': 31}\n  {'name': 'Ivan', 'age': 26}\n  ('manager', {'name': 'Jack', 'age': 33})\n  ('subordinate', {'name': 'Lucy', 'age': 21})\n  ```\n\n  This format may seem inconsistent (and therefore it's not the default),\n  however in practice only a collection of a single type is iterated at\n  a time and this type is known, so this format is likely the most useful,\n  as in most cases you do need dictionary keys.\n\n  Common usage pattern for this mode is\n\n  ```python\n  for object in JsonSlicer(...)  # when iterating arrays\n  for key, object in JsonSlicer(...)  
# when iterating maps\n  ```\n\n* _'full_paths'_ - output tuples consisting of all path components\n(both map keys and array indexes) and an object as the last element.\n\n  ```python\n  ('friends', 0, {'name': 'John', 'age': 31})\n  ('friends', 1, {'name': 'Ivan', 'age': 26})\n  ('colleagues', 'manager', {'name': 'Jack', 'age': 33})\n  ('colleagues', 'subordinate', {'name': 'Lucy', 'age': 21})\n  ```\n\n  Common usage pattern for this mode is\n\n  ```python\n  for *path, object in JsonSlicer(...)\n  ```\n\n_yajl_allow_comments_ enables corresponding YAJL flag, which is\ndocumented as follows:\n\n\u003e Ignore javascript style comments present in JSON input.  Non-standard,\n\u003e but rather fun\n\n_yajl_dont_validate_strings_ enables corresponding YAJL flag, which\nis documented as follows:\n\n\u003e When set the parser will verify that all strings in JSON input\n\u003e are valid UTF8 and will emit a parse error if this is not so.  When\n\u003e set, this option makes parsing slightly more expensive (~7% depending\n\u003e on processor and compiler in use)\n\n_yajl_allow_trailing_garbage_ enables corresponding YAJL flag, which\nis documented as follows:\n\n\u003e By default, yajl will ensure the entire input text was consumed\n\u003e and will raise an error otherwise.  Enabling this flag will cause\n\u003e yajl to disable this check.  This can be useful when parsing json\n\u003e out of a that contains more than a single JSON document.\n\n_yajl_allow_multiple_values_ enables corresponding YAJL flag, which\nis documented as follows:\n\n\u003e Allow multiple values to be parsed by a single handle.  The entire\n\u003e text must be valid JSON, and values can be seperated by any kind\n\u003e of whitespace.  This flag will change the behavior of the parser,\n\u003e and cause it continue parsing after a value is parsed, rather than\n\u003e transitioning into a complete state.  
This option can be useful\n\u003e when parsing multiple values from an input stream.\n\n_yajl_allow_partial_values_ enables corresponding YAJL flag, which\nis documented as follows:\n\n\u003e When yajl_complete_parse() is called the parser will check that the\n\u003e top level value was completely consumed.  I.E., if called whilst\n\u003e in the middle of parsing a value yajl will enter an error state\n\u003e (premature EOF).  Setting this flag suppresses that check and the\n\u003e corresponding error.\n\n_yajl_verbose_errors_ enables verbose YAJL errors, with the exception\nmessage including the JSON text where the error occurred, along with\nan arrow pointing to the specific char.\n\n_encoding_ may be used to override output encoding, which is derived\nfrom the input file handle if possible, or otherwise set to the\ndefault one as Python builtin `open()` would use (usually `'UTF-8'`).\n\n_errors_ is an optional string that specifies how encoding and\ndecoding errors are to be handled. Defaults to `'strict'`.\n\n_binary_ forces the output to be in the form of `bytes` objects instead\nof `str` unicode strings.\n\nThe constructed object is an iterator. You may call `next()` to extract\na single element from it, iterate over it via a `for` loop, or use it in generator\ncomprehensions or in any place where an iterator is accepted.\n\n## Performance/competitors\n\nThe closest competitor is [ijson](https://github.com/isagalaev/ijson),\nand JsonSlicer was written to be better. 
Namely,\n\n* It's up to 35x faster depending on ijson backend (starting with 3.0,\n  ijson supports comparable performance via yajl2_c backend), close in\n  performance to Python's native `json` module.\n* It supports more flexible paths/patterns specifying which objects\n  to iterate over in JSON hierarchy and provides consistent interface\n  for iteration over arrays and dictionaries\n\nThe results of bundled benchmark on Python 3.8.2 / clang 8.0.1 / `-O2 -DNDEBUG` / FreeBSD 12.1 amd64 / Core i7-6600U CPU @ 2.60GHz.\n\n|                                                 Facility |   Type |   Objects/sec |\n|---------------------------------------------------------:|-------:|--------------:|\n|                                             json.loads() |    str |       1147.6K |\n|                                    json.load(StringIO()) |    str |       1139.3K |\n|   **JsonSlicer (no paths, binary input, binary output)** |  bytes |       1149.7K |\n|  **JsonSlicer (no paths, unicode input, binary output)** |  bytes |       1134.5K |\n|  **JsonSlicer (no paths, binary input, unicode output)** |    str |       1012.3K |\n| **JsonSlicer (no paths, unicode input, unicode output)** |    str |        996.2K |\n|               **JsonSlicer (full paths, binary output)** |  bytes |        763.1K |\n|              **JsonSlicer (full paths, unicode output)** |    str |        567.2K |\n|                                            ijson.yajl2_c |  bytes |       1062.0K |\n|                                         ijson.yajl2_cffi |  bytes |         71.6K |\n|                                              ijson.yajl2 |  bytes |         56.4K |\n|                                             ijson.python |    str |         32.0K |\n\n## Status/TODO\n\nJsonSlicer is currently in beta stage, used in production in\n[Repology](https://repology.org) project. 
Testing foci are:\n\n- Edge cases with uncommon encoding (input/output) configurations\n- Absence of memory leaks\n\n## Requirements\n\n- Python 3.6+\n- [yajl](https://lloyd.github.io/yajl/) 2.0.3+ (older versions lack pkgconfig file)\n- pkg-config (build-time)\n- C++ compiler (build-time)\n\n## License\n\nMIT license, copyright (c) 2019 Dmitry Marakasov amdmi3@amdmi3.ru.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Famdmi3%2Fjsonslicer","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Famdmi3%2Fjsonslicer","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Famdmi3%2Fjsonslicer/lists"}