{"id":13490637,"url":"https://github.com/TeskaLabs/cysimdjson","last_synced_at":"2025-03-28T06:31:40.957Z","repository":{"id":41358678,"uuid":"345812229","full_name":"TeskaLabs/cysimdjson","owner":"TeskaLabs","description":"Very fast Python JSON parsing library","archived":false,"fork":false,"pushed_at":"2024-03-22T15:33:11.000Z","size":3294,"stargazers_count":376,"open_issues_count":13,"forks_count":18,"subscribers_count":6,"default_branch":"main","last_synced_at":"2025-03-05T05:40:12.956Z","etag":null,"topics":["cython","json","python","simdjson"],"latest_commit_sha":null,"homepage":"","language":"C++","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/TeskaLabs.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2021-03-08T22:24:59.000Z","updated_at":"2025-02-25T03:31:59.000Z","dependencies_parsed_at":"2024-01-16T09:05:40.967Z","dependency_job_id":"51571bb1-22c5-42cb-be60-afcf97aed1a1","html_url":"https://github.com/TeskaLabs/cysimdjson","commit_stats":{"total_commits":67,"total_committers":4,"mean_commits":16.75,"dds":0.05970149253731338,"last_synced_commit":"5f544eeb2a9c1cfa0d48a5bb00a742f2a78d3beb"},"previous_names":[],"tags_count":8,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/TeskaLabs%2Fcysimdjson","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/TeskaLabs%2Fcysimdjson/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/TeskaLabs%2Fcysimdjson/releases","manifests_url":"https://repos.ecosyste.ms/api/
v1/hosts/GitHub/repositories/TeskaLabs%2Fcysimdjson/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/TeskaLabs","download_url":"https://codeload.github.com/TeskaLabs/cysimdjson/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":245984577,"owners_count":20704793,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["cython","json","python","simdjson"],"created_at":"2024-07-31T19:00:49.707Z","updated_at":"2025-03-28T06:31:39.578Z","avatar_url":"https://github.com/TeskaLabs.png","language":"C++","funding_links":[],"categories":["C++"],"sub_categories":[],"readme":"# cysimdjson\n\nFast JSON parsing library for Python, 7-12 times faster than the standard Python JSON parser.  \nIt provides Python bindings for [simdjson](https://simdjson.org), built with [Cython](https://cython.org).\n\nThe standard [Python JSON parser](https://docs.python.org/3/library/json.html) (`json.load()` etc.) is relatively slow,\nand if you need to parse large JSON files or a large number of small JSON files,\nit can become a significant bottleneck.\n\nWhilst there are other fast Python JSON parsers, such as [pysimdjson](https://github.com/TkTech/pysimdjson), [libpy_simdjson](https://github.com/gerrymanoim/libpy_simdjson) or [orjson](https://github.com/ijl/orjson), they don't reach the raw speed that is provided by the brilliant [SIMDJSON](https://simdjson.org) project. 
SIMDJSON is a C++ JSON parser based on [SIMD instructions](https://en.wikipedia.org/wiki/SIMD), reportedly the fastest JSON parser on the planet.\n\n[![Python 3.11](https://github.com/TeskaLabs/cysimdjson/actions/workflows/py311.yaml/badge.svg)](https://github.com/TeskaLabs/cysimdjson/actions/workflows/py311.yaml)\n[![Python 3.10](https://github.com/TeskaLabs/cysimdjson/actions/workflows/py310.yaml/badge.svg)](https://github.com/TeskaLabs/cysimdjson/actions/workflows/py310.yaml)  \n[![Python 3.9](https://github.com/TeskaLabs/cysimdjson/actions/workflows/py39.yaml/badge.svg)](https://github.com/TeskaLabs/cysimdjson/actions/workflows/py39.yaml)\n[![Python 3.8](https://github.com/TeskaLabs/cysimdjson/actions/workflows/py38.yaml/badge.svg)](https://github.com/TeskaLabs/cysimdjson/actions/workflows/py38.yaml)\n[![Python 3.7](https://github.com/TeskaLabs/cysimdjson/actions/workflows/py37.yaml/badge.svg)](https://github.com/TeskaLabs/cysimdjson/actions/workflows/py37.yaml)  \n\n## Usage\n\n```python\nimport cysimdjson\n\njson_bytes = b'''\n{\n  \"foo\": [1,2,[3]]\n}\n'''\n\nparser = cysimdjson.JSONParser()\njson_element = parser.parse(json_bytes)\n\n# Access using JSON Pointer\nprint(json_element.at_pointer(\"/foo/2/0\"))\n```\n\n_Note: The `parser` object can be reused for maximum performance._\n\n\n### Pythonic drop-in API\n\n```python\nparser = cysimdjson.JSONParser()\njson_parsed = parser.loads(json_bytes)\n\n# Access in a Python way\nprint(json_parsed['foo'])\n```\n\n`json_parsed` is a read-only dictionary-like object that provides access to the JSON data.\n\n**WARNING:** This access method will be deprecated in the future, likely in favour of JSON Pointer.\n\n\n## Trade-offs\n\nThe speed of `cysimdjson` is based on these assumptions:\n\n1) The output of the parser is read-only; you cannot modify it\n2) The output of the parser is not a Python dictionary but a lazily evaluated dictionary-like object\n3) The parser output is valid only while the `JSONParser` 
object is still alive (not destroyed), otherwise you will get ugly errors\n4) If you convert the parser output into a Python dictionary, you will lose the speed\n\nIf your design is not aligned with these assumptions, `cysimdjson` is not a good choice.\n\n\n## Documentation\n\n`JSONParser.parse(json_bytes)`\n\nParse JSON `json_bytes`, represented as `bytes`.\n\n\n`JSONParser.parse_in_place(bytes)`\n\nParse JSON passed as `bytes`, assuming that the buffer already includes the padding expected by SIMDJSON.\nThis is the fastest parsing variant.\n\n\n`JSONParser.parse_string(string)`\n\nParse JSON passed as `str` (string).\n\n\n`JSONParser.load(path)`\n\n\n## Installation\n\n```\npip3 install cysimdjson\n```\n\nThe `cysimdjson` project is distributed via PyPI: https://pypi.org/project/cysimdjson/\n\nIf you want to install `cysimdjson` from source, you need to install Cython first: `pip3 install cython`.\n\n\n## Performance\n\n```\n----------------------------------------------------------------\n# 'jsonexamples/test.json' 2397 bytes\n----------------------------------------------------------------\n* cysimdjson parse          510291.81 EPS (  1.00)  1223.17 MB/s\n* libpy_simdjson loads      374615.54 EPS (  1.36)   897.95 MB/s\n* pysimdjson parse          362195.46 EPS (  1.41)   868.18 MB/s\n* orjson loads              110615.70 EPS (  4.61)   265.15 MB/s\n* python json loads          72096.80 EPS (  7.08)   172.82 MB/s\n----------------------------------------------------------------\n\nSIMDJSON: 543335.93 EPS, 1241.52 MB/s\n```\n\n```\n----------------------------------------------------------------\n# 'jsonexamples/twitter.json' 631515 bytes\n----------------------------------------------------------------\n* cysimdjson parse            2556.10 EPS (  1.00)  1614.22 MB/s\n* libpy_simdjson loads        2444.53 EPS (  1.05)  1543.76 MB/s\n* pysimdjson parse            2415.46 EPS (  1.06)  1525.40 MB/s\n* orjson loads                 387.11 EPS (  
6.60)   244.47 MB/s\n* python json loads            278.63 EPS (  9.17)   175.96 MB/s\n----------------------------------------------------------------\n\nSIMDJSON: 2536.16 EPS,  1527.28 MB/s\n```\n\n```\n----------------------------------------------------------------\n# 'jsonexamples/canada.json' 2251051 bytes\n----------------------------------------------------------------\n* cysimdjson parse             284.67 EPS (  1.00)   640.81 MB/s\n* pysimdjson parse             284.62 EPS (  1.00)   640.70 MB/s\n* libpy_simdjson loads         277.13 EPS (  1.03)   623.84 MB/s\n* orjson loads                  81.80 EPS (  3.48)   184.13 MB/s\n* python json loads             22.52 EPS ( 12.64)    50.68 MB/s\n----------------------------------------------------------------\n\nSIMDJSON: 307.95 EPS, 661.08 MB/s\n```\n\n```\n----------------------------------------------------------------\n# 'jsonexamples/gsoc-2018.json' 3327831 bytes\n----------------------------------------------------------------\n* cysimdjson parse             775.61 EPS (  1.00)  2581.09 MB/s\n* pysimdjson parse             743.67 EPS (  1.04)  2474.81 MB/s\n* libpy_simdjson loads         654.15 EPS (  1.19)  2176.88 MB/s\n* orjson loads                 166.67 EPS (  4.65)   554.66 MB/s\n* python json loads            113.72 EPS (  6.82)   378.43 MB/s\n----------------------------------------------------------------\n\nSIMDJSON: 703.59 EPS, 2232.92 MB/s\n```\n\n```\n----------------------------------------------------------------\n# 'jsonexamples/verysmall.json' 7 bytes\n----------------------------------------------------------------\n* cysimdjson parse         3972376.53 EPS (  1.00)    27.81 MB/s\n* orjson loads             3637369.63 EPS (  1.09)    25.46 MB/s\n* libpy_simdjson loads     1774211.19 EPS (  2.24)    12.42 MB/s\n* pysimdjson parse          977530.90 EPS (  4.06)     6.84 MB/s\n* python json loads         527932.65 EPS (  7.52)     3.70 
MB/s\n----------------------------------------------------------------\n\nSIMDJSON: 3799392.10 EPS\n```\n\nCPU: AMD EPYC 7452\n\nMore performance testing:\n\n * [Apple M1](https://github.com/TeskaLabs/cysimdjson/wiki/Performance-on-Apple-M1): \u003e 1M EPS, \u003e 3GB/s\n\n\n\n### Tests are reproducible\n\n```\npip3 install orjson\npip3 install pysimdjson\npip3 install libpy_simdjson\npython3 setup.py build_ext --inplace\nPYTHONPATH=. python3 ./perftest/test_benchmark.py\n```\n\n## Manual build\n\n```\npython3 setup.py build_ext --inplace\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FTeskaLabs%2Fcysimdjson","html_url":"https://awesome.ecosyste.ms/projects/github.com%2FTeskaLabs%2Fcysimdjson","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FTeskaLabs%2Fcysimdjson/lists"}