{"id":17290245,"url":"https://github.com/somefive/mercuryjson","last_synced_at":"2025-10-13T22:12:30.800Z","repository":{"id":91316940,"uuid":"180708803","full_name":"Somefive/MercuryJson","owner":"Somefive","description":"Multi-thread version of simdjson","archived":false,"fork":false,"pushed_at":"2019-07-27T21:31:53.000Z","size":627,"stargazers_count":14,"open_issues_count":1,"forks_count":1,"subscribers_count":4,"default_branch":"master","last_synced_at":"2025-04-14T11:44:58.703Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"C++","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Somefive.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2019-04-11T03:39:37.000Z","updated_at":"2025-02-07T01:43:14.000Z","dependencies_parsed_at":"2023-03-13T20:48:32.289Z","dependency_job_id":null,"html_url":"https://github.com/Somefive/MercuryJson","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/Somefive/MercuryJson","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Somefive%2FMercuryJson","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Somefive%2FMercuryJson/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Somefive%2FMercuryJson/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Somefive%2FMercuryJson/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Somefive","download_url":"https://codeload.github.com/Somefive/
MercuryJson/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Somefive%2FMercuryJson/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":279017157,"owners_count":26085983,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-10-13T02:00:06.723Z","response_time":61,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-10-15T10:37:30.062Z","updated_at":"2025-10-13T22:12:30.784Z","avatar_url":"https://github.com/Somefive.png","language":"C++","funding_links":[],"categories":[],"sub_categories":[],"readme":"# MercuryJson: Multi-Threaded JSON Parsing with SIMD\n\nThis repository contains the source code for our course project in CMU 15-618: Parallel Computer Architecture and Programming.\n\nThe project is still a proof-of-concept. The currently supported functions are parsing/validating and pretty-printing. We plan to add an iterator interface soon. \n\n## (Brief) Introduction\n\nMercuryJson is a fast JSON parser optimized for parsing very large documents. The idea is based mainly on the two-stage parsing framework of [simdjson](https://github.com/lemire/simdjson). 
Our main contribution is that we parallelized the second stage using multi-threading.\n\nBenchmarks show that we achieve considerable speedup on large (\u003e 500MB) documents and comparable performance on most small (\u003c 3MB) documents.\n\nFor a detailed description of the algorithms and benchmarks, please refer to our [report](https://github.com/Somefive/MercuryJson/blob/master/report/report.pdf).\n\n## Installation\n\nTo build MercuryJson, you will need:\n\n- CMake version 3.0 or later.\n- A C++ compiler supporting the C++17 standard.\n- Linux or macOS. Windows is not yet supported.\n- An Intel CPU supporting the AVX2 instruction set.\n\nThe build commands are:\n\n```bash\ngit clone https://github.com/Somefive/MercuryJson\ncd MercuryJson\nmkdir build \u0026\u0026 cd build\ncmake ..\nmake\n```\n\nThis will generate a binary named `main` under the `build` directory. This program is used for benchmarking: it reports parsing times for the given document. Here is an example output:\n\n```\n$ ./main ../data/large/citylots.json\nFile size: 189778220\nStructural characters: 33395428\nIteration 0: stage 1 runtime: 0.197731 s, stage 2 runtime: 0.179965 s\nIteration 1: stage 1 runtime: 0.208775 s, stage 2 runtime: 0.173601 s\nIteration 2: stage 1 runtime: 0.196363 s, stage 2 runtime: 0.171210 s\nIteration 3: stage 1 runtime: 0.199372 s, stage 2 runtime: 0.171221 s\nIteration 4: stage 1 runtime: 0.207167 s, stage 2 runtime: 0.173756 s\nIteration 5: stage 1 runtime: 0.194635 s, stage 2 runtime: 0.173653 s\nIteration 6: stage 1 runtime: 0.208866 s, stage 2 runtime: 0.175466 s\nIteration 7: stage 1 runtime: 0.196667 s, stage 2 runtime: 0.169261 s\nIteration 8: stage 1 runtime: 0.193073 s, stage 2 runtime: 0.171227 s\nIteration 9: stage 1 runtime: 0.192630 s, stage 2 runtime: 0.168801 s\nAverage runtime: 0.372344 s, speed: 486.07 MB/s\nAverage stage 1 runtime: 0.199528 s (53.59 %), stage 2 runtime: 0.172816 s (46.41 %)\nBest runtime: 0.361431 s, speed: 500.75 
MB/s\n```\n\nAll configurable flags are stored in `src/flags.h`. Note that the number of threads is hardcoded at compile time.\n\n## Caveats\n\nThe following features are not yet supported by our parser:\n\n- Null characters (`'\\0'`) within strings; currently we use null-terminated C-style strings.\n- Conversion and validation of escaped Unicode characters.\n- Comments (`/**/`).\n\nThe following incorrect JSON fragments are accepted by our parser:\n\n- Unescaped control characters within strings.\n- Invalid escape sequences.\n- Escaped characters outside strings.\n\nFor a detailed discussion of JSON standards, please see the [JSON Test Suite](https://github.com/nst/JSONTestSuite).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsomefive%2Fmercuryjson","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsomefive%2Fmercuryjson","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsomefive%2Fmercuryjson/lists"}