{"id":28410498,"url":"https://github.com/simdjson/fuzzyjson","last_synced_at":"2025-07-19T05:07:51.283Z","repository":{"id":106307984,"uuid":"189037464","full_name":"simdjson/fuzzyjson","owner":"simdjson","description":null,"archived":false,"fork":false,"pushed_at":"2019-09-01T03:31:24.000Z","size":6449,"stargazers_count":4,"open_issues_count":0,"forks_count":2,"subscribers_count":3,"default_branch":"master","last_synced_at":"2025-06-21T21:36:19.439Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"C++","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/simdjson.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2019-05-28T13:49:05.000Z","updated_at":"2020-10-09T18:14:09.000Z","dependencies_parsed_at":null,"dependency_job_id":"eccdda5f-e298-40d5-a598-522a457a562f","html_url":"https://github.com/simdjson/fuzzyjson","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/simdjson/fuzzyjson","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/simdjson%2Ffuzzyjson","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/simdjson%2Ffuzzyjson/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/simdjson%2Ffuzzyjson/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/simdjson%2Ffuzzyjson/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/simdjson","download_url":"https://codeload.github.com/simdjson/fuzzyjson/tar.gz/refs/h
eads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/simdjson%2Ffuzzyjson/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":265892206,"owners_count":23844987,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-06-02T11:35:55.872Z","updated_at":"2025-07-19T05:07:51.270Z","avatar_url":"https://github.com/simdjson.png","language":"C++","readme":"# FuzzyJson\nFuzzyJson compares how different json parsers parse the same documents. Its goal is to find bugs.\n\nIt currently supports [simdjson](https://github.com/lemire/simdjson), [rapidjson](https://github.com/Tencent/rapidjson/) and [sajson](https://github.com/chadaustin/sajson). Other parsers can be easily added.\n\n## Simple usage\n```\nmkdir build\ncd build\ncmake ..\nmake\n./fuzzytest --help\n./fuzzytest --size 100 --max_mutations 1000\n```\nThe --help argument displays the help menu, which lists all the options. The last command generates a random 100-byte json document and compares how simdjson and rapidjson parse it over 1000 mutations (sajson is deactivated because it accepts too much invalid UTF-8).\n\nFuzzyJson is a lot more flexible when used as a library.\n\n## Advanced usage\nFuzzyJson uses [RandomJson](https://github.com/ioioioio/randomjson) to handle json documents. 
To create a fuzzyjson::FuzzyJson object, we must pass it a randomjson::Settings object.\n```C\n#include \"randomjson.h\"\n#include \"fuzzyjson.h\"\n\nint json_size = 100;\nrandomjson::Settings json_settings(json_size);\nfuzzyjson::FuzzyJson fuzzy(json_settings);\n```\n\nFuzzyJson can be configured with a fuzzyjson::Settings object.\n```C\nfuzzyjson::Settings fuzzy_settings;\nfuzzy_settings.max_mutations = 5000;\nfuzzyjson::FuzzyJson fuzzy2(json_settings, fuzzy_settings);\n```\n\nOnce the fuzzyjson::FuzzyJson object is created, we must pass it the parsers we want to use.\n```C\n#include \u003cmemory\u003e\n#include \"simdjsonparser.h\"\n#include \"rapidjsonparser.h\"\n\nfuzzy.add_parser(std::make_unique\u003cRapidjsonParser\u003e());\nfuzzy.add_parser(std::make_unique\u003cSimdjsonParser\u003e());\n```\nNote that FuzzyJson will currently crash if there is only one parser. The desired behaviour for that case has not been decided yet.\n\nThe only thing left to do is to start the fuzz.\n```C\nfuzzy.fuzz();\n```\n\n## Reports\nWhen a difference between parsings is detected, FuzzyJson generates a report and makes a copy of the json. The report gives information about how the json document was generated and where the difference between parsings was detected.\n\nInstead of keeping a copy of the json that caused a problem, it might sound preferable to regenerate the document from its settings (size, seeds, mutations, etc.). However, FuzzyJson reverts every mutation that generates an invalid document. That means we would need a way to retrieve the mutations to skip. That would be easy to implement; the problem is that there are so many skipped mutations that the reports would likely become as big as or bigger than the json document.\n\nFuzzyJson saves the json currently being parsed in a document called temp.json. Though it might sound inefficient, it is important to have a copy of the json that caused a problem in case of an unexpected crash, and no better solution has been found yet. 
If no crash occurs, temp.json is deleted at the end of the execution.\n\n## Add a new parser\nTo add a new parser, one must implement the class fuzzyjson::Parser in [include/parser.h](https://github.com/ioioioio/fuzzyjson/blob/master/include/parser.h). There are two examples in [include/simdjsonparser.h](https://github.com/ioioioio/fuzzyjson/blob/master/include/simdjsonparser.h) and [include/rapidjsonparser.h](https://github.com/ioioioio/fuzzyjson/blob/master/include/rapidjsonparser.h).\n\nThe Parser constructor requires a name to be given to the parser. It is used in the reports to identify the parser.\n\nIn order to make the parser work, one virtual function must be implemented: parse(). That function returns a fuzzyjson::Traverser. Though it is not mandatory, it has been found convenient to return a fuzzyjson::InvalidTraverser when the parsing fails.\n\nThe tricky part is to implement the fuzzyjson::Traverser for the parser. The Traverser must traverse the json data in a precise way. The functions test_simple_nested_document_parsing() and test_one_number_document_parsing() in [tests/unittests.cpp](https://github.com/ioioioio/fuzzyjson/blob/master/tests/unittests.cpp) show how it must be done. It is suggested to use them to assert that the Traverser works properly.\n\n## Multiprocessing\nIt is suggested to launch FuzzyJson in multiple processes. When FuzzyJson is launched in multiple processes, it is recommended to give each instance an id. 
Otherwise, all instances will overwrite the same temp.json (and possibly reports made on the same nanosecond), and useful information could be lost.\n\nThe \"simple\" way:\n```\n./fuzzytest --size 100 --max_mutations 1000 --id 1\n```\n\nThe \"advanced\" way:\n```C\nfuzzyjson::Settings fuzzy_settings3;\nfuzzy_settings3.id = 3;\nfuzzyjson::FuzzyJson fuzzy3(json_settings, fuzzy_settings3);\n```\n\n[run.py](https://github.com/ioioioio/fuzzyjson/blob/master/run.py) is already configured to launch one process per processor on the machine. Though it is not shown in the Python script, launching FuzzyJson this way could be particularly useful for handling crashes.\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsimdjson%2Ffuzzyjson","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsimdjson%2Ffuzzyjson","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsimdjson%2Ffuzzyjson/lists"}