{"id":16906605,"url":"https://github.com/rsms/jsont","last_synced_at":"2025-04-11T15:26:52.798Z","repository":{"id":4417812,"uuid":"5555511","full_name":"rsms/jsont","owner":"rsms","description":"A minimal and portable JSON tokenizer for building highly effective and strict parsers (in C and C++)","archived":false,"fork":false,"pushed_at":"2021-11-03T06:32:39.000Z","size":180,"stargazers_count":25,"open_issues_count":2,"forks_count":3,"subscribers_count":2,"default_branch":"master","last_synced_at":"2025-04-11T12:41:50.814Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"C","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/rsms.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2012-08-25T22:23:12.000Z","updated_at":"2023-09-08T16:34:58.000Z","dependencies_parsed_at":"2022-08-27T23:03:08.881Z","dependency_job_id":null,"html_url":"https://github.com/rsms/jsont","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rsms%2Fjsont","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rsms%2Fjsont/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rsms%2Fjsont/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rsms%2Fjsont/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/rsms","download_url":"https://codeload.github.com/rsms/jsont/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248429941,"owners_count":21101922,"i
con_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-10-13T18:43:31.995Z","updated_at":"2025-04-11T15:26:52.775Z","avatar_url":"https://github.com/rsms.png","language":"C","funding_links":[],"categories":[],"sub_categories":[],"readme":"# JSON Tokenizer (jsont)\n\nA minimal and portable JSON tokenizer written in standard C and C++ (two separate versions). Performs validating and highly efficient parsing suitable for reading JSON directly into custom data structures. There are no code dependencies — simply include `jsont.{h,hh,c,cc}` in your project.\n\nBuild and run unit tests:\n\n    make\n\n## Synopsis\n\nC API:\n\n```c\njsont_ctx_t* S = jsont_create(0);\njsont_reset(S, uint8_t* inbuf, size_t inbuf_len);\ntok = jsont_next(S);\n// branch on `tok` ...\nV = jsont_*_value(S[, ...]);\njsont_destroy(S);\n```\n\nNew C++ API:\n\n```cc\njsont::Tokenizer S(const char* inbuf, size_t length);\njsont::Token token;\nwhile ((token = S.next())) {\n  if (token == jsont::Float) {\n    printf(\"%g\\n\", S.floatValue());\n  } ... else if (token == jsont::Error) {\n    // handle error\n    break;\n  }\n}\n```\n\n```cc\njsont::Builder json;\njson.startObject()\n    .fieldName(\"foo\").value(123.45)\n    .fieldName(\"bar\").startArray()\n      .value(678)\n      .value(\"nine \\\"ten\\\"\")\n    .endArray()\n  .endObject();\nstd::cout \u003c\u003c json.toString() \u003c\u003c std::endl;\n// {\"foo\":123.45,\"bar\":[678,\"nine \\\"ten\\\"\"]}\n```\n\n# API overview\n\nSee `jsont.h` and `jsont.hh` for a complete overview of the API, including more detailed documentation. 
Here's an overview:\n\n## C++ API `namespace jsont`\n\n- `Builder build()` — convenience builder factory\n\n### class Tokenizer\n\nReads a sequence of bytes and produces tokens and values while doing so.\n\n- `Tokenizer(const char* bytes, size_t length, TextEncoding encoding)` — initialize a new Tokenizer to read `bytes` of `length` in `encoding`\n- `void reset(const char* bytes, size_t length, TextEncoding encoding)` — Reset the tokenizer, making it possible to reuse this parser so as to avoid unnecessary memory allocation and deallocation.\n\n#### Reading tokens\n\n- `const Token\u0026 next() throw(Error)` — Read next token, possibly throwing an `Error`\n- `const Token\u0026 current() const` — Access current token\n\n#### Reading values\n\n- `bool hasValue() const` — True if the current token has a value\n- `size_t dataValue(const char const** bytes)` — Returns a slice of the input which represents the current value, or nothing (returns 0) if the current token has no value (e.g. start of an object).\n- `std::string stringValue() const` — Returns a *copy* of the current string value.\n- `double floatValue() const` — Returns the current value as a double-precision floating-point number.\n- `int64_t intValue() const` — Returns the current value as a signed 64-bit integer.\n\n#### Handling errors\n\n- `ErrorCode error() const` — Returns the error code of the last error\n- `const char* errorMessage() const` — Returns a human-readable message for the last error. Never returns NULL.\n\n#### Accessing underlying input buffer\n\n- `const char* inputBytes() const` — A pointer to the input data as passed to `reset` or the constructor.\n- `size_t inputSize() const` — Total number of input bytes\n- `size_t inputOffset() const` — The byte offset into input where the tokenizer is currently at. 
In the event of an error, this will point to the source of the error.\n\n### enum Token\n\n- `End` —           Input ended\n- `ObjectStart` —   {\n- `ObjectEnd` —     }\n- `ArrayStart` —    [\n- `ArrayEnd` —      ]\n- `True` —          true\n- `False` —         false\n- `Null` —          null\n- `Integer` —       number value without a fraction part (access as int64 through `Tokenizer::intValue()`)\n- `Float` —         number value with a fraction part (access as double through `Tokenizer::floatValue()`)\n- `String` —        string value (access value through `Tokenizer::stringValue()` et al)\n- `FieldName` —     field name (access value through `Tokenizer::stringValue()` et al)\n- `Error` —         an error occurred (access error code through `Tokenizer::error()` et al)\n\n### enum TextEncoding\n\n- `UTF8TextEncoding` — Unicode UTF-8 text encoding\n\n### enum Tokenizer::ErrorCode\n\n- `UnspecifiedError` — Unspecified error\n- `UnexpectedComma` — Unexpected comma\n- `UnexpectedTrailingComma` — Unexpected trailing comma\n- `InvalidByte` — Invalid input byte\n- `PrematureEndOfInput` — Premature end of input\n- `MalformedUnicodeEscapeSequence` — Malformed Unicode escape sequence\n- `MalformedNumberLiteral` — Malformed number literal\n- `UnterminatedString` — Unterminated string\n- `SyntaxError` — Illegal JSON (syntax error)\n\n### class Builder\n\nAids in building JSON, providing a final sequential byte buffer.\n\n- `Builder()` — initialize a new builder with an empty backing buffer\n- `Builder\u0026 startObject()` — Start an object (appends a `'{'` character to the backing buffer)\n- `Builder\u0026 endObject()` — End an object (a `'}'` character)\n- `Builder\u0026 startArray()` — Start an array (`'['`)\n- `Builder\u0026 endArray()` — End an array (`']'`)\n- `const void reset()` — Reset the builder to its neutral state. 
Note that the backing buffer is reused in this case.\n\n#### Building\n\n- `Builder\u0026 fieldName(const char* v, size_t length, TextEncoding encoding=UTF8TextEncoding)` — Adds a field name by copying `length` bytes from `v`.\n- `Builder\u0026 fieldName(const std::string\u0026 name, TextEncoding encoding=UTF8TextEncoding)` — Adds a field name by copying `name`.\n- `Builder\u0026 value(const char* v, size_t length, TextEncoding encoding=UTF8TextEncoding)` — Adds a string value by copying `length` bytes from `v`, whose content is encoded according to `encoding`.\n- `Builder\u0026 value(const char* v)` — Adds a string value by copying `strlen(v)` bytes from c-string `v`. Uses the default encoding of `value(const char*,size_t,TextEncoding)`.\n- `Builder\u0026 value(const std::string\u0026 v)`  — Adds a string value by copying `v`. Uses the default encoding of `value(const char*,size_t,TextEncoding)`.\n- `Builder\u0026 value(double v)` — Adds a possibly fractional number\n- `Builder\u0026 value(int64_t v)`, `void value(int v)`, `void value(unsigned int v)`, `void value(long v)` — Adds an integer number\n- `Builder\u0026 value(bool v)` — Adds the \"true\" or \"false\" atom, depending on `v`\n- `Builder\u0026 nullValue()` — Adds the \"null\" atom\n\n#### Managing the result\n\n- `size_t size() const` — Number of readable bytes at the pointer returned by `bytes()`\n- `const char* bytes() const` — Pointer to the backing buffer, holding the resulting JSON.\n- `std::string toString() const` — Return a `std::string` object holding a copy of the backing buffer, representing the JSON.\n- `const char* seizeBytes(size_t\u0026 size_out)` — \"Steal\" the backing buffer. After this call, the caller is responsible for calling `free()` on the returned pointer. Returns NULL on failure. Sets the value of `size_out` to the number of readable bytes at the returned pointer. 
The builder will be reset and ready to use (which will act on a new backing buffer).\n\n----\n\n## C API\n\n### Types\n\n- `jsont_ctx_t` — A tokenizer context (\"instance\" in OOP lingo.)\n- `jsont_tok_t` — A token type (see \"Token types\".)\n- `jsont_err_t` — A user-configurable error type, which defaults to `const char*`.\n\n### Managing a tokenizer context\n\n- `jsont_ctx_t* jsont_create(void* user_data)` — Create a new JSON tokenizer context.\n- `void jsont_destroy(jsont_ctx_t* ctx)` — Destroy a JSON tokenizer context.\n- `void jsont_reset(jsont_ctx_t* ctx, const uint8_t* bytes, size_t length)` — Reset the tokenizer to parse the data pointed to by `bytes`.\n\n### Dealing with tokens\n\n- `jsont_tok_t jsont_next(jsont_ctx_t* ctx)` — Read and return the next token.\n- `jsont_tok_t jsont_current(const jsont_ctx_t* ctx)` — Returns the current token (last token read by `jsont_next`).\n\n### Accessing and comparing values\n\n- `int64_t jsont_int_value(jsont_ctx_t* ctx)` — Returns the current integer value.\n- `double jsont_float_value(jsont_ctx_t* ctx)` — Returns the current floating-point number value.\n- `size_t jsont_data_value(jsont_ctx_t* ctx, const uint8_t** bytes)` — Returns a slice of the input which represents the current value.\n- `char* jsont_strcpy_value(jsont_ctx_t* ctx)` — Retrieve a newly allocated c-string.\n- `bool jsont_data_equals(jsont_ctx_t* ctx, const uint8_t* bytes, size_t length)` — Returns true if the current data value is equal to `bytes` of `length`\n- `bool jsont_str_equals(jsont_ctx_t* ctx, const char* str)` — Returns true if the current data value is equal to c string `str`.\n\nNote that the data is not parsed until you call one of these functions. 
This means that if you know that a value transferred as a string will fit in a 64-bit signed integer, it's completely valid to call `jsont_int_value` to parse the string as an integer.\n\n### Miscellaneous\n\n- `uint8_t jsont_current_byte(jsont_ctx_t* ctx)` — Get the last byte read.\n- `size_t jsont_current_offset(jsont_ctx_t* ctx)` — Get the current offset of the last byte read.\n- `jsont_err_t jsont_error_info(jsont_ctx_t* ctx)` — Get information on the last error.\n- `void* jsont_user_data(const jsont_ctx_t* ctx)` — Returns the value passed to `jsont_create`\n\n### Token types\n\n- `JSONT_END` —            Input ended.\n- `JSONT_ERR` —            Error. Retrieve details through `jsont_error_info`\n- `JSONT_OBJECT_START` —   {\n- `JSONT_OBJECT_END` —     }\n- `JSONT_ARRAY_START` —    [\n- `JSONT_ARRAY_END` —      ]\n- `JSONT_TRUE` —           true\n- `JSONT_FALSE` —          false\n- `JSONT_NULL` —           null\n- `JSONT_NUMBER_INT` —     number value without a fraction part (access through `jsont_int_value` or `jsont_float_value`)\n- `JSONT_NUMBER_FLOAT` —   number value with a fraction part (access through `jsont_float_value`)\n- `JSONT_STRING` —         string value (access through `jsont_data_value` or `jsont_strcpy_value`)\n- `JSONT_FIELD_NAME` —     field name (access through `jsont_data_value` or `jsont_strcpy_value`)\n\n## Further reading\n\n- See `example*.c` for working sample programs.\n- See `LICENSE` for the MIT-style license under which this project is licensed.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frsms%2Fjsont","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Frsms%2Fjsont","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frsms%2Fjsont/lists"}