{"id":16209266,"url":"https://github.com/gilzoide/pega-texto","last_synced_at":"2025-03-19T08:31:01.732Z","repository":{"id":70590439,"uuid":"96050440","full_name":"gilzoide/pega-texto","owner":"gilzoide","description":"Single-file Parsing Expression Grammars (PEG) runtime engine for C","archived":false,"fork":false,"pushed_at":"2022-04-29T22:24:01.000Z","size":819,"stargazers_count":19,"open_issues_count":0,"forks_count":1,"subscribers_count":2,"default_branch":"main","last_synced_at":"2025-03-17T05:11:24.245Z","etag":null,"topics":["header-only","parser","parsing","parsing-expression-grammars","peg","single-file","single-header"],"latest_commit_sha":null,"homepage":"","language":"C","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"unlicense","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/gilzoide.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":".github/FUNDING.yml","license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null},"funding":{"github":null,"patreon":null,"open_collective":null,"ko_fi":"gilzoide","tidelift":null,"community_bridge":null,"liberapay":"gilzoide","issuehunt":null,"otechie":null,"custom":["https://www.buymeacoffee.com/gilzoide"]}},"created_at":"2017-07-02T21:40:23.000Z","updated_at":"2025-02-19T18:18:27.000Z","dependencies_parsed_at":"2023-04-23T20:04:37.856Z","dependency_job_id":null,"html_url":"https://github.com/gilzoide/pega-texto","commit_stats":null,"previous_names":[],"tags_count":2,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gilzoide%2Fpega-texto","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gilzoide%2Fpega-texto/tags","releases_url":"htt
ps://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gilzoide%2Fpega-texto/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gilzoide%2Fpega-texto/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/gilzoide","download_url":"https://codeload.github.com/gilzoide/pega-texto/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":244389792,"owners_count":20445002,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["header-only","parser","parsing","parsing-expression-grammars","peg","single-file","single-header"],"created_at":"2024-10-10T10:28:58.951Z","updated_at":"2025-03-19T08:31:01.727Z","avatar_url":"https://github.com/gilzoide.png","language":"C","funding_links":["https://ko-fi.com/gilzoide","https://liberapay.com/gilzoide","https://www.buymeacoffee.com/gilzoide"],"categories":[],"sub_categories":[],"readme":"# pega-texto\n\nSingle-file [Parsing Expression Grammars (PEG)](http://bford.info/packrat/) runtime engine for C.\n\nTo use it, copy [pega-texto.h](pega-texto.h) file to your project and `#define PEGA_TEXTO_IMPLEMENTATION`\nbefore including it in **one** C or C++ source file to create the implementation.\n\n\n## Usage example\n\n```c\n#include \u003cstdio.h\u003e\n\n// 1. 
#define PEGA_TEXTO_IMPLEMENTATION on exactly one C/C++ file and include pega-texto.h\n// Optionally #define other compile-time options, check out pega-texto.h for documentation\n#define PT_DEFINE_SHORTCUTS  // shortcuts make creating grammars more readable\n#define PT_DATA size_t  // define action result data\n#define PEGA_TEXTO_IMPLEMENTATION\n#include \"pega-texto.h\"\n\n// This will return the length for each line we parse\nPT_DATA line_length(const char *str, size_t size, int argc, PT_DATA *argv, void *userdata) {\n    return size;\n}\n// This will return the longest line length\nPT_DATA longest_line(const char *str, size_t size, int argc, PT_DATA *argv, void *userdata) {\n    size_t longest = 0;\n    for(int i = 0; i \u003c argc; i++) {\n        if(argv[i] \u003e longest) {\n            longest = argv[i];\n        }\n    }\n    return longest;\n}\n\n// Helpers for getting error locations\ntypedef struct text_location { int line; int column; } text_location;\ntext_location get_text_location(const char *str, size_t where) {\n    text_location location = { 1, 1 };\n    for(size_t i = 0; i \u003c where; i++) {\n        if(str[i] == '\\n') {\n            location.line++;\n            location.column = 1;\n        }\n        else {\n            location.column++;\n        }\n    }\n    return location;\n}\n// This will be called when matching an invalid quotation mark\nvoid invalid_quote(const char *str, size_t where, void *userdata) {\n    text_location location = get_text_location(str, where);\n    printf(\"Error: invalid quotation mark in middle of cell at %d:%d\\n\", location.line, location.column);\n}\n// This will be called when no quotation mark is found closing an open quoted cell\nvoid unmatched_quote(const char *str, size_t where, void *userdata) {\n    while(where \u003e 1 \u0026\u0026 !(str[where] == '\"' \u0026\u0026 str[where - 1] != '\"')) where--;\n    text_location location = get_text_location(str, where);\n    printf(\"Error: Quotation mark 
starting cell at %d:%d is not closed\\n\", location.line, location.column);\n}\n\n// 2. Create some grammar rules, which are arrays of expressions\n// Defining indices in an `enum` makes referencing rules clearer\nenum rule_indices {\n    R_CSV,\n    R_LINE,\n    R_CELL,\n    R_QUOTED,\n    R_EOL,\n};\n// Rules must have a terminating `PT_END()` operation,\n// using `PT_RULE(...)` ensures we don't forget about it\npt_rule CSV = PT_RULE(\n    // after the whole match succeeds, `longest_line` will be called,\n    // receiving all inner `line_length` results on argv\n    ACTION(longest_line,\n        ZERO_OR_MORE(CALL(R_LINE))\n    )\n);\npt_rule LINE = PT_RULE(\n    // after the whole match succeeds, `line_length` will be called for each line\n    ACTION(line_length,\n        CALL(R_CELL),\n        ZERO_OR_MORE(B(','), CALL(R_CELL))\n    ),\n    EITHER(\n        CALL(R_EOL),\n        B('\\0') // EOF\n    )\n);\npt_rule CELL = PT_RULE(\n    EITHER(\n        CALL(R_QUOTED),\n        ONE_OR_MORE(\n            ERROR_IF(invalid_quote, B('\"')),\n            ANY_BUT(S(\"\\\",\\r\\n\"))\n        )\n    )\n);\npt_rule QUOTED = PT_RULE(\n    B('\"'),\n    ZERO_OR_MORE(\n        EITHER(\n            L(\"\\\"\\\"\"),\n            ANY_BUT(B('\"'))\n        )\n    ),\n    EITHER(\n        B('\"'),\n        ERROR(unmatched_quote)\n    )\n);\npt_rule EOL = PT_RULE(\n    OPTIONAL(B('\\r')),\n    B('\\n')\n);\n\n// 3. Create your grammar\n// Grammars are just arrays of rules\npt_grammar G = {\n    [R_CSV] = CSV,\n    [R_LINE] = LINE,\n    [R_CELL] = CELL,\n    [R_QUOTED] = QUOTED,\n    [R_EOL] = EOL,\n};\n\nint main(int argc, const char **argv) {\n    const char *csv_text = \"first,second,third\\n1,\\\"2\\\",3\";\n    // 4. Call the match algorithm!\n    pt_match_result result = pt_match(G, csv_text, NULL);\n    switch(result.matched) {\n        case PT_NO_MATCH:\n            printf(\"Match failed! 
Invalid CSV content\\n\");\n            break;\n\n        case PT_NO_STACK_MEM:\n            printf(\"Match failed! Out of memory\\n\");\n            break;\n\n        case PT_MATCHED_ERROR:\n            printf(\"Match failed! Error matched\\n\");\n            break;\n\n        case PT_NULL_INPUT:\n            printf(\"Match failed! NULL CSV content\\n\");\n            break;\n\n        default:\n            printf(\"Matched CSV content with %d bytes.\\n\", result.matched);\n            printf(\"Longest line is %zu bytes long\\n\", result.data);\n            break;\n    }\n    return 0;\n}\n\n```\n\n\nChange log\n----------\n+ 4.0.0 - Refactor project as a single header implementation,\n  split Quantifier Expressions into At Least and At Most expressions,\n  make `PT_DATA` type configurable, add Action Expressions to\n  simplify implementation and make code more readable, reimplement\n  match algorithm with a recursive approach, make Expressions be\n  laid out contiguously in memory and remove heap based creation\n  and destruction of them, make grammar literal definable at\n  compile-time.\n+ 3.0.0 - Change actions to receive the capture with pointer and size, instead\n  of pointer to input string, start and end of capture; add Byte expression,\n  change Character Class Expressions to use only functions defined in `ctype.h`,\n  change Range Expressions to use 2 bytes instead of a NULL terminated string;\n  use Grammars without malloc'ing them.\n+ 2.1.0 - Populate `pt_match_result.data.i` with the first syntactic error code\n  when syntactic errors occur.\n+ 2.0.1 - Put `extern \"C\"` declarations in inner headers.\n+ 2.0.0 - ABI change on `pt_match_options`, included Case Insensitive and\n  Character Class Expressions (the old Custom Matcher), changed Custom Matcher\n  Expressions to allow operating on strings, also receiving userdata.\n+ 1.2.7 - Removed all the Action sequence computation, as Actions are already\n  stacked in the right sequence. 
Running actions is now iterative, O(n) and uses\n  far less memory.\n+ 1.2.6 - Fixed `SEQ` and `OR` expression macros on C++, turns out they behave\n  differently about temporary lifetime of arrays.\n+ 1.2.5 - Fixed `SEQ` and `OR` expression macros to compile on both C and C++\n  using preprocessor macros and `initializer_list` directly on `macro-on.h`.\n+ 1.2.4 - Added `extern \"C\"` declaration on `pega-texto.h` for using in C++. \n+ 1.2.3 - Fixed validation error code emitted when `pt_is_nullable` returned\n  true, as it may find an error other than `PT_VALIDATE_LOOP_EMPTY_STRING`.\n+ 1.2.2 - Added `NULL` string check on match.\n+ 1.2.1 - Fixed validation error on empty `SEQ` and `OR` Expressions, which\n  are valid with a `NULL` pointer.\n+ 1.2.0 - Macros for Expressions to not own memory buffers, empty `SEQ` and\n  `OR` Expressions don't allocate a 0-byte buffer anymore, fixed validation\n  error on Non-terminal cycles.\n+ 1.1.1 - Fixed validation error on Non-terminal cycles.\n+ 1.1.0 - Added basic error handling support.\n+ 1.0.0 - Expressions, Grammars, parsing, validation, actions.\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgilzoide%2Fpega-texto","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fgilzoide%2Fpega-texto","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgilzoide%2Fpega-texto/lists"}