{"id":18086092,"url":"https://github.com/guidanoli/peg-coq","last_synced_at":"2025-04-06T00:43:21.450Z","repository":{"id":153804733,"uuid":"623654198","full_name":"guidanoli/peg-coq","owner":"guidanoli","description":"Formalizing PEGs and a well-formedness algorithm in Coq","archived":false,"fork":false,"pushed_at":"2025-04-01T16:46:05.000Z","size":344,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-04-01T17:49:12.301Z","etag":null,"topics":["coq","lpeg","parsing-expression-grammars","well-formed"],"latest_commit_sha":null,"homepage":"","language":"Coq","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/guidanoli.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-04-04T20:05:05.000Z","updated_at":"2025-04-01T16:46:09.000Z","dependencies_parsed_at":"2023-09-24T21:54:49.426Z","dependency_job_id":"5fc582c1-53a8-47a6-8f17-c724dea5e833","html_url":"https://github.com/guidanoli/peg-coq","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/guidanoli%2Fpeg-coq","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/guidanoli%2Fpeg-coq/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/guidanoli%2Fpeg-coq/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/guidanoli%2Fpeg-coq/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/guidanol
i","download_url":"https://codeload.github.com/guidanoli/peg-coq/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247419815,"owners_count":20936012,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["coq","lpeg","parsing-expression-grammars","well-formed"],"created_at":"2024-10-31T16:06:28.971Z","updated_at":"2025-04-06T00:43:21.443Z","avatar_url":"https://github.com/guidanoli.png","language":"Coq","funding_links":[],"categories":[],"sub_categories":[],"readme":"## Formalization of PEGs using the Coq proof assistant\n\nThis repository formalizes the syntax and the semantics of [PEGs]\nusing the [Coq] proof assistant.\nIt also formalizes the algorithm implemented in [LPeg]\nfor detecting well-formed PEGs,\nand proves that it terminates and is correct.\nCorrectness here implies termination of the parsing procedure.\nIt also formalizes the definition of \"first\" implemented in [LPeg],\nused to optimize certain types of patterns such as choices.\nWe also prove the choice optimization is correct.\n\n## Syntax\n\nWe use the desugared syntax of PEGs\nfirst established by Ford.\nIt contains the empty pattern, terminals,\nnon-terminals, repetitions, not-predicates,\nsequences, and choices.\nFrom these, we can construct more complex patterns\nsuch as character classes from choices,\nand and-predicates from not-predicates.\nThis simplification helps us formalize PEGs\nwithout losing expressiveness.\n\n## Semantics\n\nWe define the match procedure both\nas a fixed point function which takes\na gas counter and 
returns an optional result,\nand as an inductive predicate.\nThe fixed point definition is interesting\nfor proving termination,\nand the inductive predicate is useful\nfor proving correctness,\nand other complex results\nwhich may benefit from proofs by induction.\n\n## Well-formedness\n\nWe define the algorithm implemented in [LPeg]\nthat detects well-formed PEGs,\na subset of complete PEGs whose\ndetection is decidable.\nFord proved that, in the general case,\nthe problem of identifying complete PEGs\nis undecidable, so this is a conservative\napproach that ensures termination.\n\n## First\n\nWe loosely define here \"first\" as a conservative set of\ncharacters that a string can start with in order to match a pattern.\nTo be more precise, if a string starts with a character\nthat is not in the first set of a pattern,\nthen the match is guaranteed to fail.\nTo take into consideration the match against empty strings,\nthe function that computes the first set also outputs a Boolean value\nwhich indicates whether the pattern may match the empty string or not.\nIf it returns false, then the pattern fails to match the empty string.\nOtherwise, it may or may not match the empty string.\nBesides the input grammar and pattern,\nthe function also takes a character set called \"follow\" as a parameter.\nThe purpose of this parameter is to properly define first for sequence patterns.\nIn the sequence `p1; p2`, we use the first of `p2` as the follow of `p1`.\n\nThis definition is used by LPeg to optimize patterns such as choices.\nGiven a choice pattern `p1 / p2`, if `p1` does not match the empty string,\nand the first sets of `p1` and `p2` are disjoint,\nthen LPeg converts the choice pattern into `\u0026[cs1] p1 / ![cs1] p2`,\nwhere `cs1` is the first set of `p1`.\nIn reality, this is only a depiction of the optimization using standard PEG notation,\nwhile LPeg actually performs this optimization at the virtual machine instruction level.\nFor this optimization,\nLPeg inserts a test 
instruction, which checks whether the next character\nis in a given character set or not, and jumps to another instruction otherwise.\nThis optimization makes matching against this kind of pattern more performant.\nHere, we prove that this optimization is indeed correct.\n\n## Files\n\nThe main files are:\n\n- `Syntax.v`: PEG syntax\n- `Match.v`: PEG semantics\n- `Coherent.v`: valid nonterminals\n- `Verifyrule.v`: left-recursive rules\n- `Nullable.v`: nullable patterns\n- `Checkloops.v`: degenerate loops\n- `Verifygrammar.v`: well-formedness\n- `First.v`: definition of first (used for optimization)\n\nAuxiliary files are:\n\n- `Tactics.v`: auxiliary proof tactics\n- `Pigeonhole.v`: states and proves the pigeonhole principle\n- `Strong.v`: states and proves the strong induction primitive\n- `Suffix.v`: defines the suffix and proper suffix relations, and proves some results about them\n- `Charset.v`: defines character sets and operations on them, and proves some lemmas about them\n- `Startswith.v`: defines the \"starts with\" function, and proves some lemmas about it\n- `Equivalent.v`: defines equivalent patterns, proves some lemmas about them, and examples\n\n## Building the project\n\nIn order to build the project, you first need to make sure you have the following dependencies installed on your system.\n\n- Coq 8.18.0 or later\n- GNU Make\n\nThen, you can build the project by running `make` in the root of the project.\n\n[PEGs]: https://doi.org/10.1145/964001.964011\n[Coq]: https://coq.inria.fr/\n[LPeg]: https://www.inf.puc-rio.br/~roberto/lpeg/\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fguidanoli%2Fpeg-coq","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fguidanoli%2Fpeg-coq","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fguidanoli%2Fpeg-coq/lists"}