{"id":17495744,"url":"https://github.com/iraikov/chicken-lexgen","last_synced_at":"2026-02-26T01:33:25.427Z","repository":{"id":148366559,"uuid":"148259593","full_name":"iraikov/chicken-lexgen","owner":"iraikov","description":"Lexer and parser combinators in Chicken Scheme","archived":false,"fork":false,"pushed_at":"2021-10-04T22:06:57.000Z","size":25,"stargazers_count":5,"open_issues_count":1,"forks_count":0,"subscribers_count":3,"default_branch":"master","last_synced_at":"2025-01-30T16:28:27.686Z","etag":null,"topics":["chicken-scheme","chicken-scheme-eggs","lexer","lexer-parser","parser-combinators","pattern-matcher","scheme","scheme-language","scheme-programming-language"],"latest_commit_sha":null,"homepage":null,"language":"Scheme","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/iraikov.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2018-09-11T04:20:25.000Z","updated_at":"2023-10-23T11:55:13.000Z","dependencies_parsed_at":"2023-05-19T21:30:41.239Z","dependency_job_id":null,"html_url":"https://github.com/iraikov/chicken-lexgen","commit_stats":null,"previous_names":[],"tags_count":3,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/iraikov%2Fchicken-lexgen","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/iraikov%2Fchicken-lexgen/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/iraikov%2Fchicken-lexgen/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/iraikov%2Fchick
en-lexgen/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/iraikov","download_url":"https://codeload.github.com/iraikov/chicken-lexgen/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":245525811,"owners_count":20629832,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["chicken-scheme","chicken-scheme-eggs","lexer","lexer-parser","parser-combinators","pattern-matcher","scheme","scheme-language","scheme-programming-language"],"created_at":"2024-10-19T14:26:42.804Z","updated_at":"2026-02-26T01:33:20.370Z","avatar_url":"https://github.com/iraikov.png","language":"Scheme","funding_links":[],"categories":[],"sub_categories":[],"readme":"# chicken-lexgen\nLexer and parser combinators in Chicken Scheme\n\n## Description\n\n`lexgen` is a lexer generator comprised in its core of only five\nsmall procedures that can be combined to form pattern matchers.\n\nA pattern matcher procedure takes an input stream, and returns a new\nstream advanced by the pattern.\n\nA stream is defined as a list that contains a list of characters\nconsumed by the pattern matcher, and a list of characters not yet\nconsumed. 
E.g., the list\n\n  ((#\\a) (#\\b #\\c #\\d #\\e))\n\nrepresents a stream that contains the consumed character a, and the\nunconsumed characters b c d e.\n\nA pattern matcher has the form of a procedure that takes a success\ncontinuation, which is invoked when the pattern matches and the stream\nis advanced, an error continuation, which is invoked when the pattern\ndoes not match, and an input stream.\n\n## Library Procedures\n\nEvery combinator procedure in this library returns a procedure that\ntakes a success continuation, an error continuation, and an input\nstream as arguments.\n\n### Basic procedures\n\n\u003cprocedure\u003e(seq MATCHER1 MATCHER2) =\u003e MATCHER\u003c/procedure\u003e\n\n`seq` builds a matcher that matches a sequence of patterns. \n\n\u003cprocedure\u003e(bar MATCHER1 MATCHER2) =\u003e MATCHER\u003c/procedure\u003e\n\n`bar` matches either of two patterns. It is analogous to patterns\nseparated by `|` in traditional regular expressions.\n\n\u003cprocedure\u003e(star MATCHER) =\u003e MATCHER\u003c/procedure\u003e\n\n`star` is an implementation of the Kleene closure. It is analogous\nto `*` in traditional regular expressions.\n\n### Token procedure\n\n\u003cprocedure\u003e(tok TOKEN PROC) =\u003e MATCHER\u003c/procedure\u003e\n\nProcedure `tok` builds pattern matchers based on character comparison\noperations. It is intended for matching input sequences that are\nSRFI-127 lazy streams.\n\nFor each stream given, `tok` applies the procedure `PROC` to the given\ntoken `TOKEN` and an input character. If the procedure returns a true\nvalue, that value is prepended to the list of consumed elements, and\nthe input character is removed from the stream of input elements.\n\n\n\u003cprocedure\u003e(char CHAR) =\u003e MATCHER\u003c/procedure\u003e\n\nMatches a single character.\n\n\u003cprocedure\u003e(set CHAR-SET) =\u003e MATCHER\u003c/procedure\u003e\n\nMatches any of a SRFI-14 set of characters. 
\n\n\u003cprocedure\u003e(range CHAR CHAR) =\u003e MATCHER\u003c/procedure\u003e\n\nMatches a range of characters. Analogous to a character class `[]`\nin traditional regular expressions.\n\n\u003cprocedure\u003e(lit STRING) =\u003e MATCHER\u003c/procedure\u003e\n\nMatches the literal string `STRING`.\n\n### Convenience procedures\n\nThese procedures are built from the basic procedures and are provided\nfor convenience.\n\n\u003cprocedure\u003e(try PROC) =\u003e PROC\u003c/procedure\u003e\n\nConverts a binary predicate procedure to a binary procedure that\nreturns its right argument when the predicate is true, and false\notherwise.\n\n\u003cprocedure\u003e(lst MATCHER-LIST) =\u003e MATCHER\u003c/procedure\u003e\n\nConstructs a matcher for the sequence of matchers in `MATCHER-LIST`.\n\n\u003cprocedure\u003e(pass) =\u003e MATCHER\u003c/procedure\u003e\n\nThis matcher returns without consuming any input.\n\n\u003cprocedure\u003e(pos MATCHER) =\u003e MATCHER\u003c/procedure\u003e\n\nPositive closure. Analogous to `+`.\n\n\u003cprocedure\u003e(opt MATCHER) =\u003e MATCHER\u003c/procedure\u003e\n\nOptional pattern. 
Analogous to `?`.\n\n\u003cprocedure\u003e(bind F P) =\u003e MATCHER\u003c/procedure\u003e\n\nGiven a rule `P` and function `F`, returns a matcher that first\napplies `P` to the input stream, then applies `F` to the returned\nlist of consumed tokens, and returns the result and the remainder of\nthe input stream.\n\nNote: this combinator will signal failure if the input stream is\nempty.\n\n\u003cprocedure\u003e(bind* F P) =\u003e MATCHER\u003c/procedure\u003e\n\nThe same as `bind`, but will signal success if the input stream is\nempty.\n\n\u003cprocedure\u003e(rebind F G P) =\u003e MATCHER\u003c/procedure\u003e\n\nGiven a rule `P` and procedures `F` and `G`, returns a matcher\nthat first applies `F` to the input stream, then applies `P` to\nthe resulting stream, then applies `G` to the resulting list of\nconsumed elements and returns the result along with the remainder of\nthe input stream.\n\nNote: this combinator will signal failure if the input stream is\nempty.\n\n\u003cprocedure\u003e(rebind* F G P) =\u003e MATCHER\u003c/procedure\u003e\n\nThe same as `rebind`, but will signal success if the input stream is\nempty.\n\n\u003cprocedure\u003e(drop P) =\u003e MATCHER\u003c/procedure\u003e\n\nGiven a rule `P`, returns a matcher that always returns an empty\nlist of consumed tokens when `P` succeeds.\n\n### Lexer procedure\n\n\u003cprocedure\u003e(lex MATCHER ERROR STRING) =\u003e CHAR-LIST\u003c/procedure\u003e\n\n`lex` takes a pattern and a string, turns the string into a list of\nstreams (containing one stream), applies the pattern, and returns the\nfirst possible match. Argument `ERROR` is a single-argument\nprocedure called when the pattern does not match anything.\n\n## Examples\n\n### A pattern to match floating point numbers\n\n```scheme\n\n;;  A pattern to match floating point numbers. \n;;  \"-\"?(([0-9]+(\\\\.[0-9]+)?)|(\\\\.[0-9]+))([eE][+-]?[0-9]+)? 
\n\n(define numpat\n  (let* ((digit        (range #\\0 #\\9))\n\t (digits       (pos digit))\n\t (fraction     (seq (char #\\.) digits))\n\t (significand  (bar (seq digits (opt fraction)) fraction))\n\t (exp          (seq (set \"eE\") (seq (opt (set \"+-\")) digits)))\n\t (sign         (opt (char #\\-))))\n    (seq sign (seq significand (opt exp)))))\n \n (define (err s)\n  (print \"lexical error on stream: \" s)\n  (list))\n\n (lex numpat err \"-123.45e-6\")\n```\n\n### A pattern to match floating point numbers and construct user-defined lexer state\n\n\n```scheme\n\n(define (collect cs) \n  (let loop ((cs cs) (ax (list)))\n    (cond ((null? cs)         `(,(list-\u003estring ax)))\n\t  ((atom? (car cs))   (loop (cdr cs) (cons (car cs) ax)))\n\t  (else               (cons (list-\u003estring ax) cs)))))\n\n(define (make-exp x)\n  (or (and (pair? x) \n\t   (let ((x1 (collect x)))\n\t     (list `(exp . ,x1)))) x))\n\n(define (make-significand x)\n  (or (and (pair? x) \n\t   (let ((x1 (collect x)))\n\t     (cons `(significand ,(car x1)) (cdr x1)))) x))\n\n(define (make-sign x)\n  (or (and (pair? x) \n\t   (let ((x1 (collect x)))\n\t     (cons `(sign ,(car x1)) (cdr x1)))) x))\n\n(define (check s) (lambda (s1) (if (null? s1) (err s) s1)))\n\n(define bnumpat \n  (let* ((digit        (range #\\0 #\\9))\n\t (digits       (star digit))\n\t (fraction     (seq (char #\\.) 
digits))\n\t (significand  (bar (seq digits (opt fraction)) fraction))\n\t (exp          (seq (set \"eE\") (seq (opt (set \"+-\")) digits)))\n\t (sign         (opt (char #\\-)) )\n\t (pat          (seq (bind make-sign sign) \n\t\t\t    (seq (bind make-significand significand)\n\t\t\t\t (bind make-exp (opt exp))))))\n    pat))\n\n(define (num-parser s) (car (lex bnumpat err s)))\n\n(num-parser \"-123.45e-6\")\n\n```\n\n## Version History\n\n* 8.2 Removed yasos dependency [thanks to Noel Cragg]\n* 8.1 Ported to CHICKEN 5 and yasos collections interface\n* 7.1 Bug fix in bind*  [thanks to Peter Bex]\n* 7.0 Added bind* and rebind* variants of bind and rebind [thanks to Peter Bex]\n* 6.1-6.2 Corrected behavior of the tok combinator so that the failure continuation is invoked upon end-of-input [thanks to Chris Salch]\n* 6.0 Using utf8 for char operations\n* 5.2 Ensure test script returns proper exit status\n* 5.0-5.1 Added error continuation to the matcher interface and eliminated multiple stream matching\n* 4.0 Implemented typeclass interface for abstracting over input sequences\n* 3.8 Added procedure `star*` (greedy Kleene closure matching)\n* 3.6 Added procedure redo [thanks to Christian Kellermann]\n* 3.5 Bug fixes in bind [reported by Peter Bex]\n* 3.3 Bug fixes in stream comparison\n* 3.2 Improved input stream comparison procedures\n* 3.1 Added rebind combinator and stream-unfold procedure \n* 3.0 Added an extension mechanism for input streams of different\n  types (to be elaborated and documented in subsequent versions).\n* 2.6 Added bind and drop combinators\n* 2.5 The seq combinator checks whether the first parser in the sequence has failed\n* 2.4 Added (require-library srfi-1); using lset\u003c= instead of equal? 
in star\n* 2.3 Bug fix in procedure range; added procedure cps-table\n* 2.2 Bug fix in procedure star\n* 2.1 Added procedure lst\n* 2.0 Core procedures rewritten in continuation-passing style\n* 1.5 Using (require-extension srfi-1)\n* 1.4 Ported to Chicken 4\n* 1.2 Added procedures try and tok (supersedes pred)\n* 1.0 Initial release\n\n## License\n\nBased on the [SML lexer generator](http://www.standarddeviance.com/projects/combinators/combinators.html) by Thant Tessman.\n\u003e\n\u003e  Copyright 2009-2021 Ivan Raikov.\n\u003e \n\u003e \n\u003e  This program is free software: you can redistribute it and/or modify\n\u003e  it under the terms of the GNU General Public License as published by\n\u003e  the Free Software Foundation, either version 3 of the License, or\n\u003e  (at your option) any later version.\n\u003e \n\u003e  This program is distributed in the hope that it will be useful, but\n\u003e  WITHOUT ANY WARRANTY; without even the implied warranty of\n\u003e  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU\n\u003e  General Public License for more details.\n\u003e \n\u003e  A full copy of the GPL license can be found at\n\u003e  \u003chttp://www.gnu.org/licenses/\u003e.\n\u003e\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Firaikov%2Fchicken-lexgen","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Firaikov%2Fchicken-lexgen","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Firaikov%2Fchicken-lexgen/lists"}