{"id":15168552,"url":"https://github.com/antononcube/raku-text-subparsers","last_synced_at":"2026-01-31T03:30:58.666Z","repository":{"id":183538672,"uuid":"670323545","full_name":"antononcube/Raku-Text-SubParsers","owner":"antononcube","description":"Raku package for extracting and processing of interpret-able sub-strings in texts.","archived":false,"fork":false,"pushed_at":"2023-09-03T22:34:48.000Z","size":55,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":2,"default_branch":"main","last_synced_at":"2025-04-09T15:50:48.336Z","etag":null,"topics":["interpreters","large-language-models","llm","parsers","raku","rakulang"],"latest_commit_sha":null,"homepage":"https://raku.land/zef:antononcube/Text::SubParsers","language":"Raku","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"artistic-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/antononcube.png","metadata":{"files":{"readme":"README-work.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-07-24T19:44:47.000Z","updated_at":"2024-06-20T04:13:02.000Z","dependencies_parsed_at":null,"dependency_job_id":"f764fd4b-5fc5-4ea0-8935-3f3e70cab6c6","html_url":"https://github.com/antononcube/Raku-Text-SubParsers","commit_stats":{"total_commits":43,"total_committers":1,"mean_commits":43.0,"dds":0.0,"last_synced_commit":"b1a4f4f1b2c73de1a8a76833a189d381836f9ce2"},"previous_names":["antononcube/raku-text-subparsers"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/antononcube%2FRaku-Text-SubParsers","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/an
tononcube%2FRaku-Text-SubParsers/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/antononcube%2FRaku-Text-SubParsers/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/antononcube%2FRaku-Text-SubParsers/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/antononcube","download_url":"https://codeload.github.com/antononcube/Raku-Text-SubParsers/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/antononcube%2FRaku-Text-SubParsers/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":259164660,"owners_count":22815400,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["interpreters","large-language-models","llm","parsers","raku","rakulang"],"created_at":"2024-09-27T06:22:24.603Z","updated_at":"2026-01-31T03:30:58.640Z","avatar_url":"https://github.com/antononcube.png","language":"Raku","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Text::SubParsers\n\nRaku package for extracting and processing of interpret-able sub-strings in texts.\n\nThe primary motivation for creating this package is the post-processing of the outputs of\nLarge Language Models (LLMs), [AA1, AAp1, AAp2, AAp3].\n\n## Installation\n\nFrom Zef ecosystem:\n\n```\nzef install Text::SubParsers\n```\n\nFrom GitHub:\n\n```\nzef install https://github.com/antononcube/Raku-Text-SubParsers.git\n```\n\n------\n\n## Usage examples\n\n### Date extractions\n\nHere we extract dates from a text:\n\n```perl6\nuse Text::SubParsers;\nmy 
$res = \"Oppenheimer's birthday is April 22, 1905 or April 2, 1905, as far as I know.\";\n\nText::SubParsers::Core.new('DateTime').subparse($res).raku;\n```\n\nCompare with the result of the `parse` method over the same text:\n\n```perl6\nsay Text::SubParsers::Core.new('DateTime').parse($res);\n```\n\nHere are the results of both `subparse` and `parse` on a string that is a valid date specification:\n\n```perl6\nText::SubParsers::Core.new('DateTime').subparse('April 22, 1905');\n```\n\n```perl6\nText::SubParsers::Core.new('DateTime').parse('April 22, 1905');\n```\n\n### Sub-parsing with user-supplied subs\n\nInstead of using `Text::SubParsers::Core.new`, the functions `sub-parser` and `exact-parser`\ncan be used.\n\nHere is an example that shows:\n- Invocation of `sub-parser`\n- (Sub-)parsing with a user-supplied function (sub)\n\n```perl6\nsub known-cities(Str $x) { \n    $x ∈ ['Seattle', 'Chicago', 'New York', 'Sao Paulo', 'Miami', 'Los Angeles'] ?? $x.uc !! Nil \n}\n\nsub-parser(\u0026known-cities).subparse(\"\n1. New York City, NY - 8,804,190\n2. Los Angeles, CA - 3,976,322\n3. Chicago, IL - 2,746,388\n4. Houston, TX - 2,304,580\n5. Philadelphia, PA - 1,608,162\n6. 
San Antonio, TX - 1,5\n\")\n```\n\nHere is the \"full form\" of the last result:\n\n```perl6\n_.raku\n```\n\n### Sub-parsing with `WhateverCode`\n\nWith the parser spec `WhateverCode`, an attempt is made to extract dates, JSON expressions, numbers, and Booleans (in that order).\nHere is an example:\n\n```perl6\nsub-parser(WhateverCode).subparse('\nIs it true that the JSON expression {\"date\": \"2023-03-08\", \"rationalNumber\": \"11/3\"} contains the date 2023-03-08 and the rational number 11/3?\n').raku\n```\n\n### Different types of input\n\nThe input given to the sub-parsers can be one of:\n\n- String\n- Array of strings\n- Map with string values\n\nHere is an example with an array of strings:\n\n```perl6\nsub-parser(WhateverCode).subparse(['{a:3, y:45}', \"2023-08-06\", \"Mass 1,503lbs\"]).raku\n```\n\nHere is an example with a Map:\n\n```perl6\nsub-parser('JSON').subparse({1 =\u003e '{ \"ui\" : 3, \"io\" : 78}', 2 =\u003e '{ \"GA\" : 34, \"CA\" : 178}'}).raku\n```\n\n\n------\n\n## Failed parsing\n\nIf the given texts cannot be parsed, `Failure` objects are returned.\nThe payload of the failure's `Exception` object can then be examined to see the inputs given to the sub-parsers:\n\n```perl6\nmy $fres = sub-parser(DateTime).subparse('Some date [1930, 2, 14].');\n$fres.raku\n```\n\nHere is the structure of the exception's payload:\n\n```perl6\n$fres.exception.payload\n```\n\nUsing a *soft* `Exception` (i.e. 
a `Failure` object) is useful when\n(i) the sub-parsing is part of a pipeline of operations *and*\n(ii) the input to the sub-parser is \"hard to compute\" (the result of a lengthy or expensive computation).\nInstead of just giving a message such as \"cannot parse\", the returned `Failure` object\nallows examination of the input and error.\n\n------\n\n## Processing LLM outputs\n\nAs mentioned above, the primary motivation for creating this package is the post-processing of the outputs of\nLarge Language Models (LLMs), [AA1, AAp1, AAp2, AAp3].\n\nHere is an example of creating an LLM function and invoking it over a string:\n\n```perl6\nuse LLM::Functions;\n\nmy \u0026fs = llm-function(\n        {\"What is the average speed of $_ ?\"},\n        llm-evaluator =\u003e llm-configuration(\n                'PaLM',\n                prompts =\u003e 'You are knowledgeable engineer and you give concise, numeric answers.'));\n\nsay \u0026fs('car in USA highway');\n```\n\nHere is the corresponding interpretation using sub-parsers:\n\n```perl6\nsub-parser('Numeric').subparse(_.trim).raku;\n```\n\nHere is a more involved example in which:\n\n1. An LLM is asked to produce a certain set of events in JSON format\n2. The JSON fragment of the result is parsed\n3. The obtained list of hashes is transformed into a [Mermaid-JS timeline diagram](https://mermaid.js.org/syntax/timeline.html)\n\n\n```perl6\nmy \u0026ft = llm-function(\n        {\"What are the $^a most significant events of $^b? 
Give the answer with date-event pairs in JSON format.\"},\n        form =\u003e sub-parser('JSON'),\n        llm-evaluator =\u003e llm-configuration('PaLM', max-tokens =\u003e 500));\n\nmy @ftRes = |\u0026ft(9, 'WWI');\n@ftRes = @ftRes.grep({ $_ !~~ Str });\n```\n\n```perl6, output.lang=mermaid, output.prompt=NONE\nmy @timeline = ['timeline', 'title WW1 events'];\nfor @ftRes -\u003e $record {\n    @timeline.append( \"{$record\u003cdate\u003e} : {$record\u003cevent\u003e}\");\n}\n@timeline.join(\"\\n\\t\")\n```\n\n------\n\n## References\n\n### Articles\n\n[AA1] Anton Antonov,\n[\"LLM::Functions\"](https://rakuforprediction.wordpress.com/2023/07/21/llmfunctions/),\n(2023),\n[RakuForPrediction at WordPress](https://rakuforprediction.wordpress.com).\n\n### Packages\n\n[AAp1] Anton Antonov,\n[LLM::Functions Raku package](https://github.com/antononcube/Raku-LLM-Functions),\n(2023),\n[GitHub/antononcube](https://github.com/antononcube).\n\n[AAp2] Anton Antonov,\n[WWW::OpenAI Raku package](https://github.com/antononcube/Raku-WWW-OpenAI),\n(2023),\n[GitHub/antononcube](https://github.com/antononcube).\n\n[AAp3] Anton Antonov,\n[WWW::PaLM Raku package](https://github.com/antononcube/Raku-WWW-PaLM),\n(2023),\n[GitHub/antononcube](https://github.com/antononcube).\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fantononcube%2Fraku-text-subparsers","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fantononcube%2Fraku-text-subparsers","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fantononcube%2Fraku-text-subparsers/lists"}