https://github.com/antononcube/raku-text-subparsers
Raku package for extracting and processing interpretable substrings in texts.
- Host: GitHub
- URL: https://github.com/antononcube/raku-text-subparsers
- Owner: antononcube
- License: artistic-2.0
- Created: 2023-07-24T19:44:47.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2023-09-03T22:34:48.000Z (over 1 year ago)
- Last Synced: 2024-11-05T18:55:54.337Z (3 months ago)
- Topics: interpreters, large-language-models, llm, parsers, raku, rakulang
- Language: Raku
- Homepage: https://raku.land/zef:antononcube/Text::SubParsers
- Size: 53.7 KB
- Stars: 1
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README-work.md
- License: LICENSE
# Text::SubParsers

Raku package for extracting and processing interpretable substrings in texts.

The primary motivation for creating this package is the post-processing of the outputs of
Large Language Models (LLMs), [AA1, AAp1, AAp2, AAp3].

## Installation

From the Zef ecosystem:

```
zef install Text::SubParsers
```

From GitHub:

```
zef install https://github.com/antononcube/Raku-Text-SubParsers.git
```

------

## Usage examples
### Date extractions
Here we extract dates from a text:
```perl6
use Text::SubParsers;
my $res = "Oppenheimer's birthday is April 22, 1905 or April 2, 1905, as far as I know.";

Text::SubParsers::Core.new('DateTime').subparse($res).raku;
```

Compare with the result of the `parse` method over the same text:

```perl6
say Text::SubParsers::Core.new('DateTime').parse($res);
```

Here are the results of both `subparse` and `parse` on a string that is a valid date specification:

```perl6
Text::SubParsers::Core.new('DateTime').subparse('April 22, 1905');
```

```perl6
Text::SubParsers::Core.new('DateTime').parse('April 22, 1905');
```

### Sub-parsing with user supplied subs

Instead of using `Text::SubParsers::Core.new`, the functions `sub-parser` and `exact-parser`
can be used.

Here is an example that shows:

- Invocation of `sub-parser`
- (Sub-)parsing with a user supplied function (sub)

```perl6
sub known-cities(Str $x) {
    $x ∈ ['Seattle', 'Chicago', 'New York', 'Sao Paulo', 'Miami', 'Los Angeles'] ?? $x.uc !! Nil
}

sub-parser(&known-cities).subparse("
1. New York City, NY - 8,804,190
2. Los Angeles, CA - 3,976,322
3. Chicago, IL - 2,746,388
4. Houston, TX - 2,304,580
5. Philadelphia, PA - 1,608,162
6. San Antonio, TX - 1,5
")
```

Here is the "full form" of the last result:

```perl6
_.raku
```

### Sub-parsing with `WhateverCode`

With the parser spec `WhateverCode`, an attempt is made to extract dates, JSON expressions, numbers, and Booleans (in that order).

Here is an example:

```perl6
sub-parser(WhateverCode).subparse('
Is it true that the JSON expression {"date": "2023-03-08", "rationalNumber": "11/3"} contains the date 2023-03-08 and the rational number 11/3?
').raku
```

### Different types of input

The input given to the sub-parsers can be a:
- String
- Array of strings
- Map with string values

Here is an example with an array of strings:

```perl6
sub-parser(WhateverCode).subparse(['{a:3, y:45}', "2023-08-06", "Mass 1,503lbs"]).raku
```

Here is an example with a Map:

```perl6
sub-parser('JSON').subparse({1 => '{ "ui" : 3, "io" : 78}', 2 => '{ "GA" : 34, "CA" : 178}'}).raku
```

------

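
The three input shapes listed under "Different types of input" (string, array of strings, map with string values) can be handled uniformly with multiple dispatch. The following is a minimal sketch in plain Raku, not the package's actual implementation: `apply-to-input` and `&word-count` are hypothetical names introduced only for illustration.

```raku
# Hypothetical helper; Text::SubParsers may implement this differently.
# One candidate per input shape: string, array of strings, map with string values.
multi sub apply-to-input(&parser, Str $text) { parser($text) }
multi sub apply-to-input(&parser, @texts) {
    @texts.map(-> $t { apply-to-input(&parser, $t) }).Array
}
multi sub apply-to-input(&parser, %texts) {
    %texts.map(-> $p { $p.key => apply-to-input(&parser, $p.value) }).Hash
}

# A toy "parser" that counts the words in a string.
my &word-count = -> Str $s { $s.words.elems };

say apply-to-input(&word-count, 'Mass 1,503lbs');
say apply-to-input(&word-count, ['a b', 'c d e']);
say apply-to-input(&word-count, { x => 'one two', y => 'three' });
```

The result keeps the shape of the input: a single value for a string, an array for an array, and a hash with the same keys for a map.
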
## Failed parsing

If the given texts cannot be parsed, `Failure` objects are returned.
The payload of the failure's `Exception` object can then be examined to see the inputs given to the sub-parsers:

```perl6
my $fres = sub-parser(DateTime).subparse('Some date [1930, 2, 14].');
$fres.raku
```

Here is the structure of the exception's payload:

```perl6
$fres.exception.payload
```

Using a *soft* `Exception` (i.e. a `Failure` object) is useful when
(i) the sub-parsing is part of a pipeline of operations *and*
(ii) the input to the sub-parser is "hard to compute" (the result of a lengthy or expensive computation).
Instead of just giving a message such as "cannot parse", the returned `Failure` object
allows examination of both the input and the error.

------

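
The soft-failure pattern described in "Failed parsing" can be sketched in plain Raku. This is only an illustration under stated assumptions, not the package's implementation: `parse-iso-date` is a hypothetical stand-in for a sub-parser, and the payload keys `input` and `error` are assumed names.

```raku
# Hypothetical stand-in for a sub-parser; not part of Text::SubParsers.
sub parse-iso-date(Str $text) {
    # Date.new throws on malformed input; `try` turns that into an undefined value.
    my $date = try Date.new($text);
    return $date if $date.defined;

    # Return a soft failure that carries the original input in its payload.
    fail X::AdHoc.new(payload => %( input => $text, error => 'not an ISO 8601 date' ));
}

my $res = parse-iso-date('Some date [1930, 2, 14].');

# `without` tests definedness, which marks the Failure as handled
# and makes it available as the topic.
without $res {
    my $payload = .exception.payload;
    say 'Could not parse: ' ~ $payload<input>;
    say 'Reason: ' ~ $payload<error>;
}
```

Because the `Failure` is returned rather than thrown, a pipeline can decide later whether to inspect the payload, retry with another parser, or rethrow.
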
## Processing LLM outputs

As mentioned above, the primary motivation for creating this package is the post-processing of the outputs of
Large Language Models (LLMs), [AA1, AAp1, AAp2, AAp3].

Here is an example of creating an LLM function and invoking it over a string:

```perl6
use LLM::Functions;

my &fs = llm-function(
        {"What is the average speed of $_ ?"},
        llm-evaluator => llm-configuration(
                'PaLM',
                prompts => 'You are knowledgeable engineer and you give concise, numeric answers.'));

say &fs('car in USA highway');
```

Here is the corresponding interpretation using sub-parsers:

```perl6
sub-parser('Numeric').subparse(_.trim).raku;
```

Here is a more involved example in which:

1. An LLM is asked to produce a certain set of events in JSON format
2. The JSON fragment of the result is parsed
3. The obtained list of hashes is transformed into a [Mermaid-JS timeline diagram](https://mermaid.js.org/syntax/timeline.html)

```perl6
my &ft = llm-function(
        {"What are the $^a most significant events of $^b? Give the answer with date-event pairs in JSON format."},
        form => sub-parser('JSON'),
        llm-evaluator => llm-configuration('PaLM', max-tokens => 500));

my @ftRes = |&ft(9, 'WWI');
@ftRes = @ftRes.grep({ $_ !~~ Str });
```

```perl6, output.lang=mermaid, output.prompt=NONE
my @timeline = ['timeline', 'title WW1 events'];
for @ftRes -> $record {
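    # Assumes each record is a hash with `date` and `event` keys; adjust to the keys the LLM actually returns.
    @timeline.append( "{$record<date>} : {$record<event>}");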
}
@timeline.join("\n\t")
```

------

## References
### Articles
[AA1] Anton Antonov,
["LLM::Functions"](https://rakuforprediction.wordpress.com/2023/07/21/llmfunctions/),
(2023),
[RakuForPrediction at WordPress](https://rakuforprediction.wordpress.com).

### Packages

[AAp1] Anton Antonov,
[LLM::Functions Raku package](https://github.com/antononcube/Raku-LLM-Functions),
(2023),
[GitHub/antononcube](https://github.com/antononcube).

[AAp2] Anton Antonov,
[WWW::OpenAI Raku package](https://github.com/antononcube/Raku-WWW-OpenAI),
(2023),
[GitHub/antononcube](https://github.com/antononcube).

[AAp3] Anton Antonov,
[WWW::PaLM Raku package](https://github.com/antononcube/Raku-WWW-PaLM),
(2023),
[GitHub/antononcube](https://github.com/antononcube).