{"id":23029681,"url":"https://github.com/antononcube/nlp-template-engine","last_synced_at":"2026-01-16T00:58:22.630Z","repository":{"id":48063678,"uuid":"402107765","full_name":"antononcube/NLP-Template-Engine","owner":"antononcube","description":"Natural Language Processing (NLP) template engine. (Using question answering systems and machine learning classifiers.)","archived":false,"fork":false,"pushed_at":"2024-05-02T22:48:05.000Z","size":4117,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":3,"default_branch":"main","last_synced_at":"2025-02-08T11:13:08.904Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Mathematica","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/antononcube.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2021-09-01T15:18:56.000Z","updated_at":"2024-05-02T22:48:08.000Z","dependencies_parsed_at":"2024-05-02T23:43:07.470Z","dependency_job_id":"aa81f0b7-f17a-47f5-b35b-da7f34c1be86","html_url":"https://github.com/antononcube/NLP-Template-Engine","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/antononcube%2FNLP-Template-Engine","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/antononcube%2FNLP-Template-Engine/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/antononcube%2FNLP-Template-Engine/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories
/antononcube%2FNLP-Template-Engine/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/antononcube","download_url":"https://codeload.github.com/antononcube/NLP-Template-Engine/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":246886494,"owners_count":20849873,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-12-15T14:16:43.466Z","updated_at":"2026-01-16T00:58:22.624Z","avatar_url":"https://github.com/antononcube.png","language":"Mathematica","funding_links":[],"categories":[],"sub_categories":[],"readme":"# NLP Template Engine\n\n## In brief\n\nThis repository provides implementation, data, and documentation of a Natural Language Processing (NLP) \n[Template Engine (TE)](https://en.wikipedia.org/wiki/Template_processor), [Wk1], \nthat utilizes\n[Question Answering Systems (QAS')](https://en.wikipedia.org/wiki/Question_answering), [Wk2],\nand Machine Learning (ML) classifiers.\n\nThe current implementation of the repository's NLP-TE is heavily based on the Wolfram Language (WL) \nbuilt-in function\n[`FindTextualAnswer`](https://reference.wolfram.com/language/ref/FindTextualAnswer.html), \n[WRI1].\n\nIn the future, we plan to utilize other -- both WL and non-WL -- QAS implementations.\n\n### Problem formulation\n\nWe want to have a system (i.e. TE) that:\n\n1. Generates relevant, correct, executable programming code based on natural language specifications of computational workflows\n\n2. Can automatically recognize the workflow types\n\n3. 
Can generate code for different programming languages and related software packages\n\nThe points above are given in order of importance; the most important are placed first.\n\n------\n\n## Examples\n\nInstall the package \n[NLPTemplateEngine.m](https://github.com/antononcube/NLP-Template-Engine/blob/main/Packages/WL/NLPTemplateEngine.m)\nwith:\n\n```mathematica\nImport[\"https://raw.githubusercontent.com/antononcube/NLP-Template-Engine/main/Packages/WL/NLPTemplateEngine.m\"]\n```\n\n### Latent Semantic Analysis\n\n```mathematica\nlsaCommand = \"Extract 20 topics from the text corpus aAbstracts using\nthe method NNMF. Show statistical thesaurus with the words neural, function, and notebook\";\nlsaRes = Concretize[\"LSAMon\", lsaCommand]\n```\n\n```mathematica\nHold[lsaObj =\n    LSAMonUnit[aAbstracts]⟹\n    LSAMonMakeDocumentTermMatrix[\"StemmingRules\" -\u003e Automatic, \"StopWords\" -\u003e Automatic]⟹\n    LSAMonEchoDocumentTermMatrixStatistics[\"LogBase\" -\u003e 10]⟹\n    LSAMonApplyTermWeightFunctions[\"GlobalWeightFunction\" -\u003e \"IDF\", \"LocalWeightFunction\" -\u003e \"None\", \"NormalizerFunction\" -\u003e \"Cosine\"]⟹\n    LSAMonExtractTopics[\"NumberOfTopics\" -\u003e 20, Method -\u003e \"NNMF\", \"MaxSteps\" -\u003e 16, \"MinNumberOfDocumentsPerTerm\" -\u003e 20]⟹\n    LSAMonEchoTopicsTable[\"NumberOfTerms\" -\u003e 10]⟹\n    LSAMonEchoStatisticalThesaurus[\"Words\" -\u003e {\"neural\", \"function\", \"notebook\"}];\n]\n```\n\n### Quantile Regression\n\n```mathematica\nqrCommand = \n  \"Compute quantile regression with probabilities 0.4 and 0.6, with interpolation order 2, for the dataset dfTempBoston.\";\nqrRes = Concretize[qrCommand]\n```\n\n```mathematica\nHold[\n qrObj = \n   QRMonUnit[dfTempBoston]⟹\n   QRMonEchoDataSummary[]⟹\n   QRMonQuantileRegression[12, {0.4, 0.6}, InterpolationOrder -\u003e 2]⟹\n   QRMonPlot[\"DateListPlot\" -\u003e False, PlotTheme -\u003e \"Detailed\"]⟹\n   QRMonErrorPlots[\"RelativeErrors\" -\u003e False, 
\"DateListPlot\" -\u003e False, PlotTheme -\u003e \"Detailed\"];\n]\n```\n\n### Random tabular data generation\n\n```mathematica\nrtdCommand =\n\"Create a random dataset with 30 rows, 8 columns, and 60 values using column names generator RandomWord.\";\nres = Concretize[\"RandomDataset\", rtdCommand]\n```\n\n```mathematica\nHold[ \n  ResourceFunction[\"RandomTabularDataset\"][{30, 8}, \n    \"ColumnNamesGenerator\" -\u003e RandomWord, \"Form\" -\u003e \"Wide\", \n    \"MaxNumberOfValues\" -\u003e 60, \"MinNumberOfValues\" -\u003e 60, \n    \"RowKeys\" -\u003e False]\n]\n```\n\n------\n\n## Interactive interface\n\nHere is an interactive interface that gives \"online\" access to the functionalities discussed: \n[\"DSL evaluations interface\"](https://antononcube.shinyapps.io/DSL-evaluations/).\n\n[![DSL-evaluations-interface-with-QR-spec-for-QAS](./Documents/Diagrams/General/DSL-evaluations-interface-with-QR-spec-for-QAS.png)](https://antononcube.shinyapps.io/DSL-evaluations/)\n\n(In order to try out the repository's TE, the \"Question Answering System\" radio button has to be selected.)\n\n------\n\n## How it works?\n\nThe following flowchart shows the series of steps the NLP Template Engine takes to process a computation specification and execute code to obtain results:\n\n```mermaid\nflowchart TD\n  spec[/Computation spec/] --\u003e workSpecQ{Is workflow type\u003cbr\u003especified?}\n  workSpecQ --\u003e |No| guess[[Guess relevant\u003cbr\u003eworkflow type]]\n  workSpecQ --\u003e|Yes| raw[Get raw answers]\n  guess -.- classifier{{Classifier:\u003cbr\u003etext to workflow type}}\n  guess --\u003e raw\n  raw --\u003e process[Process raw answers]\n  process --\u003e template[Complete\u003cbr\u003ecomputation\u003cbr\u003etemplate]\n  template --\u003e execute[/Executable code/]\n  execute --\u003e results[/Computation results/]\n\n  subgraph questionSystem[\"Question answering system\"]\n    neuralNet{{Neural network}} -.-\u003e find[[FindTextualAnswer]]\n  
end\n\n  find --\u003e raw\n  raw --\u003e find\n  template -.- compData[(Computation\u003cbr\u003etemplates\u003cbr\u003edata)]\n  compData -.- process\n\n  classDef highlighted fill:Salmon,stroke:Coral,stroke-width:2px;\n  class spec,results highlighted\n```\n\nHere's the narrative for each component and the flow of processes:\n\n1. **Computation Spec**: The process begins with a computation specification, which is a plan or requirement for a computation task.\n\n2. **Workflow Type Decision**: The diagram queries whether the workflow type is specified. If the workflow type is not specified, the system proceeds to guess the relevant workflow type.\n\n3. **Guess Workflow Type**: If the workflow type needs to be guessed the type is then classified by a classifier which categorizes the text into a specific workflow type. \n   \n4. **Raw Answers**:\n   - Whether the workflow type is specified directly or guessed, the system proceeds to acquire raw answers.\n   - This node acts as a central point where either directly specified or guessed information leads to the collection of raw answers through an iterative process with `FindTextualAnswer` within the \"Question answering system\".\n\n5. **Question Answering System**:\n   - The \"Question answering system\" component has a neural network to process data, and its output is utilized to find textual answers.\n\n6. **Process Raw Answers**: The raw answers are then processed to extract or format the necessary information for further steps.\n\n7. **Complete Computation Template**:\n   - The processed data is used to complete a computation template, which is crucial for preparing the executable code.\n   - This template completion step accesses a database of computation templates data.\n\n8. **Executable Code and Results**:\n   - The completed template leads to the generation of executable code, which then *can be* run to produce computation results.\n\n------\n\n## Bring your own templates\n\n0. 
Load the NLP-Template-Engine\n   [WL package](./Packages/WL/NLPTemplateEngine.m):\n\n```mathematica\nImport[\"https://raw.githubusercontent.com/antononcube/NLP-Template-Engine/main/Packages/WL/NLPTemplateEngine.m\"]\n```\n\n1. Get the \"training\" templates data (from CSV file you have created or changed) for a new workflow\n   ([\"SendMail\"](./TemplateData/dsQASParameters-SendMail.csv)):\n\n```mathematica\ndsSendMailTemplateEngineData = ResourceFunction[\"ImportCSVToDataset\"][\"https://raw.githubusercontent.com/antononcube/NLP-Template-Engine/main/TemplateData/dsQASParameters-SendMail.csv\"];\nDimensions[dsSendMailTemplateEngineData]\n\n(* {43, 5} *)\n```\n\n2. Add the ingested data for the new workflow (from the CSV file) into the NLP-Template-Engine:\n\n```mathematica\nNLPTemplateEngineAddData[dsSendMailTemplateEngineData] // Keys\n\n(* {\"Questions\", \"Templates\", \"Defaults\", \"Shortcuts\"} *)\n```\n\n3. Parse natural language specification with the newly ingested and onboarded workflow (\"SendMail\"):\n\n```mathematica\nConcretize[\"SendMail\", \"Send email to joedoe@gmail.com with content RandomReal[343], and the subject this is a random real call.\", PerformanceGoal -\u003e \"Speed\"]\n\n(* Hold[\n SendMail[\n  Association[\"To\" -\u003e {\"joedoe@gmail.com\"}, \n   \"Subject\" -\u003e \"a random real call\", \"Body\" -\u003e RandomReal, \n   \"AttachedFiles\" -\u003e None]]] *)\n```\n\n4. 
Experiment with running the generated code!\n\n------\n\n## References\n\n### Articles\n\n[JL1] Jérôme Louradour,\n[\"New in the Wolfram Language: FindTextualAnswer\"](https://blog.wolfram.com/2018/02/15/new-in-the-wolfram-language-findtextualanswer/),\n(2018),\n[blog.wolfram.com](https://blog.wolfram.com).\n\n[Wk1] Wikipedia entry, [Template processor](https://en.wikipedia.org/wiki/Template_processor).\n\n[Wk2] Wikipedia entry, [Question answering](https://en.wikipedia.org/wiki/Question_answering).\n\n### Functions, packages, repositories\n\n[AAr1] Anton Antonov,\n[DSL::Translators Raku package](https://github.com/antononcube/Raku-DSL-Translators),\n(2020-2024),\n[GitHub/antononcube](https://github.com/antononcube).\n\n[ECE1] Edument Central Europe s.r.o.,\n[https://cro.services](https://cro.services).\n\n[WRI1] Wolfram Research,\n[FindTextualAnswer]( https://reference.wolfram.com/language/ref/FindTextualAnswer.html),\n(2018),\n[Wolfram Language function](https://reference.wolfram.com), (updated 2020).\n\n### Videos\n\n[AAv1] Anton Antonov,\n[\"NLP Template Engine, Part 1\"](https://youtu.be/a6PvmZnvF9I),\n(2021),\n[YouTube/@AAA4Prediction](https://www.youtube.com/@AAA4Prediction).\n\n[AAv2] Anton Antonov,\n[\"Natural Language Processing Template Engine\"](https://www.youtube.com/watch?v=IrIW9dB5sRM) presentation given at WTC-2022,\n(2023),\n[YouTube/@Wolfram](https://www.youtube.com/@Wolfram).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fantononcube%2Fnlp-template-engine","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fantononcube%2Fnlp-template-engine","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fantononcube%2Fnlp-template-engine/lists"}