{"id":16312481,"url":"https://github.com/kiranandcode/html_gen","last_synced_at":"2025-04-22T11:53:49.541Z","repository":{"id":112325962,"uuid":"151776569","full_name":"kiranandcode/html_gen","owner":"kiranandcode","description":"A simple HTML templating engine built using literate programming","archived":false,"fork":false,"pushed_at":"2018-10-22T12:48:22.000Z","size":385,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":3,"default_branch":"master","last_synced_at":"2025-02-16T14:25:10.862Z","etag":null,"topics":["html","literate-programming","rust","template-engine"],"latest_commit_sha":null,"homepage":null,"language":"TeX","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/kiranandcode.png","metadata":{"files":{"readme":"readme.org","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2018-10-05T20:49:38.000Z","updated_at":"2022-10-11T00:23:32.000Z","dependencies_parsed_at":"2023-05-12T22:45:40.466Z","dependency_job_id":null,"html_url":"https://github.com/kiranandcode/html_gen","commit_stats":null,"previous_names":["kiranandcode/html_gen"],"tags_count":1,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kiranandcode%2Fhtml_gen","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kiranandcode%2Fhtml_gen/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kiranandcode%2Fhtml_gen/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kiranandcode%2Fhtml_gen/manifests","owner_url":"https://repos.ecosyste.ms/api/
v1/hosts/GitHub/owners/kiranandcode","download_url":"https://codeload.github.com/kiranandcode/html_gen/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":250237808,"owners_count":21397399,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["html","literate-programming","rust","template-engine"],"created_at":"2024-10-10T21:48:12.918Z","updated_at":"2025-04-22T11:53:49.522Z","avatar_url":"https://github.com/kiranandcode.png","language":"TeX","funding_links":[],"categories":[],"sub_categories":[],"readme":"* HtmlGen\nDoes the following sound familiar to you?\n\nA: \"All I want is to template some files for a static site.\"\n\nB: \"Oh, okay, that's simple, just install npm and the packages yarn, gulp, react, express.........\"\n\nA: \"...\"\n\nIntroducing HTMLGEN, a simple little standalone static templating engine for HTML.\nCoded using literate programming.\n\nNow with a 50-page book describing how to make your own HtmlGen!\n\nAvailable at: https://github.com/Gopiandcode/html_gen/raw/master/html_gen.pdf\n\n* Usage\nThe command line interface for the program is as follows.\n#+begin_src shell\nUsage:\n  htmlgen [OPTIONS] [BASEDIR]\n\nSimple templating engine for html documents\n\nPositional arguments:\n  BASEDIR               The project directory. If not specified, then --input,\n                        --template and --output flags must be given.\n\nOptional arguments:\n  -h,--help             Show this help message and exit\n  -o,--output OUTPUT    Directory for the output files to be saved. 
It defaults\n                        to BASEDIR/bin\n  -t,--template TEMPLATE\n                        Directory to be searched to find templates. It defaults\n                        to BASEDIR/template\n  -i,--input INPUT      Directory in which the source files to be compiled are\n                        located. It defaults to BASEDIR/bin\n  -e,--error ERROR      Fail on the first undefined parameter\n  -d,--default DEFAULT  Additional mapping for storing default values. If not \n                        specified, the environment variable GOP_HTML_DEFAULTS\n                        if defined, is used as a default.\n#+end_src\n\nTo use the program, first write a templated html file, with '{NAME}' for parameters:\n#+begin_src html \n\u003chtml\u003e\n\u003chead\u003e\n\u003ctitle\u003e{title}\u003c/title\u003e\n{style}\n\u003c/head\u003e\n\u003cbody\u003e\n\u003ch1\u003e{title}\u003c/h1\u003e\n\u003cp\u003e{body}\u003c/p\u003e\n\u003c/body\u003e\n\u003c/html\u003e\n#+end_src\nSave this in the template directory of your project - we'll store it in './template/blog/simple.gop'.\n\nThen construct a mapping file in the '.gop' format (whitespace insensitive):\n#+begin_src \n#+template: blog/simple.gop\ntitle: A statically generated site without any JS¬\nbody: \n    I built this site \u003cb\u003ewithout\u003c/b\u003e any JS baby.\n    I can put divs \u003cdiv\u003ewithin this templated content\u003c/div\u003e\n    without issue¬\nstyle:\n\u003clink ref=\"\" name=\"style\" /\u003e¬\n#+end_src\n\nThen we can run the application as \"html_gen .\"; this will produce the following html:\n#+begin_src html\n\u003chtml\u003e\n\u003chead\u003e\n\u003ctitle\u003eA statically generated site without any JS\u003c/title\u003e\n   \u003clink ref=\"\" name=\"style\" /\u003e\n\u003c/head\u003e\n\u003cbody\u003e\n\u003ch1\u003eA statically generated site without any JS\u003c/h1\u003e\n\u003cp\u003e\n    I built this site \u003cb\u003ewithout\u003c/b\u003e any JS baby.\n    I can put 
divs \u003cdiv\u003ewithin this templated content\u003c/div\u003e\n    without issue\u003c/p\u003e\n\u003c/body\u003e\n\u003c/html\u003e\n#+end_src\n\n* Preamble\nWhile the overall nature of this project is quite simple - just a bit of file loading and exports - we can leverage Rust's ecosystem to make our development a little easier.\n\n** Crates\nThe crates we'll be using are as follows:\n- *ArgParse* - This is a crate that I used a while back when making another command line application. It provides a very nice rustic interface over a library which produces command line interfaces compliant with most unix/linux standards.\n#+begin_src rust :tangle src/main.rs  :comments org\nextern crate argparse;\n#+end_src\n\n- *Regex* - We'll be taking advantage of this regex crate to make the parsing phase a little easier; while the stdlib provides some pretty useful string matching utilities, they don't quite match up to\nan old fashioned regex.\n#+begin_src rust :tangle src/main.rs  :comments org\nextern crate regex;\n#+end_src\n\n** Standard Library Imports\nWe'll be using the path utilities provided by the standard library to help us navigate the filesystem in a cross platform way.\n#+begin_src rust :tangle src/main.rs :comments org\nuse std::env;\nuse std::path;\nuse std::fs::File;\nuse std::io::Read;\nuse std::path::Path;\nuse std::io::Write;\n#+end_src\n** Module structure\nWe'll be splitting up our codebase as follows:\n\n#+begin_src rust :tangle src/main.rs :noweb yes :comments org\n\u003c\u003cmodules\u003e\u003e\n#+end_src\n\n* Command Line Interface\nClearly this project is going to be a command line application, as the static generator will need to parse a document and construct the components.\n\nUsing argparse - as imported in the preamble, we'll design a sweet and sexy interface to access our application. 
The main actions we'll allow a user to perform using this application will be as follows:\n- *specify output folder* - by default the output of the compiled files are placed in ~./bin/~ dir, which is made if it does not exist.\n- *specify template folder* - within a non-templated file, when a template reference is used, by default the application searches the \n ~./template/~ dir to resolve these references.\n- *specify input folder* - by default the program searches ~./src/~ for the source files to be compiled\n\n#+begin_src rust :tangle src/main.rs :comments org :noweb yes\nfn main() {\n \u003c\u003chigh level interface\u003e\u003e\n}\n#+end_src\n\nWe'll set up some initial variables to hold the parameters from the command line.\n#+name: high level interface\n#+begin_src rust :comments noweb\nlet mut output_path = String::from(\"\");\nlet mut template_path = String::from(\"\");\nlet mut input_path = String::from(\"\");\nlet mut base_dir : Option\u003cString\u003e = None;\n#+end_src\n\nWe'll also need to setup an error strategy - this will require some additional data structures, so we'll leave it to the end.\n#+name: high level interface\n#+begin_src rust :comments noweb :noweb yes\n\u003c\u003chigh level error strategy\u003e\u003e\n#+end_src\n\n\n\nUsing argparse, we can implement this cmdline interface as follows:\n#+name: high level interface\n#+begin_src rust :comments noweb :noweb yes\n    let mut help_string : Vec\u003cu8\u003e = Vec::new();\n    {\n        let mut ap = argparse::ArgumentParser::new();\n        ap.set_description(\"Simple templating engine for html documents\");\n        ap.refer(\u0026mut output_path)\n        .add_option(\u0026[\"-o\",\"--output\"], \n                    argparse::Store, \n                    \"Directory for the output files to be saved. 
It defaults to BASEDIR/bin\");\n\n        ap.refer(\u0026mut template_path)\n        .add_option(\u0026[\"-t\",\"--template\"], \n                    argparse::Store, \n                    \"Directory to be searched to find templates. It defaults to BASEDIR/template\");\n\n        ap.refer(\u0026mut input_path)\n        .add_option(\u0026[\"-i\",\"--input\"], \n                    argparse::Store, \n                    \"Directory in which the source files to be compiled are located. It defaults to BASEDIR/bin\");\n        \n        ap.refer(\u0026mut base_dir)\n        .add_argument(\"BASEDIR\", \n              argparse::StoreOption, \n              \"The project directory. If not specified, then --input, --template and --output flags must be given. \");\n\n        \u003c\u003chigh level error args\u003e\u003e\n        \n        ap.print_help(\"htmlgen\", \u0026mut help_string);\n\n        ap.parse_args_or_exit();\n    }\n#+end_src\n\nBefore we do anything, let's get a copy of the help string generated by ~argparse~ for the program.\n#+name: high level interface\n#+begin_src rust :comments noweb :noweb yes\nlet help_string = unsafe { String::from_utf8_unchecked(help_string) };\n#+end_src\n\nAdditionally, we'll convert the unwritten values to options.\n#+name: high level interface\n#+begin_src rust :comments noweb :noweb yes\nlet mut output_path = if output_path.is_empty() { None } else { Some(output_path) };\nlet mut template_path = if template_path.is_empty() { None } else { Some(template_path) };\nlet mut input_path = if input_path.is_empty() { None } else { Some(input_path) };\n#+end_src\n\nFollowing this, we  do some error checking to ensure that everything is suitably specified.\nIf the base directory is not specified, then all other parameters must be specified - otherwise we exit.\n#+name: high level interface\n#+begin_src rust :comments noweb :noweb yes\nif base_dir.is_none() \u0026\u0026 (output_path.is_none() || template_path.is_none() || 
input_path.is_none()) {\n   println!(\"{}\", help_string);\n   ::std::process::exit(-1);\n}\n#+end_src\n\nWith that done, we can safely extract the paths.\nAs specified, our output, template and input paths take default values from the supplied ~BASEDIR~.\n#+name: high level interface\n#+begin_src rust :comments noweb :noweb yes\nlet (output_path, template_path, input_path) = if let Some(bd) = base_dir {\n    let bd = Path::new(\u0026bd);\n    let error_string = format!(\"{:?} is not a valid path\", bd);\n    let alt_output_path = bd.join(Path::new(\u0026\"bin\")).to_str().expect(\u0026error_string).to_owned();\n    let alt_template_path = bd.join(Path::new(\u0026\"template\")).to_str().expect(\u0026error_string).to_owned();\n    let alt_input_path = bd.join(Path::new(\u0026\"src\")).to_str().expect(\u0026error_string).to_owned();\n\n    let output_path = output_path.unwrap_or_else(|| alt_output_path );\n    let template_path = template_path.unwrap_or_else(|| alt_template_path );\n    let input_path = input_path.unwrap_or_else(|| alt_input_path );\n\n    (output_path, template_path, input_path)\n} else {\n    (output_path.unwrap(), template_path.unwrap(), input_path.unwrap())\n};\n#+end_src\n\n* Core Logic\nNow we've obtained the directory for the files to be stored, we can move on to the main logic of the program.\nFundamentally, the logic of this program can be split into three main components:\n - Recursively descending the source directory, keeping track of the file structure.\n#+name: modules \n#+begin_src rust :comments noweb\nmod crawler;\n#+end_src\n - Extracting the data from a given file\n#+name: modules \n#+begin_src rust :comments noweb\nmod parser;\n#+end_src\n - Generating a compiled html file from the template and saving it to a folder\n#+name: modules\n#+begin_src rust :comments noweb\nmod generator;\n#+end_src \n\n\n\n#+name: high level interface\n#+begin_src rust :comments noweb\nlet output_directory = Path::new(\u0026output_path);\nlet input_directory = 
Path::new(\u0026input_path);\nlet template_directory = Path::new(\u0026template_path);\n#+end_src\n\nThus the high level execution of the system is as follows.\nFirst we update the error strategy.\n#+name: high level interface\n#+begin_src rust :comments noweb :noweb yes\n\u003c\u003chigh level error update\u003e\u003e\n#+end_src\n\nThen we run the crawler and print the output. Done.\n#+name: high level interface\n#+begin_src rust :comments noweb\nprintln!(\"{:?}\", crawler::crawl_directories(\u0026output_directory, \u0026input_directory, \u0026template_directory, \u0026err_strat));\n#+end_src\n\n\n\n** Parser Logic\nBefore we begin, we'll need the following packages in our parser:\n#+begin_src rust :tangle src/parser.rs :noweb yes :comments org\nuse std::collections::HashMap;\nuse regex::Regex;\n\u003c\u003cstructures\u003e\u003e\n#+end_src\nOnce again, our core specification for the parser is to extract a set of key value pairs. Our syntax will be of the following form:\n#+begin_src \nID := (Sigma/{:, (, )})+\nINTRO := #+template: Sigma+\\n\nMAPPING := ID:  ((SIGMA/{¬})|\\¬)* ¬\nDOCUMENT := INTRO MAPPING*\n#+end_src\nOur parser will take in a string (the contents of the file), and return either a hashmap of values and a template name, or an error.\n#+begin_src rust :tangle src/parser.rs :noweb yes :comments org\n\u003c\u003csource parsing utility functions\u003e\u003e\n\npub fn parse_source_string(source: \u0026str) \n   -\u003e Result\u003c(String, HashMap\u003cString,String\u003e),ParseError\u003e {\n\u003c\u003csource parsing regexes\u003e\u003e\n\u003c\u003csource parsing code\u003e\u003e\n}\n\n#[cfg(test)]\nmod test {\n   use super::*;\n\n  \u003c\u003csource parsing tests\u003e\u003e\n}\n#+end_src\nWhere a parsing error will be one of the following:\n - **Template not found** - if the source file does not specify a template to be loaded\n - **Invalid identifier** - if an identifier contains an invalid character.\n - **Unterminated Body** - if a body does 
not have a valid terminator.\n#+name: structures\n#+begin_src rust :comments noweb\n#[derive(Debug)]\npub enum ParseError {\n   TemplateNotFound,\n   InvalidIdentifier,\n   UnterminatedBody\n}\n#+end_src\nFor simplicity, we're making the parser as general as possible and opting to make failure as unlikely as possible.\n\nTo do the parsing, first we start off by consuming the template directive, and failing if not present.\n\nFirst, we check that the template contains a template directive - we're leaving resolving the template to a file to a later point.\n#+name: source parsing code\n#+begin_src rust :comments noweb\nif !source.trim_left().starts_with(\"#+template:\") {\n   return Err(ParseError::TemplateNotFound);\n}\n#+end_src\n\nThis means that if a source does not start with a directive, its parsing will fail:\n#+name: source parsing tests\n#+begin_src rust :comments noweb\n#[test]\nfn must_start_with_template_directive() {\n   assert!(parse_source_string(\"temp-justkidding\\n id:\\n #+template:\\n\").is_err());\n}\n#+end_src\n\nAfter this check, we can safetly consume the first part of the string.\n#+name: source parsing code\n#+begin_src rust  :comments noweb\nlet source = source.trim_left().split_at(11).1;\n#+end_src\n\nNext, let's retrieve the actual template name - failing if it was not provided.\n#+name: source parsing code\n#+begin_src rust :comments noweb\nlet (raw_template_name, remaining_string) = split_at_pattern(source, \"\\n\");\nlet template_name = raw_template_name.trim();\nif template_name.is_empty() {\n   return Err(ParseError::TemplateNotFound);\n}\n#+end_src\n\nThis also means that if a source does not provide a template name its parsing will fail:\n#+name: source parsing tests\n#+begin_src rust :comments noweb\n#[test]\nfn must_provide_template_name() {\n    assert!(parse_source_string(\"#+template: example\\n\").is_ok());\n    assert!(parse_source_string(\"#+template:\\n\").is_err());\n    assert!(parse_source_string(\"#+template:    
\\n\").is_err());\n    assert!(parse_source_string(\"#+template:   \\n  \\n\").is_err());\n    assert!(parse_source_string(\"#+template:   \\t  \\n\").is_err());\n}\n#+end_src\n\n\nNow, our remaining task is to simply iterate through the remaining ~ID: DATA~ pairs, and accumulate these values into a hashmap - let's begin\nby setting up an initial hashmap to store the files.\n#+name: source parsing code\n#+begin_src rust :comments noweb\nlet mut data : HashMap\u003cString, String\u003e = HashMap::new();\n#+end_src\nNext, we'll define a simple loop to do the accumulation - it will use a reference to the hashmap, and the source:\n#+name: source parsing code\n#+begin_src rust :comments noweb :noweb yes\nlet mut completed = false;\nlet mut source = remaining_string;\nlet mut data = data;\n\nwhile !completed {\n   \u003c\u003csource pairs loop\u003e\u003e\n}\n#+end_src\nTo extract the keys and bodies, we'll be using a regex - it checks that the start of the string consists of non terminator characters,\nfollowed by a colon.\n#+name: source parsing regexes\n#+begin_src rust :comments noweb :noweb yes\nlet key_regex = Regex::new(\"^[^¬:{}\\\\\\\\]*:\").unwrap();\n#+end_src\n\nNow, inside the loop, we'll use the regex to extract the key values - for this purpose, we'll define a custom ~split_by_regex~ function,\nwhich operates like the ~split_at_pattern~ function but uses the first match of a regex to split the input.\n\n#+name: source parsing utility functions\n#+begin_src rust :comments noweb\nfn split_at_regex\u003c'a\u003e(string: \u0026'a str, pat: \u0026Regex) -\u003e (\u0026'a str, \u0026'a str) {\n  if let Some(m) = pat.find(string) {\n     string.split_at(m.end())\n  } else {\n     (\u0026\"\", string)\n  }\n}\n#+end_src\nNow, using this function, we can implement the key extraction.\n\n#+name: source pairs loop\n#+begin_src rust :comments noweb\nlet (raw_key_name, remaining_string) = split_at_regex(source, \u0026key_regex);\nlet key_name = 
raw_key_name.trim();\nsource = remaining_string;\n#+end_src\n\nNow due to the way we're extracting the values, bad input may lead to an incorrect parse - we'll try to avoid this by printing an error when the IDs are wrong:\n#+name: source pairs loop\n#+begin_src rust :comments noweb\nif key_name.len() == 0 {\n  eprintln!(\"Invalid parse, found empty/malformed ID tag\");\n  return Err(ParseError::InvalidIdentifier);\n}\n#+end_src\nDue to the way we extract the ids, we also end up bringing the colon along as well. Let's just remove it before proceeding:\n#+name: source pairs loop\n#+begin_src rust :comments noweb\nlet mut key_name = key_name.to_string();\nkey_name.pop();\nlet key_name = key_name.trim();\n#+end_src\n\nNow we can move on to extracting the data. Let's start by defining a regular expression to isolate the specific syntax we wish to capture.\n#+name: source parsing regexes\n#+begin_src rust :comments noweb\nlet data_regex = Regex::new(\"^(\\\\\\\\¬|([^¬\\\\\\\\]|\\\\\\\\[^¬])*)*¬\").unwrap();\n#+end_src\n\nThe regex we're using can be explained as follows; the outermost kleene closure captures the main constraint that the data should start from the start of the string and end at the first occurrence\nof a terminating character.\n#+begin_src regex\n^ INTERNALS *¬\n#+end_src\n\nNext, for the contents of a body, we have to capture 2 main cases:\n- When the character is normal and uninteresting\n- When the character is an escaped terminator.\n#+begin_src regex\nINTERNALS ::= (ESCAPED_TERMINATOR|NORMAL_CHARACTERS)\n#+end_src\n\nFor the escaped terminator case, we simply match on a backslash followed by a terminator.\n#+begin_src regex\nESCAPED_TERMINATOR = \\¬\n#+end_src\n\nIn the case of normal characters, either \n- the character is neither a backslash nor a terminator\n- the character is a backslash and is followed by anything other than a terminator\n#+begin_src regex\nNORMAL_CHARACTERS = ([^¬\\\\\\\\]|\\\\\\\\[^¬])*\n#+end_src\n\nUsing this regex we can trivially 
extract the data, repeating the code for key extraction.\n#+name: source pairs loop\n#+begin_src rust :comments noweb\nlet (raw_data, remaining_string) = split_at_regex(source, \u0026data_regex);\nlet src_data = raw_data.trim();\nsource = remaining_string;\n#+end_src\n\nWhile it is fine for data to be empty, we always require the user to provide the end character, so the matched string should never be empty.\n#+name: source pairs loop\n#+begin_src rust :comments noweb\nif src_data.len() == 0 {\n  eprintln!(\"Invalid parse, found body with no terminating tag.\");\n  return Err(ParseError::UnterminatedBody);\n}\n#+end_src\n\nNow, as before, let's remove the terminating character.\n#+name: source pairs loop\n#+begin_src rust :comments noweb\nlet mut src_data = src_data.to_string();\nsrc_data.pop();\nlet src_data = src_data.trim();\n#+end_src\n\nFinally, now we've extracted the id and the data, we can simply put the values into our hashmap.\n#+name: source pairs loop\n#+begin_src rust :comments noweb\ndata.insert(key_name.to_string(), src_data.to_string());\n#+end_src\n\nNow, we also need to check for a terminating condition - we'll do this by checking if the remaining string, when trimmed, is empty.\n#+name: source pairs loop\n#+begin_src rust :comments noweb\nif source.trim().is_empty() {\n    break;\n}\n#+end_src\n\nFinally, now that the string has been consumed, we can simply return the template name and the populated hashmap.\n\n#+name: source parsing code\n#+begin_src rust :comments noweb :noweb yes\nOk((template_name.to_string(), data))\n#+end_src\n\nAside: Notice that during the parsing, we're using our own custom function to allow us to split by a pattern, a feature the\nstdlib doesn't seem to provide.\n\nThis utility function splits a string at the first occurrence of a pattern, returning a string up to the first occurrence \nof the pattern and a string continuing from the pattern - the second string contains the text matching the pattern.\n#+name: source parsing utility 
functions\n#+begin_src rust :comments noweb\nfn split_at_pattern\u003c'a\u003e(string: \u0026'a str, pat: \u0026str) -\u003e (\u0026'a str, \u0026'a str) {\n  if let Some(ind) = string.find(pat) {\n     string.split_at(ind)\n  } else {\n     (\u0026\"\", string)\n  }\n}\n#+end_src\n\n** Generator Logic\nThe generator takes in an input templated string and an associated mapping and returns a string in which the templates have been filled - it also takes in a parameter dictating how to respond to ill-formed strings.\n\nWe'll be importing the following libraries to make this thing work.\n#+name: generator imports\n#+begin_src rust :comments org\nuse std::collections::HashMap;\nuse regex::{Regex, Captures};\n#+end_src\n\nThe generator module follows the standard pattern.\n#+begin_src rust :tangle src/generator.rs :noweb yes :comments org\n\u003c\u003cgenerator imports\u003e\u003e\n\u003c\u003cgenerator structures\u003e\u003e\n\u003c\u003cgenerator utilities\u003e\u003e\n\u003c\u003cgenerator function\u003e\u003e\n\n#[cfg(test)]\nmod tests {\n   use super::*;\n\n   \u003c\u003cgenerator tests\u003e\u003e\n}\n#+end_src\n\n\nThe main utility provided by the generator is the main function that populates the templated string when given a mapping. Additionally, we must specify how the generator should respond when it finds parameters that are missing from the mapping.\n#+name: generator function\n#+begin_src rust :comments org :noweb yes\npub fn generate_output(input: String, mapping: HashMap\u003cString, String\u003e, fail_response: \u0026GeneratorErrorStrategy) -\u003e Result\u003cString, GeneratorError\u003e {\n \u003c\u003cgenerator logic\u003e\u003e\n}\n#+end_src\n\nThe strategies the generator should accept are:\n- *Fail* - Error out if a parameter that is not in the mapping is found in the template; this is the default.\n- *Ignore* - ignore any missing parameters.\n- *Fixed* - replace any missing parameters with a fixed response\n- *Default* - try a default mapping for the keyword, otherwise try 
one of the other strategies.\nTo implement this, we'll use two structures, one to represent the non-recursive cases, and the other for the default option.\n#+name: generator structures\n#+begin_src rust :comments org \n#[derive(Clone,Debug,PartialEq)]\npub enum GeneratorErrorCoreStrategy {\n   Fail,\n   Ignore,\n   Fixed(String)\n}\n#+end_src\n\nThus for the full enum, we can avoid having to mess with boxes.\n#+name: generator structures\n#+begin_src rust :comments org \npub enum GeneratorErrorStrategy {\n   Base(GeneratorErrorCoreStrategy),\n   Default(HashMap\u003cString,String\u003e, GeneratorErrorCoreStrategy)\n}\n#+end_src\n\nNow, the errors the templating function can return are partially based on the error response strategies.\n- *Undefined Parameter* - An error when a parameter with no mapping is found, and the strategy is sufficiently strict.\n#+name: generator structures\n#+begin_src rust :comments org\n#[derive(Debug)]\npub enum GeneratorError {\n  UndefinedParameter\n}\n#+end_src \n\n\nThe core logic of the generator is to use the capture group capabilities provided by the regex crate.\n\nWe'll reuse the same pattern as used in the parser, but wrap it in braces and capture the contents.\n#+name: generator logic\n#+begin_src rust :comments org\nlet parameter_regex = Regex::new(\"\\\\{([^¬:{}\\\\\\\\]*)\\\\}\").unwrap();\n#+end_src \n\nBefore we run the regex, we'll need to set up a variable to capture lookup errors.\n#+name: generator logic\n#+begin_src rust :comments org\nlet mut lookup_failed = false;\n#+end_src \n\n\nNext, we'll run the regex on the input string.\n#+name: generator logic\n#+begin_src rust :comments org :noweb yes\nlet new_string = parameter_regex.replace_all(\u0026input, |caps: \u0026Captures| {\n   \u003c\u003cgenerator replacement logic\u003e\u003e\n});\n#+end_src\n\nIf a lookup failed, then we'll return an error.\n#+name: generator logic\n#+begin_src rust :comments org \nif lookup_failed {\n   return 
Err(GeneratorError::UndefinedParameter);\n}\n#+end_src\n\n\nOnce that's done we have the result string - it's a ~Cow\u003cstr\u003e~ though, so we just need to do a conversion before returning it.\n#+name: generator logic\n#+begin_src rust :comments org\nOk(new_string.to_string())\n#+end_src\n\nAll that's left is to define the replacement logic - if the captured parameter name is in the mapping, we can simply return the value stored in the hashmap. \n#+name: generator replacement logic\n#+begin_src rust :comments org :noweb yes\nif let Some(value) = mapping.get(\u0026caps[1]) {\n   value\n} else {\n   \u003c\u003cgenerator lookup failed\u003e\u003e  \n}\n#+end_src\n\nIf the lookup fails, our action depends on the error strategy we've chosen.\n#+name: generator lookup failed\n#+begin_src rust :comments org :noweb yes\nmatch \u0026fail_response {\n    GeneratorErrorStrategy::Base(strategy) =\u003e {\n        \u003c\u003cgenerator base strategy match\u003e\u003e\n    }\n    GeneratorErrorStrategy::Default(mapping, strategy) =\u003e {\n        \u003c\u003cgenerator default strategy\u003e\u003e\n    }\n}\n#+end_src\n\nFor the base case, we simply match on the specific strategy chosen to decide our action.\n#+name: generator base strategy match\n#+begin_src rust :comments org :noweb yes \nmatch strategy {\n  GeneratorErrorCoreStrategy::Fail =\u003e {\n      \u003c\u003cgenerator strategy fail case\u003e\u003e\n  }\n  GeneratorErrorCoreStrategy::Ignore =\u003e {\n      \u003c\u003cgenerator strategy ignore case\u003e\u003e\n  },\n  GeneratorErrorCoreStrategy::Fixed(text) =\u003e {\n      \u003c\u003cgenerators strategy fixed case\u003e\u003e\n  }\n}\n#+end_src\n\nIf the strategy is a fail fast case, then we still return an empty string, but we set the lookup failed\nflag, thereby ensuring that the result of the call is an error.\n#+name: generator strategy fail case\n#+begin_src rust :comments org\nlookup_failed = true;\n\"\"\n#+end_src\n\nIf the strategy is an ignore case, we simply leave the parameter 
as it was.  \n#+name: generator strategy ignore case\n#+begin_src rust :comments org\n\u0026caps[0]\n#+end_src\n\nFor the fixed case, we just return the fixed string.\n#+name: generators strategy fixed case\n#+begin_src rust :comments org\ntext\n#+end_src\n\n\nNow, for the default mapping case, we first check if the default mapping contains a value for the \nparameter. If it does, we can simply return that value.\n#+name: generator default strategy\n#+begin_src rust :comments org :noweb yes\nif let Some(value) = mapping.get(\u0026caps[1]) {\n   value\n} else {\n   \u003c\u003cgenerator default fail strategy\u003e\u003e  \n}\n#+end_src\n\n\nIf it doesn't, we simply match on the error strategy as previous.\n#+name: generator default fail strategy\n#+begin_src rust :comments org :noweb yes\n\u003c\u003cgenerator base strategy match\u003e\u003e\n#+end_src\n\n** Crawler Logic\nThe core logic for the crawler is to descend the input directory, keeping track of the current path, pass each file through the parser, then pass on the generated mapping to the generator, along with a corresponding template file and output file.\n\nWe'll be importing the following libraries for doing the core logic.\n#+name: crawler imports\n#+begin_src rust :comments org\nuse std::fs;\nuse std::io::Read;\nuse std::fs::File;\nuse std::path::Path;\nuse std::convert::AsRef;\n#+end_src\n\nWe'll also be bringing in the parsing function from the parser, and the generator function from the generator.\n#+name: crawler imports\n#+begin_src rust :comments org\nuse parser::{parse_source_string,ParseError};\nuse generator::{generate_output, GeneratorError, GeneratorErrorStrategy};\n#+end_src\n\nThe main structure for the crawler is as follows.\n#+begin_src rust :tangle src/crawler.rs :noweb yes :comments org\n\u003c\u003ccrawler imports\u003e\u003e\n\n\u003c\u003ccrawler structures\u003e\u003e\n\n\u003c\u003ccrawler function\u003e\u003e\n#+end_src\n\nOur crawling function, takes as input the input 
directory, the output directory, the template directory and the error strategy for the generator.\n#+name: crawler function\n#+begin_src rust :noweb yes :comments org\npub fn crawl_directories\u003cP,Q,R\u003e(\n    output_directory: \u0026P, \n    input_directory: \u0026Q, \n    template_path: \u0026R, \n    err_strat: \u0026GeneratorErrorStrategy\n) -\u003e Result\u003cu32,CrawlError\u003e \n where P : AsRef\u003cPath\u003e,\n       Q : AsRef\u003cPath\u003e,\n       R : AsRef\u003cPath\u003e {\n\u003c\u003ccrawler main logic\u003e\u003e\n}\n#+end_src\n\nThe errors produced by the crawler are as follows.\n- *ParseError* - When a parse error occurs\n- *GeneratorError* - When a generator error occurs\n- *TemplateNotFound* - When a template is not found or cannot be read\n- *InputDirectoryError* - When the input directory cannot be read (for example, because it does not exist)\n- *OutputFileError* - When an output file cannot be written\n- *InputFileError* - When an input file or its metadata cannot be read\n#+name: crawler structures\n#+begin_src rust :noweb yes :comments org\n#[derive(Debug)]\npub enum CrawlError {\n  ParseError(ParseError),\n  GeneratorError(GeneratorError),\n  TemplateNotFound(String),\n  InputDirectoryError,\n  OutputFileError(String),\n  InputFileError(String),\n}\n#+end_src\n\nBefore we begin, let's set up a counter to count the number of files converted.\n#+name: crawler main logic\n#+begin_src rust :noweb yes :comments org\nlet mut file_count = 0;\n#+end_src\n\nFirst, we'll extract all the files in the input directory.\n#+name: crawler main logic\n#+begin_src rust :noweb yes :comments org\nlet input_files = input_directory.as_ref()\n                  .read_dir()\n                  .map_err(|_| \n                        CrawlError::InputDirectoryError\n                  )?;\nfor input_file in input_files {\n   \u003c\u003ccrawler file logic\u003e\u003e\n}\n#+end_src\n\nFor each file, we need to check its metadata.\n#+name: crawler file logic\n#+begin_src rust :noweb yes :comments org\nlet input_file = input_file.map_err(|e| CrawlError::InputFileError(format!(\"{:?}\", 
e)))?;\nlet input_metadata = input_file.metadata().map_err(|e| CrawlError::InputFileError(format!(\"{:?}\", e)))?;\nlet input_file_name = input_file.file_name();\nlet input_file_path = input_file.path();\nlet input_file_extension = input_file_path.extension().and_then(|ext| ext.to_str());\nlet input_file_base = input_file_path.file_stem().and_then(|stem| stem.to_str());\n#+end_src\n\nOur next action depends on the type of entry - we'll need to do different things based on whether we find a file or a directory.\n#+name: crawler file logic\n#+begin_src rust :noweb yes :comments org\nif input_metadata.is_dir() {\n    \u003c\u003ccrawler directory logic\u003e\u003e\n} else if input_metadata.is_file() \u0026\u0026 (input_file_extension == Some(\"gop\")) \u0026\u0026 (input_file_base.is_some()) {\n    \u003c\u003ccrawler input file logic\u003e\u003e\n} else {\n   eprintln!(\"WARN: Encountered a non-template file (or non-unicode path) while crawling the input directory {:?}\", input_file);\n}\n#+end_src\n\nIf the entry is a directory, we make a recursive call, appending the directory name to both the input path and the output path.\n#+name: crawler directory logic\n#+begin_src rust :noweb yes :comments org\nlet dir_name = Path::new(\u0026input_file_name);\nlet new_output_dir = output_directory\n                     .as_ref()\n                     .join(\u0026dir_name);\nlet new_input_dir = input_directory\n                    .as_ref()\n                    .join(\u0026dir_name);\n\n#+end_src\n\nLet's also make sure the new output directory actually exists, reporting an error if it cannot be created.\n#+name: crawler directory logic\n#+begin_src rust :noweb yes :comments org\nfs::create_dir_all(\u0026new_output_dir)\n    .map_err(|e| CrawlError::OutputFileError(format!(\"{:?}\", e)))?;\n#+end_src\n\nThen we can do the recursive step.\n#+name: crawler directory logic\n#+begin_src rust :noweb yes :comments org\nlet n_count = crawl_directories(\n    \u0026new_output_dir, \n    \u0026new_input_dir, \n    template_path, \n    err_strat\n)?;\nfile_count += 
n_count;\n#+end_src\n\n\nOn the other hand, if the entry is a regular file, we first need to read its contents.\n#+name: crawler input file logic\n#+begin_src rust :noweb yes :comments org\nlet input_text = {\n   let mut temp = String::new();\n   let mut file = File::open(input_file.path()).map_err(|e| CrawlError::InputFileError(format!(\"{:?}\", e)))?;\n   file.read_to_string(\u0026mut temp).map_err(|e| CrawlError::InputFileError(format!(\"{:?}\", e)))?;\n   temp\n};\n#+end_src\n\nNow we'll run the parser on this text.\n#+name: crawler input file logic\n#+begin_src rust :noweb yes :comments org\nlet (template_name, mapping) = parse_source_string(\u0026input_text).map_err(|e| CrawlError::ParseError(e))?;\n#+end_src\n\nNext, we need to read the template named by the source file into a string.\n#+name: crawler input file logic\n#+begin_src rust :noweb yes :comments org\nlet template_path = template_path.as_ref().join(\u0026Path::new(\u0026template_name));\nlet template_text = {\n   let mut temp = String::new();\n   let mut file = File::open(template_path).map_err(|e| CrawlError::TemplateNotFound(format!(\"{:?}\", e)))?;\n   file.read_to_string(\u0026mut temp).map_err(|e| CrawlError::TemplateNotFound(format!(\"{:?}\", e)))?;\n   temp\n};\n#+end_src\n\nWith the template and the mapping, we can run the generator.\n#+name: crawler input file logic\n#+begin_src rust :noweb yes :comments org\nlet result = generate_output(\n   template_text, \n   mapping, \n   err_strat\n).map_err(|e| CrawlError::GeneratorError(e))?;\n#+end_src\n\nBefore we write this to the output directory, we need to construct a new name for the file.\n#+name: crawler input file logic\n#+begin_src rust :noweb yes :comments org\nlet input_file_base = input_file_base.unwrap();\nlet mut new_file_name = String::from(input_file_base);\nnew_file_name.push_str(\".html\");\n#+end_src\n\nFinally, we can write this to the output directory.\n#+name: crawler input file logic\n#+begin_src rust :noweb yes :comments org\nlet output_path = \n    
output_directory.as_ref().join(\u0026Path::new(\u0026new_file_name));\nfs::write(\u0026output_path, result)\n    .map_err(|e| CrawlError::OutputFileError(format!(\"{:?}\", e)))?;\nfile_count += 1;\n#+end_src\n\n\n\n#+name: crawler main logic\n#+begin_src rust :noweb yes :comments org\nOk(file_count)\n#+end_src\n\n* Error Strategy\nNow for the final part of the application - implementing the error strategy from before.\n\nBefore we do anything, we'll need to extend a prior structure, the ~GeneratorErrorCoreStrategy~, with \nthe ability to be parsed from a string.\n#+name: generator structures\n#+begin_src rust :comments org :noweb yes\nimpl FromStr for GeneratorErrorCoreStrategy {\n    type Err = ();\n    fn from_str(src: \u0026str) -\u003e Result\u003cGeneratorErrorCoreStrategy, ()\u003e {\n        match src {\n            \"fail\" =\u003e Ok(GeneratorErrorCoreStrategy::Fail),\n            \"ignore\" =\u003e Ok(GeneratorErrorCoreStrategy::Ignore),\n            _ =\u003e {\n                 if let Some(ind) = src.find(\"=\") {\n                    if ind + 1 \u003c src.len() {\n                        let (txt, rem) = src.split_at(ind+1);\n                        if txt == \"fixed=\" {\n                            Ok(GeneratorErrorCoreStrategy::Fixed(rem.to_string()))\n                        } else {\n                            Err(())\n                        }\n                    } else {\n                        Err(())\n                    }\n                 } else {\n                   Err(())\n                 }\n            },\n        }\n    }\n}\n#+end_src\n\nAs you can see, we're referencing the ~FromStr~ trait which we'll need to import.\n#+name: generator imports\n#+begin_src rust :comments org\nuse std::str::FromStr;\n#+end_src\n\n\nNow let's quickly add some tests to verify this actually works.\n#+name: generator tests\n#+begin_src rust :comments org :noweb yes\n#[test]\nfn from_str_works() {\n  
assert_eq!(GeneratorErrorCoreStrategy::from_str(\"ignore\"), Ok(GeneratorErrorCoreStrategy::Ignore));\n  assert_eq!(GeneratorErrorCoreStrategy::from_str(\"fail\"), Ok(GeneratorErrorCoreStrategy::Fail));\n  assert_eq!(GeneratorErrorCoreStrategy::from_str(\"fixed=missing\"), Ok(GeneratorErrorCoreStrategy::Fixed(\"missing\".to_string())));\n}\n#+end_src\n\n\nOkay, now onto the topic of determining an error response strategy.\n\nWe'll be doing this by splitting the concerns into two separate components - first identifying the core strategy and then determining \nwhether a default strategy is used.\n\nFirst for the core strategy, we'll set a default and then populate it.\n#+name: high level error strategy\n#+begin_src rust :comments org :noweb yes\nlet mut opt_strat = generator::GeneratorErrorCoreStrategy::Fail;\n#+end_src\n\nUsing the ~FromStr~ implementation we described earlier, we can parse this as follows.\n#+name: high level error args\n#+begin_src rust :comments org :noweb yes\nap.refer(\u0026mut opt_strat)\n  .add_option(\u0026[\"-e\", \"--error\"],\n              argparse::Store,\n              \"Strategy for undefined parameters: fail, ignore or fixed=VALUE (default: fail)\");\n#+end_src\n\nFor the default strategy we'll be using an optional value which we'll try to populate. If it isn't populated then we'll know that\nthere is no default strategy.\n#+name: high level error strategy\n#+begin_src rust :comments org :noweb yes\nlet mut def_strat = None;\n#+end_src\n\nOnce again, for the default we'll just try to populate the string.\n#+name: high level error args\n#+begin_src rust :comments org :noweb yes\nap.refer(\u0026mut def_strat)\n  .add_option(\u0026[\"-d\", \"--default\"],\n              argparse::StoreOption,\n              \"Additional mapping for storing default values. 
If not specified, the environment variable GOP_HTML_DEFAULTS if defined, is used as a default.\");\n#+end_src\n\nIf none was provided we'll try and retrieve it from the environment under the key ~GOP_HTML_DEFAULTS~.\n#+name: high level error update\n#+begin_src rust :comments org :noweb yes\nlet def_strat = def_strat.or_else(|| env::var(\"GOP_HTML_DEFAULTS\").ok());\n#+end_src\n\nFinally, we can construct the error strategy based on whether a default is provided.\n#+name: high level error update\n#+begin_src rust :comments org :noweb yes\nlet err_strat = match def_strat {\n   None =\u003e \n      generator::GeneratorErrorStrategy::Base(opt_strat),\n   Some(path) =\u003e {\n      let mapping = { \n          \u003c\u003cerror strategy retrieve mapping\u003e\u003e \n      };\n      match mapping {\n        Some((name, map)) =\u003e \n            generator::GeneratorErrorStrategy::Default(map, opt_strat), \n        None =\u003e {\n            eprintln!(\"Encountered error while reading default mapping at {:?}.\", path);\n            generator::GeneratorErrorStrategy::Base(opt_strat)\n        }\n      }\n   }\n};\n#+end_src\n\nNow all we've got to do is retrieve the mapping.\n#+name: error strategy retrieve mapping\n#+begin_src rust :comments org :noweb yes\nlet def_path = Path::new(\u0026path);\nif let Ok(mut file) = File::open(\u0026def_path) {\n   let mut def_source = String::new();\n   if let Ok(_count) = file.read_to_string(\u0026mut def_source) {\n       parser::parse_source_string(\u0026def_source).ok()\n   } else {\n       None\n   }\n} else {\n    None\n}\n#+end_src\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkiranandcode%2Fhtml_gen","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fkiranandcode%2Fhtml_gen","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkiranandcode%2Fhtml_gen/lists"}