{"id":50918981,"url":"https://github.com/ntrkd/tei-xml-formatter","last_synced_at":"2026-06-16T18:01:09.569Z","repository":{"id":326425120,"uuid":"1078959804","full_name":"ntrkd/tei-xml-formatter","owner":"ntrkd","description":"A VS Code extension to format TEI XML files","archived":false,"fork":false,"pushed_at":"2026-03-31T21:17:30.000Z","size":170,"stargazers_count":2,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"master","last_synced_at":"2026-05-03T17:06:42.792Z","etag":null,"topics":["formatter","tei-xml","typescript","vscode-extension"],"latest_commit_sha":null,"homepage":"","language":"TypeScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ntrkd.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2025-10-18T19:35:37.000Z","updated_at":"2026-03-31T21:17:35.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/ntrkd/tei-xml-formatter","commit_stats":null,"previous_names":["networkydev/tei-xml-formatter","ntrkd/tei-xml-formatter"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/ntrkd/tei-xml-formatter","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ntrkd%2Ftei-xml-formatter","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ntrkd%2Ftei-xml-formatter/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ntrkd%2Ftei-xml-formatter/releases","manife
sts_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ntrkd%2Ftei-xml-formatter/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/ntrkd","download_url":"https://codeload.github.com/ntrkd/tei-xml-formatter/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ntrkd%2Ftei-xml-formatter/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":34417416,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-06-16T02:00:06.860Z","response_time":126,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["formatter","tei-xml","typescript","vscode-extension"],"created_at":"2026-06-16T18:00:30.909Z","updated_at":"2026-06-16T18:01:09.522Z","avatar_url":"https://github.com/ntrkd.png","language":"TypeScript","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003e This project is under active development, expect ~~breaking~~ massive changes!\n\n## About\n\nThis is a TypeScript library that formats [TEI XML](https://tei-c.org/) files. It uses [saxes](https://github.com/lddubeau/saxes/) to parse XML, format, and then output a formatted string. 
This formatter expects valid XML files.\n\n## Demonstrations\n\n### Unformatted\n```xml\n\u003c?xml version=\"1.0\" encoding=\"UTF-8\"?\u003e\n\u003c?xml-stylesheet type=\"text/xsl\" href=\"custom.xsl\"?\u003e\n\u003cTEI xmlns=\"http://www.tei-c.org/ns/1.0\"\u003e\n\u003ctext\u003e\u003cbody\u003e\u003cdiv type=\"letter\"\u003e\n\u003chead\u003eLetter from Emily to John\u003c/head\u003e\u003cp\u003e\u003chi rend=\"italic\"\u003eDear John,\u003c/hi\u003e\u003clb/\u003e I hope this letter finds you well.\nThe weather here has been \u003chi rend=\"bold\"\u003eunusually warm\u003c/hi\u003e for October.\n\u003c/p\u003e\u003cp\u003eI have enclosed the sketches you asked for.\n\u003cnote type=\"editorial\"\u003eOriginal note: “See attached drawings.”\u003c/note\u003e\u003c/p\u003e\n\u003ccloser\u003e\u003csalute\u003e Yours sincerely, \u003c/salute\u003e\n\u003csigned\u003eEmily\u003c/signed\u003e\n\u003c/closer\u003e\u003c/div\u003e\u003chi\u003e 80 808 0808 080808080808 0808008 8 8 08 08 08 80 80 80 8080 8080 8008 080 8080 8080 080 0\n\u003c/hi\u003e\u003c/body\u003e\u003c/text\u003e\u003c/TEI\u003e\n```\n\n### Formatted\n```xml\n\u003c?xml version=\"1.0\" encoding=\"UTF-8\"?\u003e\n\u003c?xml-stylesheet type=\"text/xsl\" href=\"custom.xsl\"?\u003e\n\u003cTEI xmlns=\"http://www.tei-c.org/ns/1.0\"\u003e\n\t\u003ctext\u003e\n\t\t\u003cbody\u003e\n\t\t\t\u003cdiv type=\"letter\"\u003e\n\t\t\t\t\u003chead\u003eLetter from Emily to John\u003c/head\u003e\u003cp\u003e\u003chi rend=\"italic\"\u003eDear John,\u003c/hi\u003e\n\t\t\t\t\t\u003clb /\u003e\n\t\t\t\t\tI hope this letter finds you well. 
The weather here has been\n\t\t\t\t\t\u003chi rend=\"bold\"\u003eunusually warm\u003c/hi\u003e\n\t\t\t\t\tfor October.\n\t\t\t\t\u003c/p\u003e\n\t\t\t\t\u003cp\u003eI have enclosed the sketches you asked for.\n\t\t\t\t\u003cnote type=\"editorial\"\u003eOriginal note: “See attached drawings.”\u003c/note\u003e\u003c/p\u003e\n\t\t\t\t\u003ccloser\u003e \u003csalute\u003e Yours sincerely, \u003c/salute\u003e \u003csigned\u003eEmily\u003c/signed\u003e \u003c/closer\u003e\n\t\t\t\u003c/div\u003e\n\t\t\t\u003chi\u003e\n\t\t\t\t80 808 0808 080808080808 0808008 8 8 08 08 08 80 80 80 8080 8080 8008 080 8080 8080 080 0\n\t\t\t\u003c/hi\u003e\n\t\t\u003c/body\u003e\n\t\u003c/text\u003e\n\u003c/TEI\u003e\n```\n\n## Importing and Usage\nThe package is published on [npmjs](https://www.npmjs.com/package/tei-xml-fmt). We publish both CJS and [ESM](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Statements/import) builds so that either type of project can import it.\n\n```js\n// ESM import\nimport { Formatter } from \"tei-xml-fmt\";\n```\n\n```js\n// CJS require() import\nconst { Formatter } = require('tei-xml-fmt');\n```\n\n```js\n// Make a new instance\nconst texfmt = new Formatter();\n\nlet unformattedXML = '\u003cp\u003e Hello World! Tags that are sufficiently long will be wrapped automatically. \u003c/p\u003e';\n\n// The format function takes in a single string with the text to format.\n// It returns a string with the formatted text.\nlet formattedText = texfmt.format(unformattedXML);\n\nconsole.log(formattedText);\n```\n\n## Algorithm\n\n### TL;DR\nWe first use saxes to parse the .xml file into a form that is easier to process than raw text, constructing a tree called an Abstract Syntax Tree (AST). This contains enough information to distinguish between Tag Nodes, Close Tag Nodes, Text, and Spaces. 
Then we process the AST further and lower it into a Formatting Tree, which strips the information down to just Groups (which contain Text and Space nodes), Text, and Spacing Nodes (Line Indent/Deindent, Space or Line). At this point there is very little information left to process and most of the formatting has been done. Finally, we render the Formatting Tree back into raw text.\n\n### Steps\n\n1. Construct an editable AST from the XML file.\n\n    a. Combine adjacent text nodes into single text nodes.\n   \n    b. Normalize all spaces ' ', new lines '\\n', and tabs '\\t' within text to a single space.\n   \n    c. A text node containing a single space should be transformed into a Spacing node. If the text node contains text, trailing and leading spaces become Spacing nodes.\n   \n\t- If the Spacing node would reside next to another Spacing node, do not insert it.\n\n3. Sanitize the AST using a Zipper to allow for better traversal.\n\n    a. A Spacing node can be \\n, \\t or ' ' as long as it does not reside between two text characters / nodes.\n   \n    b. Spacing nodes should be carried in both directions.\n   \n\t- Carrying means inserting another Spacing node after the next node if the node in front of it can be crossed.\n\t- If we are carrying left, it can cross only open tags. If we are carrying right, it can cross only close tags.\n\t- If the Spacing node would reside next to another Spacing node, do not insert it.\n   \n    c. There should now be a single Spacing node everywhere we can insert spaces.\n\n5. Translate the AST into a formatting tree.\n   \n    a. Convert all nodes normally into text. Spacing nodes require more attention.\n   \n    b. 
When we encounter a Spacing node, we look backward and forward to see what type of FMT node to insert.\n   \n\t- LineIndent - If the previous tag is an open tag and the next node is not a close tag\n\t- LineDeindent - If the previous tag is not an open tag and the next node is a close tag\n\t- SpaceOrLine - Default\n   \n    // TODO: A group of carried Spacing nodes should be linked together, because if none of them need to be wrapped, only one of the Spacing nodes needs to become a space, not all of them.\n\n7. Generate the final XML using the formatting tree.\n\n    a. Use width() calculations on the FMT nodes to determine whether to wrap, then output the correct string literal.\n\n## Definitions and Observations\n\n- TEI XML prefers explicit spacing. It defines no standards for how implicit spaces are treated. Thus these formatting rules are specific to the renderer used in the Eartha M. M. White project. I would recommend using explicit spacing wherever possible.\n- A single space is the same as multiple spaces. One Spacing node may be expanded to multiple.\n- New lines and tabs are also treated as spaces.\n- Block tags are tags that make their own spacing during rendering, thus ignoring the immediate spacing around them.\n- Inline tags are tags that depend on spacing near them. Having no space means the rendered text might be joined together. However, having even one space between multiple inline tags that aren't interrupted by text means that all of them can have spaces without changing the final layout.\n- I have yet to encounter a tag that has asymmetrical spacing requirements, so for now we disregard them.\n- Ignore everything but open tags, close tags, and text nodes for now. 
Comments, CDATA, Processing Instruction, and XML Declaration will be implemented at a later date.\n\n## Credits\n\nYorick Peterse - [How to write a code formatter](https://yorickpeterse.com/articles/how-to-write-a-code-formatter/)\n\nGerard Huet - [The Zipper](https://gallium.inria.fr/~huet/PUBLIC/zip.pdf)\n\nTEI Council - [TEI Specification](https://tei-c.org/release/doc/tei-p5-doc/en/html/index.html)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fntrkd%2Ftei-xml-formatter","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fntrkd%2Ftei-xml-formatter","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fntrkd%2Ftei-xml-formatter/lists"}