{"id":15442774,"url":"https://github.com/kosarev/tproc","last_synced_at":"2025-04-19T20:02:18.915Z","repository":{"id":62585058,"uuid":"143117360","full_name":"kosarev/tproc","owner":"kosarev","description":"A small yet powerful text processor in Python","archived":false,"fork":false,"pushed_at":"2020-10-17T09:55:16.000Z","size":38,"stargazers_count":8,"open_issues_count":2,"forks_count":2,"subscribers_count":2,"default_branch":"master","last_synced_at":"2024-10-18T13:15:22.746Z","etag":null,"topics":["macro-processor","mit-license","preprocessor","python","python-generators","template-processor","text-processor","word-processor"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/kosarev.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2018-08-01T07:06:06.000Z","updated_at":"2023-01-04T11:44:25.000Z","dependencies_parsed_at":"2022-11-03T21:45:55.988Z","dependency_job_id":null,"html_url":"https://github.com/kosarev/tproc","commit_stats":null,"previous_names":[],"tags_count":1,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kosarev%2Ftproc","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kosarev%2Ftproc/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kosarev%2Ftproc/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kosarev%2Ftproc/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/kosarev","download_url":"https://codeload.github.com/kosarev/tproc/tar.gz/refs/heads/master","host":{"na
me":"GitHub","url":"https://github.com","kind":"github","repositories_count":240109567,"owners_count":19749170,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["macro-processor","mit-license","preprocessor","python","python-generators","template-processor","text-processor","word-processor"],"created_at":"2024-10-01T19:30:00.429Z","updated_at":"2025-03-03T00:32:14.269Z","avatar_url":"https://github.com/kosarev.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# tproc\n\nA small yet powerful text processor written in Python.\n\n[![Build Status](https://travis-ci.org/kosarev/tproc.svg?branch=master)](https://travis-ci.org/kosarev/tproc)\n\n## Features:\n\n* Provides a way to program your documentation.\n\n* Unleashes the full power of Python for organizing, generating,\nvalidating and debugging your data. Supports arbitrary Python\ncode and modules. No new languages to learn.\n\n* Interleaved text and code. 
The order of definitions is up to you.\n\n* Text pieces are implicitly defined as functions that can be\ncalled from anywhere in the input file as well as from\nexternal code that has access to the processor object.\n\n* Supports Python 2.7 and 3.\n\n* Available under the MIT license.\n\n\n## Contents\n\n* [Installation](#installation)\n* [Hello world](#hello-world)\n* [Definitions](#definitions)\n* [Replacement fields](#replacement-fields)\n* [Format specifiers](#format-specifiers)\n* [Passing data to generators](#passing-data-to-generators)\n* [Escape sequences](#escape-sequences)\n* [Tokens](#tokens)\n* [Generation of non-text data](#generation-of-non-text-data)\n* [Namespaces and processor objects](#namespaces-and-processor-objects)\n* [API](#api)\n* [Basic design principles](#basic-design-principles)\n\n\n## Installation\n\n```shell\npip install tproc\n```\n\n\n## Hello world\n\n```python\n# hello.tproc\n\n@hello\nHello {world}\n\n@world\nWorld!\n\n@main\n{hello}\n```\n\nProcessing:\n\n```\n$ tproc hello.tproc\nHello World!\n```\n\nThe input contains three definitions, each expanding into its\nbody text. The names in curly braces are replaced with the body\nof the corresponding definition.\n\nNote that tproc only expands input on request, and not as it\nreads and processes the definitions. Because of this, the\ndefinitions may come in whatever order best suits your needs.\n\nWhitespace just before and just after definition bodies is\nstripped, so all three definitions in the example produce\ninline output with no newline characters.\n\nThe part of the input before the first definition is ignored and\nis meant for describing the purpose of the input and\nother relevant information.\n\n\n## Definitions\n\ntproc translates text definitions into Python generators that\nproduce the body text in its original form, that is, before any\nexpansion. 
This makes it possible to write definitions as normal\nPython functions, like this:\n\n```python\n#!/usr/bin/env tproc\n\n@\ndef hello():\n    yield 'Hello {'\n    yield 'world'\n    yield '}'\n\n@world\nWorld!\n\n@main\n{hello}\n```\n\nOutput:\n\n```\nHello World!\n```\n\nCustom generators can yield the whole piece of data at once or\ngenerate it in chunks of arbitrary size.\n\n\n## Replacement fields\n\nReplacement fields are portions of text surrounded by curly\nbraces that tproc replaces with some other content during\nthe expansion process. For example:\n\n```python\n@email\ninfo@{domain}\n\n@domain\nexample.com\n```\n\nThe simplest replacement fields contain the name of a text\ndefinition or of a custom generator (the two are equivalent). But\nthey can in fact be arbitrary expressions:\n\n```python\n@\nimport time\n\n@main\nHappy {time.strftime('%A')}!\n```\n\nOn Fridays this results in:\n\n```\nHappy Friday!\n```\n\nNote that the value of a replacement field is evaluated every\ntime the field is expanded, and it is expanded every time tproc\nencounters its invocation, so such values are never cached. This\nallows generators to produce different content for different\ninvocations, as in this example:\n\n```python\n@\ncounter = 0\n\ndef count():\n    global counter\n    yield '%d' % counter\n    counter += 1\n\n@main\n{count} {count} {count}\n```\n\nOutput:\n\n```\n0 1 2\n```\n\nTo guarantee reproducible results, invocations of replacement\nfields are always processed in left-to-right order.\n\n\n## Format specifiers\n\nIn addition to value expressions, replacement fields may contain\nformat specifiers:\n\n```python\n@title\nESIO TROT\n\n@main\n{title:-^15}\n```\n\nGenerates:\n\n```\n---ESIO TROT---\n```\n\nAs you may guess, the syntax of format specifiers is the same as\nfor the lovely `format()` function.\n\n\n## Passing data to generators\n\nIn replacement fields, portions of data delimited with colons may\nfollow (possibly empty) format specifiers. 
Each such piece of\ndata will then be passed as an argument to the generator. For\nexample:\n\n```python\n@\ndef section(title, body):\n    yield '\u003csection\u003e'\n    yield '\u003ctitle\u003e'\n    for chunk in title:\n        yield chunk\n    yield '\u003c/title\u003e'\n    yield '\u003cbody\u003e'\n    for chunk in body:\n        yield chunk\n    yield '\u003c/body\u003e'\n    yield '\u003c/section\u003e'\n\n@main\n{section::NAME:tproc - A text processor}\n{section::SYNOPSIS:tproc [-e DEFINITION] [infile] [outfile]}\n```\n\nThis gives:\n\n```\n\u003csection\u003e\u003ctitle\u003eNAME\u003c/title\u003e\u003cbody\u003etproc - A text processor\u003c/body\u003e\u003c/section\u003e\n\u003csection\u003e\u003ctitle\u003eSYNOPSIS\u003c/title\u003e\u003cbody\u003etproc [-e DEFINITION] [infile] [outfile]\u003c/body\u003e\u003c/section\u003e\n```\n\nAnd of course such arguments can nest, and each nested\nargument gets expanded before being passed to the generator:\n\n```python\n@\ndef p(body):\n    yield '\u003cp\u003e'\n    for chunk in body:\n        yield chunk\n    yield '\u003c/p\u003e'\n\ndef i(body):\n    yield '\u003ci\u003e'\n    for chunk in body:\n        yield chunk\n    yield '\u003c/i\u003e'\n\n@main\n{p::It is {i::crucial} to support nested arguments.}\n```\n\n\n## Escape sequences\n\nTo support nested arguments it is necessary that curly braces and\ncolons preserve their special meaning everywhere within bodies of\ntext definitions. But that also means there should be a way to\nspecify the brace and colon characters in their literal meaning,\nthat is, as part of the body text. Escape sequences are the way to\ndo that.\n\nEscape sequences start with a backslash (`\\`) followed by the character\nto escape. For example:\n\n\u003c!-- In 'python' mode this block highlights 'wrong' escape\n     sequences. --\u003e\n```\n@\n@main\nThis example:\n\n{code::\n#include \u003ciostream\u003e\n\nint main() \\{\n    std\\:\\:cout \u003c\u003c \"@ Hey! 
@\" \u003c\u003c std\\:\\:endl;\n\\}\n}\n\njust prints:\n\n\\@ Hey! \\@\n\n@\ndef code(source):\n    yield '```'\n    for chunk in source: yield chunk\n    yield '```'\n```\n\nTo represent non-printable characters and for better\ninterchangeability with other sources and consumers of textual\ndata, tproc also supports the standard C escape sequences:\n\n`\\\\` `\\'` `\\\"` `\\a` `\\b` `\\f` `\\n` `\\r` `\\t` `\\v`\n\n\n## Tokens\n\nConsider this:\n\n```python\n@main\n'{echo:: {echo:: \\: } }'\n\n@\ndef echo(content):\n    return content\n```\n\nThe code seems obvious: the inner `echo` invocation gets expanded\ninto a colon character surrounded by spaces, which then becomes\nthe argument of the outer invocation that too replicates the\ncolon adding some more spaces around it, resulting in:\n\n```\n'  :  '\n```\n\nHowever, if the inner `echo` gets its argument containing the\ncolon in its literal de-escaped form, which is so, then why that\ncolon character doesn't work as an argument delimiter when it's\npassed to the outer `echo`?\n\nThe answer is that before an expansion takes place, all\ncharacters that form the sequence to expand are converted into\ntokens. Curly braces designating bounds of replacement fields and\ncolons separating format specifiers and arguments within them\nbecome delimiter tokens and all other data becomes literal\ntokens. 
Once parsed, tokens preserve their meaning until the\nvery end of the expansion process, so once the escaped colon\ncharacter in the example above becomes part of a literal token,\nit will always be treated as part of the text, and not as a\ndelimiter.\n\nLet's change the example a bit to see what the generators\nactually get:\n\n```python\n@main\n{eat:: '{outer:: {inner:: \\: } }' }\n\n@\ninner_chunks = []\nouter_chunks = []\n\ndef inner(content):\n    for chunk in content:\n        inner_chunks.append(chunk)\n        yield chunk\n\ndef outer(content):\n    for chunk in content:\n        outer_chunks.append(chunk)\n        yield chunk\n\ndef eat(content):\n    for chunk in content:\n        pass\n\n    print('inner: %r' % inner_chunks)\n    print('outer: %r' % outer_chunks)\n    yield ''\n```\n\nThe output:\n\n```\ninner: [\u003cliteral ' '\u003e, \u003cliteral ':'\u003e, \u003cliteral ' '\u003e]\nouter: [\u003cliteral ' '\u003e, \u003cliteral ' '\u003e, \u003cliteral ':'\u003e, \u003cliteral ' '\u003e, \u003cliteral ' '\u003e]\n```\n\nFor both the inner and outer invocations the content is a\nsequence of literal tokens containing spaces and colon\ncharacters. Curly braces and colons that work as delimiters are\nconsumed and processed by tproc according to their meaning.\n\nIn terms of code, literal tokens are instances of class\n`LiteralToken` that have a public member `.content` that stores\nthe literal as a string.\n\n\n## Generation of non-text data\n\nAs already mentioned, the value of a replacement field can be any\nexpression. If it evaluates to something callable, it is called\nand the returned value is taken as the field value. Then, if\nthe value is a generator, it becomes the source of the value\nchunks. 
Any other values are converted into literal tokens with\nthe `.content` field storing the original value.\n\nHere's how it works:\n\n```python\n@content\n{55} {[5, 7, 9]} {tuple(range(3))} {'{year}'}\n# {lambda\\: [(yield [11] * 5)]}\n\n@year\n2018\n\n@main\n{dump::{content}}\n\n@\ndef dump(content):\n    for chunk in content:\n        print('%r' % chunk)\n\n    yield ''\n```\n\nThe values of the replacement fields in `content` are evaluated\nand expanded, and then passed to `dump` as a sequence of literal\ntokens:\n\n```\n\u003cliteral 55\u003e\n\u003cliteral ' '\u003e\n\u003cliteral [5, 7, 9]\u003e\n\u003cliteral ' '\u003e\n\u003cliteral (0, 1, 2)\u003e\n\u003cliteral ' '\u003e\n\u003cliteral '2018'\u003e\n\u003cliteral '\\n# '\u003e\n\u003cliteral [11, 11, 11, 11, 11]\u003e\n```\n\nOn full expansion, tokens are converted back to their literals and appear\nin the resulting output in their string form:\n\n```python\n@main\n{55} {[5, 7, 9]} {tuple(range(3))} {'{year}'}\n# {lambda\\: [(yield [11] * 5)]}\n\n@year\n2018\n```\n\n```\n55 [5, 7, 9] (0, 1, 2) 2018\n# [11, 11, 11, 11, 11]\n```\n\nUsing nested replacement fields that expand into non-text data\nmakes it possible to translate custom markup directly into\nPython data structures. For example:\n\n```python\n@main\n{section::TITLE:\n{p::\nFirst paragraph.}\n{p::\nSecond paragraph.}\n}\n\n@\ndef collect(tokens):\n    return [x.content for x in tokens]\n\ndef p(body):\n    yield ('p', collect(body))\n\ndef section(title, body):\n    yield ('section', collect(title), collect(body))\n```\n\nResults in:\n\n```\n('section', ['TITLE'], ['\\n', ('p', ['\\nFirst paragraph.']), '\\n', ('p', ['\\nSecond paragraph.']), '\\n'])\n```\n\n\n## Namespaces and processor objects\n\nEvery processor instance has its own space for global names. 
This\nnamespace is independent of tproc's own code namespace, so users\nare free to name their generators and other global entities as\nthey like.\n\nThe only name that comes predefined in the input's code namespace\nis `tproc`. That name refers to the processor object that handles\nthe input source. Through this name the input code can access the\npublic API of the processor class described in the corresponding\nsection below. For example, `tproc.LiteralToken` refers to the\ntype of tokens passed to generators that have arguments:\n\n```python\n@main\n{'%r' % tproc.LiteralToken}\n```\n\n```\n\u003cclass 'tproc.LiteralToken'\u003e\n```\n\n\n## API\n\n### `tproc.LiteralToken`\n\n* `LiteralToken.content`\n\n  Contains the literal of the token as a string.\n\n### `tproc.Processor`\n\n* `Processor.expand(input)`\n\n   Returns a generator producing a fully expanded input. The\n   `input` parameter is a generator of source data.\n\n* `Processor.LiteralToken`\n\n   The type of literal tokens. See `tproc.LiteralToken`.\n\n\n## Basic design principles\n\n* Input files are Python programs, presented in a form suitable\nfor text processing. They may import, define and execute\narbitrary Python code as they get processed. They may define a\n`main()` function to implement the default action.\n\n* All sources of input data, including text definitions, are\nPython generators. Similarly, the `Processor.expand()` method is\na generator producing output data. The data is consumed and\ngenerated in chunks that may be of any type and size. String\nchunks are subject to expansion. Chunks of other types are passed\nto the output without any additional processing unless they\nconstitute the input of a custom generator.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkosarev%2Ftproc","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fkosarev%2Ftproc","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkosarev%2Ftproc/lists"}