{"id":17491822,"url":"https://github.com/msprev/pandocinject","last_synced_at":"2026-03-08T13:36:54.539Z","repository":{"id":80295044,"uuid":"50835378","full_name":"msprev/pandocinject","owner":"msprev","description":"grab data, format it, inject it into a pandoc document","archived":false,"fork":false,"pushed_at":"2017-02-23T13:53:26.000Z","size":39,"stargazers_count":5,"open_issues_count":0,"forks_count":1,"subscribers_count":2,"default_branch":"master","last_synced_at":"2025-04-22T20:14:50.599Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"bsd-3-clause","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/msprev.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE.txt","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2016-02-01T11:33:41.000Z","updated_at":"2023-07-26T17:52:44.000Z","dependencies_parsed_at":null,"dependency_job_id":"d0b53cdf-c026-4acd-bb70-9f4f552483a1","html_url":"https://github.com/msprev/pandocinject","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/msprev%2Fpandocinject","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/msprev%2Fpandocinject/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/msprev%2Fpandocinject/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/msprev%2Fpandocinject/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/msprev","download_url":"https://codeload.github.com/msprev/pandocinj
ect/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":250316065,"owners_count":21410476,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-10-19T08:05:16.238Z","updated_at":"2026-03-08T13:36:54.494Z","avatar_url":"https://github.com/msprev.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"pandocinject\n============\n\nImagine you have a list of items in a structured data file (yaml, json, bibtex, etc.). You want to select some of the items, format them nicely, and inject the result into a markdown/html/docx/etc. document. For example, you might have a file that includes a list of your talks, and you want to select some of the items, sort them, and format them neatly for your website.\n\nWouldn’t it be nice to do this without learning a funky template/style/query language? Wouldn’t it be nice to say the output you want directly in your favourite format (markdown/html/org/etc.)? And wouldn’t it be nice to use a simple language like Python to do the logic of selecting and formatting?\n\npandocinject does all this. Or rather, pandocinject creates pandoc filters to do this. Creating these filters is trivial. A worked example is given below.\n\npandocinject is 100% Python. Using it only requires a basic knowledge of Python. There are no funky template/style/query languages (SQL, csl, biblatex, etc.) 
to learn.\n\nInstallation\n============\n\n    pip3 install git+https://github.com/msprev/pandocinject\n\n*Requirements:*\n\n-   [Python 3](https://www.python.org/downloads/)\n-   [pip](https://pip.pypa.io/en/stable/index.html) (included in most Python 3 distributions)\n\n*To upgrade an existing installation:*\n\n    pip3 install --upgrade git+https://github.com/msprev/pandocinject\n\nWorked example\n==============\n\nSuppose you have your talks in a yaml file, `talks.yaml`:\n\n``` yaml\n- title: \"My first walk\"\n  year: 2012\n  venue: Edinburgh meadows\n\n- title: \"A walk in the park\"\n  year: 2013\n  venue: London Park\n\n- title: \"Another walk\"\n  year: 2014\n  venue: Central park\n\n- title: \"Walking again\"\n  year: 2015\n  venue: London Park\n```\n\nYou want to select the talks after 2012, and format them nicely.\n\nEasy. Let’s write a filter.\n\nStep 1: Write the selector logic\n--------------------------------\n\nFirst, write a selector class to select items after 2012. Put this into a file `selector.py`.\n\n``` python\nfrom pandocinject import Selector\n\nclass Since2012(Selector):\n    def select(self, entry):\n        return entry['year'] \u003e 2012\n```\n\nStep 2: Write the formatter logic\n---------------------------------\n\nNow, write a formatter class to format the items the way we want. 
Put this into a file `formatter.py`:\n\n``` python\nfrom pandocinject import Formatter\n\nclass Homepage(Formatter):\n    def format_entry(self, entry):\n        text = '*' + entry['title'] + '*, '\n        text += entry['venue'] + ', '\n        text += str(entry['year'])\n        return text\n```\n\nStep 3: Put them together to create a filter\n--------------------------------------------\n\nPut this into a file `inject-talks.py`, in the same directory as the others:\n\n``` python\n#!/usr/bin/env python3\n\nimport importlib\nfrom pandocfilters import toJSONFilter\nfrom pandocinject import Injector\n\nif __name__ == \"__main__\":\n    s = importlib.import_module('selector')\n    f = importlib.import_module('formatter')\n    i = Injector('inject-talk', selector_module=s, formatter_module=f)\n    toJSONFilter(i.get_filter())\n```\n\nRemember to make your filter executable:\n\n    chmod +x inject-talks.py\n\nThe result\n----------\n\nAdd this `div` to your markdown document where you want your talks to appear:\n\n``` html\nThe talks I have given since 2012 include:\n\n\u003cdiv class=\"inject-talk\" source=\"talks.yaml\" select=\"Since2012\" format=\"Homepage\"\u003e\u003c/div\u003e\n```\n\nNow call pandoc with the filter:\n\n    pandoc test.md -t markdown --filter=./inject-talks.py\n\nHere is the markdown output:\n\n``` markdown\nThe talks I have given since 2012 include:\n\n1.  *A walk in the park*, London Park, 2013\n\n2.  *Another walk*, Central park, 2014\n\n3.  *Walking again*, London Park, 2015\n```\n\nWhat about html output for a webpage? 
No problem:\n\n    pandoc test.md -t html --filter=./inject-talks.py\n\nHere is the html output:\n\n``` html\n\u003cp\u003eThe talks I have given since 2012 include:\u003c/p\u003e\n\u003col style=\"list-style-type: decimal\"\u003e\n\u003cli\u003e\u003cp\u003e\u003cem\u003eA walk in the park\u003c/em\u003e, London Park, 2013\u003c/p\u003e\u003c/li\u003e\n\u003cli\u003e\u003cp\u003e\u003cem\u003eAnother walk\u003c/em\u003e, Central park, 2014\u003c/p\u003e\u003c/li\u003e\n\u003cli\u003e\u003cp\u003e\u003cem\u003eWalking again\u003c/em\u003e, London Park, 2015\u003c/p\u003e\u003c/li\u003e\n\u003c/ol\u003e\n```\n\n\u003c!-- ## Other examples --\u003e\n\u003c!-- You can see more worked examples of formatters and selectors here: --\u003e\n\u003c!-- - inject-student --\u003e\n\u003c!-- - inject-talk --\u003e\n\u003c!-- - inject-publication --\u003e\nDocumentation\n=============\n\nInput document\n--------------\n\nYou inject data into pandoc’s input file by adding a `div` or `span` whose class is the name of your filter.\n\n``` html\n\u003cdiv class=\"inject-talk\" source=\"talks.yaml\" select=\"LastYear\" format=\"Homepage\"\u003e\u003c/div\u003e\n\u003cspan class=\"inject-ref\" source=\"pubs.yaml\" select=\"SingleAuthor\" format=\"Keywords\"\u003e\u003c/span\u003e\n```\n\n-   `div` tags are for injecting data formatted as a block\n-   `span` tags are for injecting data formatted as an inline element\n\nYour formatter will likely behave differently depending on whether it is intended to be used for injecting block or inline elements. Note that the default formatter (in base class `Formatter`) injects block elements (loose numbered lists).\n\nThe `div` or `span` tag has three attributes that control what gets injected: `source`, `select`, `format`:\n\n1.  `source`: source file(s) from which to read data\n2.  `select`: boolean expression over names of Python classes that select items from the data\n3.  
`format`: Python class to format those items into a string\n\n### `source`\n\nThe `source` attribute takes a list of space-separated file names or paths. Files at the start are read before files at the end. The file type is inferred from the file name’s extension.\n\nFile types currently supported:\n\n-   yaml (`'.yaml'`)\n-   json (`'.json'`)\n-   bibtex (`'.bib'`)\n\n### `select`\n\nThe `select` attribute takes a boolean expression involving names of Python classes – ‘selector’ classes. You can create these classes by subclassing `Selector` from module `pandocinject` and changing the result to suit your needs.\n\n`select` may consist of the name of a single selector class or a boolean expression that involves the names of multiple classes. A space-separated list is equivalent to a boolean expression where each item is joined with `AND`.\n\n``` html\n\u003cdiv class=\"inject-talk\" source=\"talks.yaml\" select=\"JointAuthor\" format=\"Homepage\"\u003e\u003c/div\u003e\n\u003cdiv class=\"inject-talk\" source=\"talks.yaml\" select=\"Paper LastYear\" format=\"Homepage\"\u003e\u003c/div\u003e\n\u003cdiv class=\"inject-talk\" source=\"talks.yaml\" select=\"(LastYear OR Forthcoming) AND Paper AND NOT JointAuthor\" format=\"Homepage\"\u003e\u003c/div\u003e\n```\n\nValid boolean operators include:\n\n-   `AND`, `and`\n-   `OR`, `or`\n-   `NOT`, `not`\n\nBrackets can be used to group expressions.\n\nSometimes you want to select a particular item. You do not need to write a custom selector class to do this. 
pandocinject will create a selector for a single item on the fly based on identifying attributes: `uuid`, `slug`, `ID`.\n\n``` html\n\u003cdiv class=\"inject-talk\" source=\"talks.yaml\" select=\"uuid=6342F747-4294-4036-BE77-10364924164D\" format=\"Homepage\"\u003e\u003c/div\u003e\n\u003cdiv class=\"inject-talk\" source=\"talks.yaml\" select=\"slug=my-great-talk\" format=\"Homepage\"\u003e\u003c/div\u003e\n\u003cdiv class=\"inject-talk\" source=\"talks.yaml\" select=\"ID=talk208\" format=\"Homepage\"\u003e\u003c/div\u003e\n```\n\nIn order for this to work, your item must have a `uuid`, `slug`, or `ID` attribute.\n\nuuid/slug/ID selectors can be freely mixed with other selectors in boolean expressions.\n\n### `format`\n\nThe `format` attribute takes the name of a Python class – the ‘formatter’ class. You can create a formatter class by subclassing `Formatter` from module `pandocinject` and tweaking the result to suit your needs.\n\n``` html\n\u003cdiv class=\"inject-talk\" source=\"talks.yaml\" select=\"JointAuthor\" format=\"Homepage\"\u003e\u003c/div\u003e\n\u003cdiv class=\"inject-talk\" source=\"talks.yaml\" select=\"JointAuthor\" format=\"CV\"\u003e\u003c/div\u003e\n\u003cspan class=\"inject-talk\" source=\"talks.yaml\" select=\"JointAuthor\" format=\"Abstract\"\u003e\u003c/span\u003e\n```\n\n### `star`\n\nThe `star` attribute takes a list of space-separated uuids/slugs/IDs. Items that match those uuids/slugs/IDs will be starred. This local change is equivalent to the global change of adding the item’s uuid/slug/ID to the `star` metadata variable, with the only difference being that the latter affects the entire document.\n\nInput document metadata\n-----------------------\n\n### `star`\n\nSometimes you may want to mark out certain entries as special. 
For example, you may wish to star certain entries when they appear in the document.\n\nIf the input document contains a metadata variable `star`, which contains a list of uuids or slugs or IDs, any items with those identifiers will be starred if injected.\n\n``` yaml\nstar:\n    - \"6342F747-4294-4036-BE77-10364924164D\"\n    - \"my-new-york-talk\"\n```\n\nWhat ‘being starred’ means is determined by the formatter class. The default formatter prepends an asterisk (‘`*`’) to a starred item.\n\nPython classes\n--------------\n\n### `Injector`\n\nObjects from this class create pandoc filters. You need to instantiate one of these to create a pandoc filter.\n\n-   `Injector(name, selector_module, formatter_module)`:\n    -   Returns:\n        -   An Injector object\n    -   Arguments:\n        -   `name`: name of `class` of `\u003cdiv\u003e` or `\u003cspan\u003e` tags where injector will insert text\n        -   `selector_module`: module with the selector classes for the injector\n        -   `formatter_module`: module with formatter classes for the injector\n    -   Default:\n        -   `selector_module`: Default (base) class: selects everything in source file\n        -   `formatter_module`: Default (base) class: formats entries as loose numbered list\n-   `get_filter(self)`:\n    -   Returns:\n        -   Function that is a pandoc filter; can be passed to `toJSONFilter`\n\n### `Selector`\n\nYou write a selector or formatter by subclassing `Selector` or `Formatter` as imported from module `pandocinject`.\n\nThese classes have methods that you are likely to wish to override for your own formatter or selector.\n\n-   `select(self, entry)`:\n    -   Returns:\n        -   `True` if `entry` is to be selected for injection into document, `False` otherwise\n    -   Arguments:\n        -   `entry`: Item (dictionary) to assess for selection\n    -   Default:\n        -   Return `True`\n\n### `Formatter`\n\n-   `output_format`: Format of the string that 
`format_block` returns\n    -   Value:\n        -   Any of pandoc’s output formats (`'-t'`) (e.g. `'html'`, `'org'`).\n    -   Default:\n        -   `'markdown'`\n-   `format_block(self, entries, starred)`:\n    -   Returns:\n        -   Formatted version of `entries` (string)\n    -   Arguments:\n        -   `entries`: List of items (sorted)\n        -   `starred`: List of items to star\n    -   Default:\n        -   Return a markdown string with loose numbered list of entries, each formatted by `format_entry`; star items by inserting a preceding asterisk\n-   `format_entry(self, entry)`:\n    -   Returns:\n        -   Formatted version of `entry` (string)\n    -   Arguments:\n        -   `entry`: Item (dictionary) to be formatted\n    -   Default:\n        -   Return Python’s string representation of `entry`\n-   `sort_entries(self, entries)`:\n    -   Returns:\n        -   List of items in order they should be formatted, first to last\n    -   Arguments:\n        -   `entries`: List of items to sort\n    -   Default:\n        -   Return `entries` unchanged\n\nSimilar\n=======\n\nMany tools can accomplish the same thing. But here there are no funky template/style/query languages (SQL, csl, biblatex, etc.) to learn. The main feature of pandocinject is that it provides a simple, general way to mine lists of items and inject the result into pandoc’s abstract syntax tree.\n\nRelease notes\n=============\n\n-   1.0 (28 January 2016):\n    -   implement boolean language for `select` attribute\n    -   documentation complete\n    -   clean up\n-   0.1 (24 November 2015):\n    -   initial release\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmsprev%2Fpandocinject","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmsprev%2Fpandocinject","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmsprev%2Fpandocinject/lists"}