{"id":20425421,"url":"https://github.com/catseye/dissociated-parse","last_synced_at":"2026-05-10T15:47:01.322Z","repository":{"id":142239905,"uuid":"428042076","full_name":"catseye/Dissociated-Parse","owner":"catseye","description":"MIRROR of https://codeberg.org/catseye/Dissociated-Parse : Adapting the \"Dissociated Press\" algorithm to parse trees, for NaNoGenMo 2021","archived":false,"fork":false,"pushed_at":"2021-12-07T08:28:10.000Z","size":99,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":3,"default_branch":"master","last_synced_at":"2025-01-15T15:12:49.242Z","etag":null,"topics":["dissociated-press","generative-text","markov-chain","nonsense","parse-trees","procedural-generation"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/catseye.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2021-11-14T21:09:31.000Z","updated_at":"2024-03-04T10:06:05.000Z","dependencies_parsed_at":"2023-03-23T10:29:22.635Z","dependency_job_id":null,"html_url":"https://github.com/catseye/Dissociated-Parse","commit_stats":null,"previous_names":[],"tags_count":1,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/catseye%2FDissociated-Parse","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/catseye%2FDissociated-Parse/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/catseye%2FDissociated-Parse/releases","manifests_url":"https://repos.ecosyste.ms/api/v1
/hosts/GitHub/repositories/catseye%2FDissociated-Parse/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/catseye","download_url":"https://codeload.github.com/catseye/Dissociated-Parse/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":241967056,"owners_count":20050330,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["dissociated-press","generative-text","markov-chain","nonsense","parse-trees","procedural-generation"],"created_at":"2024-11-15T07:13:15.059Z","updated_at":"2026-05-10T15:46:56.281Z","avatar_url":"https://github.com/catseye.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"Dissociated Parse\n=================\n\n_See also:_ [The Swallows](https://github.com/catseye/The-Swallows#readme) (2013)\n∘ [NaNoGenLab](https://github.com/catseye/NaNoGenLab#readme) (2014)\n∘ [MARYSUE](https://github.com/catseye/MARYSUE#readme) (2015)\n∘ [2017 Entries](https://github.com/catseye/NaNoGenMo-Entries-2017#readme)\n∘ [2018 Entries](https://github.com/catseye/NaNoGenMo-Entries-2018#readme)\n∘ [2019 Entries](https://github.com/catseye/NaNoGenMo-Entries-2019#readme)\n\n- - - -\n\nA submission for NaNoGenMo 2021 (Issue [#62][]).\n\nIt's well known that Markov chains don't understand grammar; any sequences\nin the output that might look grammatical are only there because\ngrammatical-looking sequences are statistically likely.\n\nThis is an experimental variation on a Markov generator that _does_ retain\nsome of the syntactic structure of the original 
text.\n\nTurns out we can run the Dissociated Press algorithm, not just on a list\nof words like usual, but on a forest of parse trees.  I call this variation\n**Dissociated Parse**.\n\nFor more information on the technique, see\n[the algorithm section](#the-algorithm) below.\n\nA 50,277-word novel generated using this technique can be found here:\n[The Lion, the Witches, and the Weird Road](generated/The%20Lion,%20the%20Witches,%20and%20the%20Weird%20Road.md).\n\n## To run\n\nTo replicate what's been generated here, you'll need to have installed:\n\n*   Python 3.  (I used the version that ships with Ubuntu 20.04,\n    which is Python 3.8.10.)\n*   The `link-parser` executable.  Sources can be found on GitHub:\n    [opencog/link-grammar](https://github.com/opencog/link-grammar).\n    I built it from source.  YMMV.\n*   `t-rext`.  The latest version runs only under Python 2.7.  I ran\n    it from a Docker container: https://hub.docker.com/r/catseye/t-rext\n*   `pandoc`.  This was just to make the HTML version from the generated\n    Markdown.  
I installed it using `apt`.\n\nAfter you have the executables, you can:\n\n    virtualenv --python=python3.8 venv\n    source venv/bin/activate\n    pip install -r requirements.txt\n\n    mkdir -p download data\n    ./01_fetch.py\n    ./02_scrape.py download/*\n    ./03_sentencify.py\n    ./04_parse.py                # this one will take a while\n    ./05_build.py\n    ./06_traverse.py 'Give your Novel a Title Here' \u003eout.md\n    wc -w out.md\n    t-rext out.md \u003e out2.md\n    python3 cleanup.py out2.md \u003e 'Your Novel.md'\n    pandoc --from=markdown --to=html5 \u003c'Your Novel.md' \u003e'Your Novel.html'\n    firefox 'Your Novel.html'\n\n## The algorithm\n\nFor background, here is a description of the Dissociated Press algorithm.\n\nJust as there is more than one algorithm for sorting, there is more than\none algorithm for generating a Markov chain.\n\nThe usual algorithm involves analyzing the source text and building\na weighted transition table, then finding a random (but likely) path\nthrough this table.\n\nThe probably lesser-known algorithm called [Dissociated Press][] goes\nlike this:\n\n1. load all the words into a list in memory\n2. select some word as the starting word\n3. print the current word\n4. find all occurrences of the current word in the text\n5. select one of those occurrences at random and jump to it\n6. move to the next word in the text\n7. repeat from step 3\n\nEven though this works rather differently from the transition table\nalgorithm, it produces the same result.  (Exercise for the reader:\nconvince yourself that it does in fact produce the same result.)\n\nOne downside of this algorithm is that it requires the entire text\nbe kept in memory, rather than just a transition table.  
But, this\nis also an upside, in the sense that variations on the algorithm can\nexploit structure in the text which would not be retained in\nthe transition table.\n\nNow, Dissociated Parse adapts this to work recursively on parse trees.\nConsider a parse tree to consist of a word, a part-of-speech tag,\nand zero or more child trees.  Here is a sketch of the algorithm:\n\n    traverse(tree):\n        1. find all trees that have the same part-of-speech tag and first word as the current tree\n        2. select one of those trees at random and use that tree as the current tree\n        3. print the word of the current tree\n        4. for each child of the current tree, traverse(child)\n\n## Related work\n\nA previous experiment in adding structure to Markov chains, also during NaNoGenMo, was\n[Anne of Green Garbles][], which showed that one can combine two Markov models\nto obtain a third model where the generation can switch between discrete states\n(like \"in narration\" and \"in dialogue\").\n\n[#62]: https://github.com/NaNoGenMo/2021/issues/62\n[Dissociated Press]: https://en.wikipedia.org/wiki/Dissociated_press\n[Anne of Green Garbles]: https://github.com/catseye/NaNoGenMo-Entries-2019/tree/master/Anne%20of%20Green%20Garbles#readme\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcatseye%2Fdissociated-parse","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fcatseye%2Fdissociated-parse","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcatseye%2Fdissociated-parse/lists"}