{"id":27185953,"url":"https://github.com/bdusell/nondeterministic-stack-rnn","last_synced_at":"2025-08-12T21:14:06.196Z","repository":{"id":73626693,"uuid":"301515190","full_name":"bdusell/nondeterministic-stack-rnn","owner":"bdusell","description":"Code for the paper \"The Surprising Computational Power of Nondeterministic Stack RNNs\" (DuSell and Chiang, 2023)","archived":false,"fork":false,"pushed_at":"2024-03-21T13:04:12.000Z","size":381,"stargazers_count":18,"open_issues_count":0,"forks_count":1,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-04-09T18:08:04.390Z","etag":null,"topics":["conll2020","deep-learning","iclr2022","iclr2023","language-model","machine-learning","neural-networks","nlp","pytorch","rnn","stack"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/bdusell.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null}},"created_at":"2020-10-05T19:16:52.000Z","updated_at":"2024-06-07T12:38:31.000Z","dependencies_parsed_at":"2023-09-02T02:24:30.335Z","dependency_job_id":null,"html_url":"https://github.com/bdusell/nondeterministic-stack-rnn","commit_stats":null,"previous_names":[],"tags_count":3,"template":false,"template_full_name":null,"purl":"pkg:github/bdusell/nondeterministic-stack-rnn","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bdusell%2Fnondeterministic-stack-rnn","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bdusell%2Fnondeterministic-stack-rnn/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bdusell%2Fnond
eterministic-stack-rnn/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bdusell%2Fnondeterministic-stack-rnn/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/bdusell","download_url":"https://codeload.github.com/bdusell/nondeterministic-stack-rnn/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bdusell%2Fnondeterministic-stack-rnn/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":270135397,"owners_count":24533279,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-08-12T02:00:09.011Z","response_time":80,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["conll2020","deep-learning","iclr2022","iclr2023","language-model","machine-learning","neural-networks","nlp","pytorch","rnn","stack"],"created_at":"2025-04-09T17:55:35.105Z","updated_at":"2025-08-12T21:14:06.172Z","avatar_url":"https://github.com/bdusell.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# The Surprising Computational Power of Nondeterministic Stack RNNs\n\nThis repository contains the code for the paper\n[\"The Surprising Computational Power of Nondeterministic Stack RNNs\"](https://openreview.net/forum?id=o58JtGDs6y)\n(DuSell and Chiang, 2023).\nIt includes all of the code necessary to reproduce the experiments and figures\nused in the paper, as 
well as a Docker image definition that can be used to\nreplicate the software environment it was developed in.\n\nIf you are looking for the code for our earlier paper\n[\"Learning Hierarchical Structures with Differentiable Nondeterministic Stacks\"](https://openreview.net/forum?id=5LXw_QplBiF)\n(DuSell and Chiang, 2022), please see\n[this release](https://github.com/bdusell/nondeterministic-stack-rnn/tree/iclr2022).\n\nIf you are looking for the code for our earlier paper\n[\"Learning Context-free Languages with Nondeterministic Stack RNNs\"](https://aclanthology.org/2020.conll-1.41/)\n(DuSell and Chiang, 2020), please see\n[this release](https://github.com/bdusell/nondeterministic-stack-rnn/tree/conll2020).\n\nThis repository includes PyTorch implementations of the following models:\n\n* The\n  [Renormalizing Nondeterministic Stack RNN (RNS-RNN)](src/stack_rnn_models/nondeterministic_stack.py),\n  including the minor changes described in this paper (bottom symbol fix and\n  asymptotic speedup).\n* The \n  [Vector RNS-RNN (VRNS-RNN)](src/stack_rnn_models/vector_nondeterministic_stack.py)\n  introduced in this paper.\n* Memory-limited versions of the\n  [RNS-RNN](src/stack_rnn_models/limited_nondeterministic_stack.py)\n  and\n  [VRNS-RNN](src/stack_rnn_models/limited_vector_nondeterministic_stack.py)\n  that run in linear time, which is useful for natural language modeling.\n* The\n  [superposition stack RNN](src/stack_rnn_models/joulin_mikolov.py)\n  from\n  [\"Inferring Algorithmic Patterns with Stack-Augmented Recurrent Nets\"](https://proceedings.neurips.cc/paper/2015/file/26657d5ff9020d2abefe558796b99584-Paper.pdf) (Joulin and Mikolov, 2015).\n* The\n  [stratification stack RNN](src/stack_rnn_models/grefenstette.py)\n  from\n  [\"Learning to Transduce with Unbounded Memory\"](https://proceedings.neurips.cc/paper/2015/file/b9d487a30398d42ecff55c228ed5652b-Paper.pdf) (Grefenstette et al., 2015).\n\n## Directory Structure\n\n* `data/`: Contains datasets used for 
experiments, namely the PTB language\n  modeling dataset.\n* `experiments/`: Contains scripts for reproducing all of the experiments and\n  figures presented in the paper.\n  * `capacity/`: Scripts for the capacity experiments in Section 5.\n  * `non-cfls/`: Scripts for the non-CFL experiments in Section 4.\n  * `ptb/`: Scripts for the PTB language modeling experiments in Section 6.\n* `scripts/`: Contains helper scripts for setting up the software environment,\n  building container images, running containers, installing Python packages,\n  preprocessing data, etc. Instructions for using these scripts are below.\n* `src/`: Contains source code for all models, training routines, plotting\n  scripts, etc.\n* `tests/`: Contains unit tests for the code under `src/`.\n\n## Installation and Setup\n\nIn order to foster reproducibility, the code for this paper was developed and\nrun inside of a [Docker](https://www.docker.com/) container defined in the file\n[`Dockerfile-dev`](Dockerfile-dev). To run this code, you can build the\nDocker image yourself and run it using Docker. Or, if you don't feel like\ninstalling Docker, you can simply use `Dockerfile-dev` as a reference for\nsetting up the software environment on your own system. 
You can also build\nan equivalent [Singularity](https://sylabs.io/docs/#singularity) image which\ncan be used on an HPC cluster, where it is likely that Docker is not available\nbut Singularity is.\n\nIn any case, it is highly recommended to run most experiments on a machine with\naccess to an NVIDIA GPU so that they finish within a reasonable amount of time.\nThe exceptions are the experiments for the baseline models (LSTM,\nsuperposition stack LSTM, and stratification stack LSTM) on the formal language\nmodeling tasks, which finish more quickly on CPU than on GPU and should\nbe run in CPU mode.\n\n### Using Docker\n\nIn order to use the Docker image, you must first\n[install Docker](https://www.docker.com/get-started).\nIf you intend to run any experiments on a GPU, you must also ensure that your\nNVIDIA driver is set up properly and install the\n[NVIDIA Container Toolkit](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html).\n\nIn order to automatically pull the public Docker image, start the container,\nand open up a bash shell inside of it, run\n\n    $ bash scripts/docker-shell.bash --pull\n\nIf you prefer to build the image from scratch yourself, you can run\n\n    $ bash scripts/docker-shell.bash --build\n\nAfter you have built the image once, there is no need to do so again, so\nafterwards you can simply run\n\n    $ bash scripts/docker-shell.bash\n\nBy default, this script starts the container in GPU mode, which will fail if\nyou are not running on a machine with a GPU. 
If you only want to run things in\nCPU mode, you can run\n\n    $ bash scripts/docker-shell.bash --cpu\n\nYou can combine this with the `--pull` or `--build` options.\n\n### Using Singularity\n\nIf you use a shared HPC cluster at your institution, it might not support\nDocker, but there's a chance it does support Singularity, which is an\nalternative container runtime that is more suitable for shared computing\nenvironments.\n\nIn order to run the code in a Singularity container, you must first obtain the\nDocker image and then convert it to a `.sif` (Singularity image) file on a\nmachine where you have root access (e.g. your personal computer or\nworkstation). This requires installing both Docker and\n[Singularity](https://docs.sylabs.io/guides/latest/user-guide/quick_start.html)\non that machine. Assuming you have already built the Docker image according to\nthe instructions above, you can use the following to create the `.sif` file:\n\n    $ bash scripts/build-singularity-image.bash\n\nThis will create the file `nondeterministic-stack-rnn-2023.sif`. It is normal\nfor this to take several minutes. Afterwards, you can upload the `.sif` file to\nyour HPC cluster and use it there.\n\nYou can open a shell in the Singularity container using\n\n    $ bash scripts/singularity-shell.bash\n\nThis works on machines with or without an NVIDIA GPU, although it\nprints a warning if there is no GPU.\n\nYou can find a more general tutorial on Singularity\n[here](https://github.com/bdusell/singularity-tutorial).\n\n### Additional Setup\n\nWhatever method you use to run the code (whether in a Docker container,\nSingularity container, or no container), there are some additional setup and\npreprocessing steps you need to run. 
The following script will take care of\nthese for you (if you are using a container, you must run this *inside the\ncontainer shell*):\n\n    $ bash scripts/setup.bash\n\nMore specifically, this script:\n\n* Installs the Python packages required by our code, which will be stored in\n  the local directory rather than system-wide. We use the package manager\n  [Poetry](https://python-poetry.org/) to manage Python packages.\n* Downloads and preprocesses the Penn Treebank language modeling dataset.\n\n## Running Code\n\nAll files under `src/` should be run using `poetry` so they have access to the\nPython packages provided by the Poetry package manager. This means you should\neither prefix all of your commands with `poetry run` or run `poetry shell`\nbeforehand to enter a shell with Poetry's virtualenv enabled all the time. You\nshould run both Python and Bash scripts with Poetry, because the Bash scripts\nmight call out to Python scripts. All Bash scripts under `src/` should be run\nwith `src/` as the current working directory.\n\nAll scripts under `scripts/` should be run with the top-level directory as the\ncurrent working directory.\n\n## Running Experiments\n\nThe [`experiments/`](experiments) directory contains scripts for reproducing\nall of the experiments and plots presented in the paper. Some of these scripts\nare intended to be used to submit jobs to a computing cluster. They should be\nrun outside of the container. You will need to edit the file\n[`experiments/submit-job.bash`](experiments/submit-job.bash)\nto tailor it to your specific computing cluster. 
Other scripts are for plotting\nor printing tables and should be run inside the container.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbdusell%2Fnondeterministic-stack-rnn","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fbdusell%2Fnondeterministic-stack-rnn","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbdusell%2Fnondeterministic-stack-rnn/lists"}