{"id":33126549,"url":"https://github.com/ixaxaar/pytorch-dnc","last_synced_at":"2026-01-18T11:03:42.863Z","repository":{"id":50377524,"uuid":"108423989","full_name":"ixaxaar/pytorch-dnc","owner":"ixaxaar","description":"Differentiable Neural Computers, Sparse Access Memory and Sparse Differentiable Neural Computers, for Pytorch","archived":false,"fork":false,"pushed_at":"2025-06-22T14:09:46.000Z","size":885,"stargazers_count":347,"open_issues_count":7,"forks_count":59,"subscribers_count":17,"default_branch":"master","last_synced_at":"2025-11-16T01:03:40.312Z","etag":null,"topics":["differentiable-neural-computers","dnc","pytorch","rnn"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ixaxaar.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2017-10-26T14:39:56.000Z","updated_at":"2025-11-10T09:23:27.000Z","dependencies_parsed_at":"2025-02-23T09:24:37.298Z","dependency_job_id":"ec6dd94b-37ac-4094-93bd-19d876cd3216","html_url":"https://github.com/ixaxaar/pytorch-dnc","commit_stats":null,"previous_names":[],"tags_count":9,"template":false,"template_full_name":null,"purl":"pkg:github/ixaxaar/pytorch-dnc","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ixaxaar%2Fpytorch-dnc","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ixaxaar%2Fpytorch-dnc/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ixaxaar%2Fpytorch-dnc/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/ho
sts/GitHub/repositories/ixaxaar%2Fpytorch-dnc/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/ixaxaar","download_url":"https://codeload.github.com/ixaxaar/pytorch-dnc/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ixaxaar%2Fpytorch-dnc/sbom","scorecard":{"id":498979,"data":{"date":"2025-08-11","repo":{"name":"github.com/ixaxaar/pytorch-dnc","commit":"cf6f6825a814921dc313619c38ad6c8fa8b538b9"},"scorecard":{"version":"v5.2.1-40-gf6ed084d","commit":"f6ed084d17c9236477efd66e5b258b9d4cc7b389"},"score":4,"checks":[{"name":"Dangerous-Workflow","score":-1,"reason":"no workflows found","details":null,"documentation":{"short":"Determines if the project's GitHub Action workflows avoid dangerous patterns.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#dangerous-workflow"}},{"name":"Maintained","score":5,"reason":"5 commit(s) and 1 issue activity found in the last 90 days -- score normalized to 5","details":null,"documentation":{"short":"Determines if the project is \"actively maintained\".","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#maintained"}},{"name":"Packaging","score":-1,"reason":"packaging workflow not detected","details":["Warn: no GitHub/GitLab publishing workflow detected."],"documentation":{"short":"Determines if the project is published as a package that others can easily download, install, easily update, and uninstall.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#packaging"}},{"name":"Token-Permissions","score":-1,"reason":"No tokens found","details":null,"documentation":{"short":"Determines if the project's workflows follow the principle of least 
privilege.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#token-permissions"}},{"name":"Code-Review","score":0,"reason":"Found 0/3 approved changesets -- score normalized to 0","details":null,"documentation":{"short":"Determines if the project requires human code review before pull requests (aka merge requests) are merged.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#code-review"}},{"name":"Binary-Artifacts","score":10,"reason":"no binaries found in the repo","details":null,"documentation":{"short":"Determines if the project has generated executable (binary) artifacts in the source repository.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#binary-artifacts"}},{"name":"Pinned-Dependencies","score":-1,"reason":"no dependencies found","details":null,"documentation":{"short":"Determines if the project has declared and pinned the dependencies of its build process.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#pinned-dependencies"}},{"name":"CII-Best-Practices","score":0,"reason":"no effort to earn an OpenSSF best practices badge detected","details":null,"documentation":{"short":"Determines if the project has an OpenSSF (formerly CII) Best Practices Badge.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#cii-best-practices"}},{"name":"Security-Policy","score":0,"reason":"security policy file not detected","details":["Warn: no security policy file detected","Warn: no security file to analyze","Warn: no security file to analyze","Warn: no security file to analyze"],"documentation":{"short":"Determines if the project has published a security 
policy.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#security-policy"}},{"name":"Fuzzing","score":0,"reason":"project is not fuzzed","details":["Warn: no fuzzer integrations found"],"documentation":{"short":"Determines if the project uses fuzzing.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#fuzzing"}},{"name":"License","score":10,"reason":"license file detected","details":["Info: project has a license file: LICENSE:0","Info: FSF or OSI recognized license: MIT License: LICENSE:0"],"documentation":{"short":"Determines if the project has defined a license.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#license"}},{"name":"Signed-Releases","score":-1,"reason":"no releases found","details":null,"documentation":{"short":"Determines if the project cryptographically signs release artifacts.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#signed-releases"}},{"name":"Branch-Protection","score":3,"reason":"branch protection is not maximal on development and all release branches","details":["Info: 'allow deletion' disabled on branch 'master'","Info: 'force pushes' disabled on branch 'master'","Info: 'branch protection settings apply to administrators' is required to merge on branch 'master'","Warn: 'stale review dismissal' is disabled on branch 'master'","Warn: branch 'master' does not require approvers","Warn: codeowners review is not required on branch 'master'","Warn: 'last push approval' is disabled on branch 'master'","Warn: no status checks found to merge onto branch 'master'","Info: PRs are required in order to make changes on branch 'master'"],"documentation":{"short":"Determines if the default and release branches are protected with GitHub's branch protection 
settings.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#branch-protection"}},{"name":"SAST","score":0,"reason":"SAST tool is not run on all commits -- score normalized to 0","details":["Warn: 0 commits out of 30 are checked with a SAST tool"],"documentation":{"short":"Determines if the project uses static code analysis.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#sast"}},{"name":"Vulnerabilities","score":9,"reason":"1 existing vulnerabilities detected","details":["Warn: Project is vulnerable to: GHSA-887c-mr87-cxwp"],"documentation":{"short":"Determines if the project has open, known unfixed vulnerabilities.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#vulnerabilities"}}]},"last_synced_at":"2025-08-19T21:18:38.188Z","repository_id":50377524,"created_at":"2025-08-19T21:18:38.188Z","updated_at":"2025-08-19T21:18:38.188Z"},"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":28535156,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-01-18T10:13:46.436Z","status":"ssl_error","status_checked_at":"2026-01-18T10:13:11.045Z","response_time":98,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.6:443 state=error: unexpected eof while 
reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["differentiable-neural-computers","dnc","pytorch","rnn"],"created_at":"2025-11-15T07:00:27.883Z","updated_at":"2026-01-18T11:03:42.851Z","avatar_url":"https://github.com/ixaxaar.png","language":"Python","readme":"# Differentiable Neural Computers and family, for Pytorch\n\nIncludes:\n\n1. Differentiable Neural Computers (DNC)\n2. Sparse Access Memory (SAM)\n3. Sparse Differentiable Neural Computers (SDNC)\n\n\u003c!-- START doctoc generated TOC please keep comment here to allow auto update --\u003e\n\u003c!-- DON'T EDIT THIS SECTION, INSTEAD RE-RUN doctoc TO UPDATE --\u003e\n\n## Table of Contents\n\n- [Differentiable Neural Computers and family, for Pytorch](#differentiable-neural-computers-and-family-for-pytorch)\n  - [Table of Contents](#table-of-contents)\n  - [Install](#install)\n    - [From source](#from-source)\n  - [Architecture](#architecture)\n  - [Usage](#usage)\n    - [DNC](#dnc)\n      - [Example usage](#example-usage)\n      - [Debugging](#debugging)\n    - [SDNC](#sdnc)\n      - [Example usage](#example-usage-1)\n      - [Debugging](#debugging-1)\n    - [SAM](#sam)\n      - [Example usage](#example-usage-2)\n      - [Debugging](#debugging-2)\n  - [Tasks](#tasks)\n    - [Copy task (with curriculum and generalization)](#copy-task-with-curriculum-and-generalization)\n    - [Generalizing Addition task](#generalizing-addition-task)\n    - [Generalizing Argmax task](#generalizing-argmax-task)\n  - [Code Structure](#code-structure)\n  - [General noteworthy 
stuff](#general-noteworthy-stuff)\n    - [FAISS Installation Options](#faiss-installation-options)\n    - [Troubleshooting](#troubleshooting)\n\n\u003c!-- END doctoc generated TOC please keep comment here to allow auto update --\u003e\n\n[![Build Status](https://travis-ci.org/ixaxaar/pytorch-dnc.svg?branch=master)](https://travis-ci.org/ixaxaar/pytorch-dnc) [![PyPI version](https://badge.fury.io/py/dnc.svg)](https://badge.fury.io/py/dnc)\n\nThis is an implementation of [Differentiable Neural Computers](http://people.idsia.ch/~rupesh/rnnsymposium2016/slides/graves.pdf), described in the paper [Hybrid computing using a neural network with dynamic external memory, Graves et al.](https://www.nature.com/articles/nature20101),\nand of Sparse DNCs (SDNCs) and Sparse Access Memory (SAM), described in [Scaling Memory-Augmented Neural Networks with Sparse Reads and Writes](http://papers.nips.cc/paper/6298-scaling-memory-augmented-neural-networks-with-sparse-reads-and-writes.pdf).\n\n## Install\n\n```bash\npip install dnc\n```\n\n### From source\n\n```bash\ngit clone https://github.com/ixaxaar/pytorch-dnc\ncd pytorch-dnc\npip install -r ./requirements.txt\npip install -e .\n```\n\nTo use fully GPU-based SDNCs or SAMs, install FAISS:\n\n```bash\nconda install faiss-gpu -c pytorch\n```\n\n`pytest` is required to run the tests.\n\n## Architecture\n\n\u003cimg src=\"./docs/dnc.png\" height=\"600\" /\u003e\n\n## Usage\n\n### DNC\n\n**Constructor Parameters**:\n\nFollowing are the constructor parameters:\n\n| Argument | Default | Description |\n| ------------------- | -------- | ------------------------------------------------------------------------------- |\n| input_size | `None` | Size of the input vectors |\n| hidden_size | `None` | Size of hidden units
|\n| rnn_type | `'lstm'` | Type of recurrent cells used in the controller |\n| num_layers | `1` | Number of layers of recurrent units in the controller |\n| num_hidden_layers | `2` | Number of hidden layers per layer of the controller |\n| bias | `True` | Whether to use bias in the controller |\n| batch_first | `True` | Whether data is fed batch first |\n| dropout | `0` | Dropout between layers in the controller |\n| bidirectional | `False` | Whether the controller is bidirectional (not yet implemented) |\n| nr_cells | `5` | Number of memory cells |\n| read_heads | `2` | Number of read heads |\n| cell_size | `10` | Size of each memory cell |\n| nonlinearity | `'tanh'` | If using 'rnn' as `rnn_type`, non-linearity of the RNNs |\n| device | `None` | PyTorch device object (e.g., `torch.device('cuda:0')` or `torch.device('cpu')`) |\n| independent_linears | `False` | Whether to use independent linear units to derive the interface vector |\n| share_memory | `True` | Whether to share memory between controller layers |\n\nFollowing are the forward pass parameters:\n\n| Argument | Default | Description |\n| ------------------- | ------------------ | ---------------------------------------------------------------- |\n| input | - | The 
input vector `(B*T*X)` or `(T*B*X)` |\n| hidden | `(None,None,None)` | Hidden states `(controller hidden, memory hidden, read vectors)` |\n| reset_experience | `False` | Whether to reset memory |\n| pass_through_memory | `True` | Whether to pass through memory |\n\n#### Example usage\n\n```python\nimport torch\n\nfrom dnc import DNC\n\nrnn = DNC(\n  input_size=64,\n  hidden_size=128,\n  rnn_type='lstm',\n  num_layers=4,\n  nr_cells=100,\n  cell_size=32,\n  read_heads=4,\n  batch_first=True,\n  device=torch.device('cuda:0')\n)\n\n(controller_hidden, memory, read_vectors) = (None, None, None)\n\noutput, (controller_hidden, memory, read_vectors) = \\\n  rnn(torch.randn(10, 4, 64), (controller_hidden, memory, read_vectors), reset_experience=True)\n```\n\n#### Debugging\n\nThe `debug` option causes the network to also return its memory hidden vectors (numpy `ndarray`s) for the first batch at each forward step.\nThese vectors can be analyzed or visualized, for example with visdom.\n\n```python\nimport torch\n\nfrom dnc import DNC\n\nrnn = DNC(\n  input_size=64,\n  hidden_size=128,\n  rnn_type='lstm',\n  num_layers=4,\n  nr_cells=100,\n  cell_size=32,\n  read_heads=4,\n  batch_first=True,\n  device=torch.device('cuda:0'),\n  debug=True\n)\n\n(controller_hidden, memory, read_vectors) = (None, None, None)\n\noutput, (controller_hidden, memory, read_vectors), debug_memory = \\\n  rnn(torch.randn(10, 4, 64), (controller_hidden, memory, read_vectors), reset_experience=True)\n```\n\nMemory vectors returned by forward pass (`np.ndarray`):\n\n| Key | Y axis (dimensions) | X axis (dimensions) |\n| ------------------------------- | ------------------- | ---------------------- |\n| `debug_memory['memory']` | layer \\* time | nr_cells \\* cell_size |\n| `debug_memory['link_matrix']` | layer \\* time | nr_cells \\* nr_cells |\n| 
`debug_memory['precedence']` | layer \\* time | nr_cells |\n| `debug_memory['read_weights']` | layer \\* time | read_heads \\* nr_cells |\n| `debug_memory['write_weights']` | layer \\* time | nr_cells |\n| `debug_memory['usage_vector']` | layer \\* time | nr_cells |\n\n### SDNC\n\n**Constructor Parameters**:\n\nFollowing are the constructor parameters:\n\n| Argument | Default | Description |\n| ------------------- | -------- | ------------------------------------------------------------------------------- |\n| input_size | `None` | Size of the input vectors |\n| hidden_size | `None` | Size of hidden units |\n| rnn_type | `'lstm'` | Type of recurrent cells used in the controller |\n| num_layers | `1` | Number of layers of recurrent units in the controller |\n| num_hidden_layers | `2` | Number of hidden layers per layer of the controller |\n| bias | `True` | Whether to use bias in the controller |\n| batch_first | `True` | Whether data is fed batch first |\n| dropout | `0` | Dropout between layers in the controller |\n| bidirectional | `False` | Whether the controller is bidirectional (not yet implemented) |\n| nr_cells | `5000` | Number of memory cells |\n| read_heads | `4` | Number of read heads |\n| sparse_reads | `4` | 
Number of sparse memory reads per read head |\n| temporal_reads | `4` | Number of temporal reads |\n| cell_size | `10` | Size of each memory cell |\n| nonlinearity | `'tanh'` | If using 'rnn' as `rnn_type`, non-linearity of the RNNs |\n| device | `None` | PyTorch device object (e.g., `torch.device('cuda:0')` or `torch.device('cpu')`) |\n| independent_linears | `False` | Whether to use independent linear units to derive the interface vector |\n| share_memory | `True` | Whether to share memory between controller layers |\n\nFollowing are the forward pass parameters:\n\n| Argument | Default | Description |\n| ------------------- | ------------------ | ---------------------------------------------------------------- |\n| input | - | The input vector `(B*T*X)` or `(T*B*X)` |\n| hidden | `(None,None,None)` | Hidden states `(controller hidden, memory hidden, read vectors)` |\n| reset_experience | `False` | Whether to reset memory |\n| pass_through_memory | `True` | Whether to pass through memory |\n\n#### Example usage\n\n```python\nimport torch\n\nfrom dnc import SDNC\n\nrnn = SDNC(\n  input_size=64,\n  hidden_size=128,\n  rnn_type='lstm',\n  num_layers=4,\n  nr_cells=100,\n  cell_size=32,\n  read_heads=4,\n  sparse_reads=4,\n  batch_first=True,\n  device=torch.device('cuda:0')\n)\n\n(controller_hidden, memory, read_vectors) = (None, None, None)\n\noutput, (controller_hidden, memory, read_vectors) = \\\n  rnn(torch.randn(10, 4, 64), (controller_hidden, memory, read_vectors), 
reset_experience=True)\n```\n\n#### Debugging\n\nThe `debug` option causes the network to also return its memory hidden vectors (numpy `ndarray`s) for the first batch at each forward step.\nThese vectors can be analyzed or visualized, for example with visdom.\n\n```python\nimport torch\n\nfrom dnc import SDNC\n\nrnn = SDNC(\n  input_size=64,\n  hidden_size=128,\n  rnn_type='lstm',\n  num_layers=4,\n  nr_cells=100,\n  cell_size=32,\n  read_heads=4,\n  batch_first=True,\n  sparse_reads=4,\n  temporal_reads=4,\n  device=torch.device('cuda:0'),\n  debug=True\n)\n\n(controller_hidden, memory, read_vectors) = (None, None, None)\n\noutput, (controller_hidden, memory, read_vectors), debug_memory = \\\n  rnn(torch.randn(10, 4, 64), (controller_hidden, memory, read_vectors), reset_experience=True)\n```\n\nMemory vectors returned by forward pass (`np.ndarray`):\n\n| Key | Y axis (dimensions) | X axis (dimensions) |\n| --------------------------------- | ------------------- | ------------------------------------------------------------------ |\n| `debug_memory['memory']` | layer \\* time | nr_cells \\* cell_size |\n| `debug_memory['visible_memory']` | layer \\* time | (sparse_reads+2\\*temporal_reads+1) \\* nr_cells |\n| `debug_memory['read_positions']` | layer \\* time | sparse_reads+2\\*temporal_reads+1 |\n| `debug_memory['link_matrix']` | layer \\* time | (sparse_reads+2\\*temporal_reads+1) \\* (sparse_reads+2\\*temporal_reads+1) |\n| `debug_memory['rev_link_matrix']` | layer \\* time | (sparse_reads+2\\*temporal_reads+1) \\* (sparse_reads+2\\*temporal_reads+1) |\n| `debug_memory['precedence']` | layer \\* time | nr_cells |\n| `debug_memory['read_weights']` | layer \\* time | read_heads \\* nr_cells 
|\n| `debug_memory['write_weights']` | layer \\* time | nr_cells |\n| `debug_memory['usage']` | layer \\* time | nr_cells |\n\n### SAM\n\n**Constructor Parameters**:\n\nFollowing are the constructor parameters:\n\n| Argument | Default | Description |\n| ------------------- | -------- | ------------------------------------------------------------------------------- |\n| input_size | `None` | Size of the input vectors |\n| hidden_size | `None` | Size of hidden units |\n| rnn_type | `'lstm'` | Type of recurrent cells used in the controller |\n| num_layers | `1` | Number of layers of recurrent units in the controller |\n| num_hidden_layers | `2` | Number of hidden layers per layer of the controller |\n| bias | `True` | Whether to use bias in the controller |\n| batch_first | `True` | Whether data is fed batch first |\n| dropout | `0` | Dropout between layers in the controller |\n| bidirectional | `False` | Whether the controller is bidirectional (not yet implemented) |\n| nr_cells | `5000` | Number of memory cells |\n| read_heads | `4` | Number of read heads |\n| sparse_reads | `4` | Number of sparse memory reads per read head 
|\n| cell_size | `10` | Size of each memory cell |\n| nonlinearity | `'tanh'` | If using 'rnn' as `rnn_type`, non-linearity of the RNNs |\n| device | `None` | PyTorch device object (e.g., `torch.device('cuda:0')` or `torch.device('cpu')`) |\n| independent_linears | `False` | Whether to use independent linear units to derive the interface vector |\n| share_memory | `True` | Whether to share memory between controller layers |\n\nFollowing are the forward pass parameters:\n\n| Argument | Default | Description |\n| ------------------- | ------------------ | ---------------------------------------------------------------- |\n| input | - | The input vector `(B*T*X)` or `(T*B*X)` |\n| hidden | `(None,None,None)` | Hidden states `(controller hidden, memory hidden, read vectors)` |\n| reset_experience | `False` | Whether to reset memory |\n| pass_through_memory | `True` | Whether to pass through memory |\n\n#### Example usage\n\n```python\nimport torch\n\nfrom dnc import SAM\n\nrnn = SAM(\n  input_size=64,\n  hidden_size=128,\n  rnn_type='lstm',\n  num_layers=4,\n  nr_cells=100,\n  cell_size=32,\n  read_heads=4,\n  sparse_reads=4,\n  batch_first=True,\n  device=torch.device('cuda:0')\n)\n\n(controller_hidden, memory, read_vectors) = (None, None, None)\n\noutput, (controller_hidden, memory, read_vectors) = \\\n  rnn(torch.randn(10, 4, 64), (controller_hidden, memory, read_vectors), reset_experience=True)\n```\n\n#### Debugging\n\nThe `debug` option causes the network to also return its memory hidden vectors (numpy `ndarray`s) for the first batch at each forward 
step.\nThese vectors can be analyzed or visualized, for example with visdom.\n\n```python\nimport torch\n\nfrom dnc import SAM\n\nrnn = SAM(\n  input_size=64,\n  hidden_size=128,\n  rnn_type='lstm',\n  num_layers=4,\n  nr_cells=100,\n  cell_size=32,\n  read_heads=4,\n  batch_first=True,\n  sparse_reads=4,\n  device=torch.device('cuda:0'),\n  debug=True\n)\n\n(controller_hidden, memory, read_vectors) = (None, None, None)\n\noutput, (controller_hidden, memory, read_vectors), debug_memory = \\\n  rnn(torch.randn(10, 4, 64), (controller_hidden, memory, read_vectors), reset_experience=True)\n```\n\nMemory vectors returned by forward pass (`np.ndarray`):\n\n| Key | Y axis (dimensions) | X axis (dimensions) |\n| -------------------------------- | ------------------- | ------------------------------------------ |\n| `debug_memory['memory']` | layer \\* time | nr_cells \\* cell_size |\n| `debug_memory['visible_memory']` | layer \\* time | (sparse_reads+2\\*temporal_reads+1) \\* nr_cells |\n| `debug_memory['read_positions']` | layer \\* time | sparse_reads+2\\*temporal_reads+1 |\n| `debug_memory['read_weights']` | layer \\* time | read_heads \\* nr_cells |\n| `debug_memory['write_weights']` | layer \\* time | nr_cells |\n| `debug_memory['usage']` | layer \\* time | nr_cells |\n\n## Tasks\n\n### Copy task (with curriculum and generalization)\n\nThe copy task, as described in the original paper, is included in the repo.\n\nFrom the project root:\n\n```bash\npython ./tasks/copy_task.py -cuda 0 -optim rmsprop -batch_size 32 -mem_slot 64 # (like original implementation)\n\npython ./tasks/copy_task.py -cuda 0 -lr 0.001 -rnn_type lstm -nlayer 1 -nhlayer 2 -dropout 0 -mem_slot 32 -batch_size 1000 -optim adam -sequence_max_length 8 # (faster convergence)\n\n# For SDNCs:\npython 
./tasks/copy_task.py -cuda 0 -lr 0.001 -rnn_type lstm -memory_type sdnc -nlayer 1 -nhlayer 2 -dropout 0 -mem_slot 100 -mem_size 10  -read_heads 1 -sparse_reads 10 -batch_size 20 -optim adam -sequence_max_length 10\n\n# Curriculum learning for SDNCs:\npython ./tasks/copy_task.py -cuda 0 -lr 0.001 -rnn_type lstm -memory_type sdnc -nlayer 1 -nhlayer 2 -dropout 0 -mem_slot 100 -mem_size 10  -read_heads 1 -sparse_reads 4 -temporal_reads 4 -batch_size 20 -optim adam -sequence_max_length 4 -curriculum_increment 2 -curriculum_freq 10000\n```\n\nFor the full set of options, see:\n\n```bash\npython ./tasks/copy_task.py --help\n```\n\nThe copy task can be used to debug memory using [Visdom](https://github.com/facebookresearch/visdom).\n\nAdditional step required:\n\n```bash\npip install visdom\npython -m visdom.server\n```\n\nOpen http://localhost:8097/ in your browser, and execute the copy task:\n\n```bash\npython ./tasks/copy_task.py -cuda 0\n```\n\nThe visdom dashboard shows memory as a heatmap for batch 0 every `-summarize_freq` iterations:\n\n![Visdom dashboard](./docs/dnc-mem-debug.png)\n\n### Generalizing Addition task\n\nThe adding task is as described in [this github pull request](https://github.com/Mostafa-Samir/DNC-tensorflow/pull/4#issue-199369192).\nThis task\n\n- creates one-hot vectors of size `input_size`, each representing a number,\n- feeds a sequence of them to the network, and\n- adds the decoded outputs to obtain the final sum.\n\nThe task first trains the network on sequences of length ~100, and then tests whether the network generalizes to lengths ~1000.\n\n```bash\npython ./tasks/adding_task.py -cuda 0 -lr 0.0001 -rnn_type lstm -memory_type sam -nlayer 1 -nhlayer 1 -nhid 100 -dropout 0 -mem_slot 1000 -mem_size 32 -read_heads 1 -sparse_reads 4 -batch_size 20 -optim rmsprop -input_size 3 -sequence_max_length 100\n```\n\n### Generalizing Argmax task\n\nThe second adding task is similar to the first one, except that the network's output at the last 
time step is expected to be the argmax of the input.\n\n```bash\npython ./tasks/argmax_task.py -cuda 0 -lr 0.0001 -rnn_type lstm -memory_type dnc -nlayer 1 -nhlayer 1 -nhid 100 -dropout 0 -mem_slot 100 -mem_size 10 -read_heads 2 -batch_size 1 -optim rmsprop -sequence_max_length 15 -input_size 10 -iterations 10000\n```\n\n## Code Structure\n\n1. DNCs:\n\n- [dnc/dnc.py](dnc/dnc.py) - Controller code.\n- [dnc/memory.py](dnc/memory.py) - Memory module.\n\n2. SDNCs:\n\n- [dnc/sdnc.py](dnc/sdnc.py) - Controller code, inherits [dnc.py](dnc/dnc.py).\n- [dnc/sparse_temporal_memory.py](dnc/sparse_temporal_memory.py) - Memory module.\n\n3. SAMs:\n\n- [dnc/sam.py](dnc/sam.py) - Controller code, inherits [dnc.py](dnc/dnc.py).\n- [dnc/sparse_memory.py](dnc/sparse_memory.py) - Memory module.\n\n4. Tests:\n\n- All tests are in [./tests](./tests) folder.\n\n## General noteworthy stuff\n\n### FAISS Installation Options\n\nFAISS can be installed in two ways:\n\n1. Using conda (quickest for most users):\n\n```bash\nconda install faiss-gpu -c pytorch\n```\n\n2. Using the custom build script (for better CUDA integration):\n\n```bash\n# Navigate to the scripts/faiss_build directory\ncd scripts/faiss_build\n# Run the build script (builds FAISS with CUDA and cuBLAS support)\n./build_faiss.sh\n```\n\nThe custom build script will compile FAISS with CUDA and cuBLAS support directly into your virtual environment, providing better performance for GPU-accelerated sparse memory operations.\n\nFAISS is much faster, has a GPU implementation, and is interoperable with PyTorch tensors. Recent updates have improved CUDA support for better performance on GPU hardware.\n\n### Troubleshooting\n\n1. `nan`s in the gradients are common, try with different batch sizes\n2. If you encounter CUDA-related issues with FAISS, try using the custom build script mentioned above\n3. 
Recent bug fixes have addressed several stability issues\n\nRepos referred to for creation of this repo:\n\n- [deepmind/dnc](https://github.com/deepmind/dnc)\n- [ypxie/pytorch-NeuCom](https://github.com/ypxie/pytorch-NeuCom)\n- [jingweiz/pytorch-dnc](https://github.com/jingweiz/pytorch-dnc)\n","funding_links":[],"categories":["Paper implementations｜论文实现","Paper implementations","Python"],"sub_categories":["Other libraries｜其他库:","Other libraries:"],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fixaxaar%2Fpytorch-dnc","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fixaxaar%2Fpytorch-dnc","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fixaxaar%2Fpytorch-dnc/lists"}