{"id":22886612,"url":"https://github.com/zenodeapp/substitution-matrices","last_synced_at":"2025-03-31T18:45:30.200Z","repository":{"id":61710346,"uuid":"554345279","full_name":"zenodeapp/substitution-matrices","owner":"zenodeapp","description":"A CRUD for substitution matrices like BLOSUM50, BLOSUM62, PAM250 and more; commonly used in Bioinformatics and Evolutionary Biology.","archived":false,"fork":false,"pushed_at":"2023-12-07T20:05:51.000Z","size":131,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-02-06T23:19:15.368Z","etag":null,"topics":["bioinformatics","crud","hardhat","matrix","nucleotide-sequences","protein-sequences","solidity","substitution-matrices"],"latest_commit_sha":null,"homepage":"","language":"Solidity","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/zenodeapp.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"LICENSE.md","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2022-10-19T16:53:19.000Z","updated_at":"2022-10-23T06:54:24.000Z","dependencies_parsed_at":"2023-01-21T03:16:47.294Z","dependency_job_id":null,"html_url":"https://github.com/zenodeapp/substitution-matrices","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/zenodeapp%2Fsubstitution-matrices","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/zenodeapp%2Fsubstitution-matrices/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/zenodeapp%2Fsubstitution-matrices/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/reposit
ories/zenodeapp%2Fsubstitution-matrices/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/zenodeapp","download_url":"https://codeload.github.com/zenodeapp/substitution-matrices/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":246523089,"owners_count":20791431,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["bioinformatics","crud","hardhat","matrix","nucleotide-sequences","protein-sequences","solidity","substitution-matrices"],"created_at":"2024-12-13T20:19:30.460Z","updated_at":"2025-03-31T18:45:30.177Z","avatar_url":"https://github.com/zenodeapp.png","language":"Solidity","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Substitution Matrices\n\nA CRUD for substitution matrices like BLOSUM50, BLOSUM62, PAM250 and more; commonly used in Bioinformatics and Evolutionary Biology.\n\nThis has been built by ZENODE within the Hardhat environment and is licensed under the MIT-license (see [LICENSE.md](./LICENSE.md)).\n\n## Overview\n\n### Dependencies\n\n- `hardhat` (npm module)\n- `web3` (npm module)\n- Uses the [`zenode-contracts`](/submodules) repository, which is automatically included as a Git submodule.\n\n### Features\n\n- CRUD in Solidity; immutable code, but flexible by design.\n- Modular; loose coupling and high cohesion promote easy implementation into other contracts.\n- Re-usable; deploy only once and use in multiple contracts.\n- Ownership; access control and administrative privilege management.\n\n### Dataset\n\n- [AA](dataset/alphabets/aa.txt) (Amino acids; 
alphabet for Proteins)\n  - [BLOSUM50](dataset/matrices/aa/blosum50.txt)\n  - [BLOSUM62](dataset/matrices/aa/blosum62.txt)\n  - [PAM40](dataset/matrices/aa/pam40.txt)\n  - [PAM120](dataset/matrices/aa/pam120.txt)\n  - [PAM250](dataset/matrices/aa/pam250.txt)\n- [NT](dataset/alphabets/nt.txt) (Nucleotides; alphabet for DNA — also known as the 'Nucleic acid notation')\n  - [SIMPLE](dataset/matrices/nt/simple.txt)\n  - [SMART](dataset/matrices/nt/smart.txt)\n\n### Hardhat\n\n- Scripts\n  - deploy.js - deploys the contract to the configured network.\n  - insert.js - reads, parses and inserts matrices or alphabets.\n  - delete.js - deletes matrices or alphabets.\n- Tasks for contract interaction (see [6. Interaction](#6-interaction)).\n\n### AWK\n\n- Text parsers that convert matrices and alphabets into Solidity code.\n\n## Getting Started\n\n### TL;DR\n\n\u003e [`0. Clone`](#0-clone) \u003ci\u003e--use the --recursive flag.\u003c/i\u003e\n\u003e\n\u003e ```\n\u003e git clone --recursive https://github.com/zenodeapp/substitution-matrices.git \u003cdestination_folder\u003e\n\u003e ```\n\u003e\n\u003e [`1. Installation`](#1-installation) \u003ci\u003e--use npm, yarn or any other package manager.\u003c/i\u003e\n\u003e\n\u003e ```\n\u003e npm install\n\u003e ```\n\u003e\n\u003e ```\n\u003e yarn install\n\u003e ```\n\u003e\n\u003e [`2. Run the test node`](#2-configure-and-run-your-test-node) \u003ci\u003e--do this in a separate terminal!\u003c/i\u003e\n\u003e\n\u003e ```script\n\u003e npx hardhat node\n\u003e ```\n\u003e\n\u003e [`3. Deployment`](#3-deployment)\n\u003e\n\u003e ```\n\u003e npx hardhat run scripts/deploy.js\n\u003e ```\n\u003e\n\u003e [`4. 
Configuration`](#4-configuration) \u003ci\u003e--add the contract address to [zenode.config.js](/zenode.config.js)\u003c/i\u003e.\n\u003e\n\u003e ```javascript\n\u003e ...\n\u003e contracts: {\n\u003e   substitutionMatrices: {\n\u003e     name: \"SubstitutionMatrices\",\n\u003e     address: \"ADD_YOUR_CONTRACT_ADDRESS_HERE\",\n\u003e   },\n\u003e },\n\u003e ...\n\u003e ```\n\u003e\n\u003e [`5. Population`](#5-population)\n\u003e\n\u003e ```\n\u003e npx hardhat run scripts/alphabets/insert.js\n\u003e ```\n\u003e\n\u003e ```\n\u003e npx hardhat run scripts/matrices/insert.js\n\u003e ```\n\u003e\n\u003e [`6. Interaction`](#6-interaction) \u003ci\u003e--use the scripts provided in the [Interaction](#6-interaction) phase.\u003c/i\u003e\n\n### 0. Clone\n\nTo get started, clone the repository with the `--recursive` flag:\n\n```\ngit clone --recursive https://github.com/zenodeapp/substitution-matrices.git \u003cdestination_folder\u003e\n```\n\n\u003e This repository includes submodules and should thus contain the `--recursive` flag.\n\n\u003cbr\u003e\n\nIf you've already downloaded, forked or cloned this repository without including the `--recursive` flag, then run this command from the root folder:\n\n```\ngit submodule update --init --recursive\n```\n\n\u003e Read more on how to work with `submodules` in the [zenode-contracts](https://github.com/zenodeapp/zenode-contracts) repository.\n\n### 1. Installation\n\nInstall all dependencies using a package manager of your choosing:\n\n```\nnpm install\n```\n\n```\nyarn install\n```\n\n### 2. Configure and run your (test) node\n\nAfter having installed all dependencies, use:\n\n```script\nnpx hardhat node\n```\n\n\u003e Make sure to do this in a separate terminal!\n\n\u003cbr\u003e\n\nThis will create a test environment where we can deploy our contract(s) to. By default, this repository is configured to Hardhat's local test node, but can be changed in the [hardhat.config.js](/hardhat.config.js) file. 
For more information on how to do this, see [Hardhat's documentation](https://hardhat.org/hardhat-runner/docs/config).\n\n### 3. Deployment\n\nNow that our node is up-and-running, we can deploy our contract using:\n\n```\nnpx hardhat run scripts/deploy.js\n```\n\n\u003e You should see a message appear in your terminal, stating that the contract was deployed successfully.\n\n### 4. Configuration\n\nOur CRUD is deployed, but doesn't contain any data whatsoever. Before we go ahead and populate it with alphabets and matrices, we'll have to make a couple of changes to the [zenode.config.js](zenode.config.js) file.\n\n#### 4.1 Link contract address (required)\n\nWe add the address of our contract to the `contracts` object. That way it knows which deployed contract it should interact with.\n\n```javascript\n...\ncontracts: {\n  substitutionMatrices: {\n    name: \"SubstitutionMatrices\",\n    address: \"ADD_YOUR_CONTRACT_ADDRESS_HERE\",\n  },\n},\n...\n```\n\n\u003e The contract address can be found in your terminal after deployment.\n\n#### 4.2 Editing insertions/deletions (Optional)\n\nBy default, all known alphabets and matrices will be inserted upon running the `insert.js` scripts (in the [Population](#5-population) phase).\n\nIf you would like to change this behavior, edit the following key-value pairs:\n\n```javascript\n{\n  // You could also pass in a string instead of an array\n  alphabetsToInsert: [\"ALPHABET_ID_1\", \"ALPHABET_ID_2\", ...],\n  matricesToInsert: [\"MATRIX_ID_1\", \"MATRIX_ID_2\", ...],\n}\n```\n\nand for the `delete.js` scripts:\n\n```javascript\n{\n  alphabetsToDelete: [\"ALPHABET_ID_1\", \"ALPHABET_ID_2\", ...],\n  matricesToDelete: [\"MATRIX_ID_1\", \"MATRIX_ID_2\", ...],\n}\n```\n\n\u003e NOTE: `ID`s are only valid if they are present in the `alphabets` or `matrices` objects (see [4.3](#43-adding-new-alphabetsmatrices-optional)).\n\n#### 4.3 Adding new alphabets/matrices (Optional)\n\nThere are two steps to consider when adding new alphabets 
or matrices, namely:\n\n1. The creation of the actual file that represents our new dataset, and\n2. Creating a reference to this dataset in [zenode.config.js](/zenode.config.js).\n\nFor step one it's important to know what data our text parser expects. For this it might be best to look at the files we've already included in the [dataset](/dataset) folder. I also suggest to read more about the formatting of `Alphabets and Matrices` in the [Appendix](#a-alphabets-and-matrices).\n\nFor the second step we add our new dataset to one of the following objects:\n\n\u003cb\u003e`alphabets`\u003c/b\u003e\n\n```javascript\nalphabets: {\n  ALPHABET_ID_1: \"ALPHABET_ID_1_RELATIVE_PATH\",\n  ALPHABET_ID_2: \"ALPHABET_ID_2_RELATIVE_PATH\",\n  ...\n},\n```\n\nor \u003cb\u003e`matrices`\u003c/b\u003e\n\n```javascript\nmatrices: {\n  MATRIX_ID_1: {\n    alphabet: \"ALPHABET_ID_2\",\n    file: \"MATRIX_ID_1_RELATIVE_PATH\",\n  },\n  MATRIX_ID_2: {\n    alphabet: \"ALPHABET_ID_1\",\n    file: \"MATRIX_ID_2_RELATIVE_PATH\",\n  },\n  ...\n},\n```\n\n##### 4.3.1 Remarks\n\n- The `alphabets`-object only requires an `ID` and `RELATIVE_PATH`.\n- The `matrices`-object on the other hand also requires you to add an `ALPHABET_ID`.\n- The `IDs` can be used in `alphabetsToInsert`, `alphabetsToDelete`, `matricesToInsert` and `matricesToDelete` (see [4.2](#42-editing-insertionsdeletions-optional)).\n\n##### 4.3.2 Examples\n\n- `alphabet amino_acids` (protein sequence characters):\n\n  ```javascript\n  alphabets: {\n    amino_acids: \"dataset/alphabets/aa.txt\",\n  }\n  ```\n\n- `matrix blosum100` using `alphabet amino_acids`:\n  ```javascript\n  matrices: {\n    blosum100: {\n      alphabet: \"amino_acids\",\n      file: \"dataset/matrices/blosum100.txt\",\n    },\n  }\n  ```\n  \u003cbr\u003e\n\n\u003e IMPORTANT: adding a new alphabet or matrix doesn't mean it gets inserted into the contract in the [Population](#5-population) phase. 
For this it has to be included in the `alphabetsToInsert` or `matricesToInsert` key-value pair! (see [4.2](#42-editing-insertionsdeletions-optional))\n\n### 5. Population\n\nNow that we've deployed our contract and configured our setup, we can start populating our CRUD with alphabets and matrices!\n\n#### 5.1 Insertion\n\nTo insert all the alphabets/matrices you've configured in the key-value pair `alphabetsToInsert`/`matricesToInsert` use:\n\n```\nnpx hardhat run scripts/alphabets/insert.js\n```\n\n```\nnpx hardhat run scripts/matrices/insert.js\n```\n\n\u003e NOTE: you cannot insert a matrix before having inserted the alphabet it belongs to!\n\n#### 5.2 Deletion\n\nTo delete all the alphabets/matrices you've configured in the key-value pair `alphabetsToDelete`/`matricesToDelete` use:\n\n```\nnpx hardhat run scripts/alphabets/delete.js\n```\n\n```\nnpx hardhat run scripts/matrices/delete.js\n```\n\n### 6. Interaction\n\nDeployed, populated and ready to explore!\n\n\u003cbr\u003e\n\nHere are a few Hardhat tasks (written in [hardhat.config.js](/hardhat.config.js)) to test our contract with:\n\n\u003cul\u003e\n\u003cli\u003e\n\n\u003cb\u003egetScore\u003c/b\u003e\n\nGet the alignment score of two characters based on the given substitution matrix.\n\n- `input:` `--matrix string` `--a char` `--b char`\n\n- `output:` `int`\n\n```\nnpx hardhat getScore --matrix \"MATRIX_ID\" --a \"SINGLE_CHAR_A\" --b \"SINGLE_CHAR_B\"\n```\n\n\u003c/li\u003e\n\n\u003cli\u003e\n\n\u003cb\u003egetAlphabet\u003c/b\u003e\n\nReturns an alphabet-object based on the given ALPHABET_ID.\n\n- `input:` `--id string`\n\n- `output:` `struct Alphabet` \u003ci\u003e--see [libraries/Structs.sol](/libraries/Structs.sol)\u003c/i\u003e\n\n```\nnpx hardhat getAlphabet --id \"ALPHABET_ID\"\n```\n\n\u003c/li\u003e\n\u003cli\u003e\n\n\u003cb\u003egetMatrix\u003c/b\u003e\n\nReturns a matrix-object based on the given MATRIX_ID.\n\n- `input:` `--id string`\n\n- `output:` `struct Matrix` \u003ci\u003e--see 
[libraries/Structs.sol](/libraries/Structs.sol)\u003c/i\u003e\n\n```\nnpx hardhat getMatrix --id \"MATRIX_ID\"\n```\n\n\u003c/li\u003e\n\u003cli\u003e\n\n\u003cb\u003egetAlphabets\u003c/b\u003e\n\nReturns the list of inserted ALPHABET_IDs.\n\n- `input:` `null`\n\n- `output:` `string[]`\n\n```\nnpx hardhat getAlphabets\n```\n\n\u003c/li\u003e\n\u003cli\u003e\n\n\u003cb\u003egetMatrices\u003c/b\u003e\n\nReturns the list of inserted MATRIX_IDs.\n\n- `input:` `null`\n\n- `output:` `string[]`\n\n```\nnpx hardhat getMatrices\n```\n\n\u003c/li\u003e\n\u003c/ul\u003e\n\n## Appendix\n\n### A. [Alphabets and Matrices](/dataset)\n\n`Alphabets` and `Matrices` are the two main components of the `SubstitutionMatrices` contract. Alphabets include but are not limited to nucleotide and protein sequence characters (e.g. C, T, A and G), while matrices are 2-dimensional scoring grids (e.g. BLOSUM62, PAM40, PAM120, etc.). To get a better (visual) understanding, you should check out the alphabets and matrices included in the [dataset](/dataset) folder.\n\nThese components are simple .txt files that abide by the following formatting rules:\n\n- An `alphabet` is a single line of characters, where \u003cb\u003ethe position of a character represents its numeric value\u003c/b\u003e.\n- A `matrix` is a 2-dimensional grid, where the \u003ci\u003efirst row\u003c/i\u003e and \u003ci\u003efirst column\u003c/i\u003e consist of \u003ci\u003eonly-alphabetical\u003c/i\u003e characters.\n- The remaining positions of a `matrix` are integers (zero, negative or positive).\n- \u003cb\u003eThe order of the \u003ci\u003ealphabetical\u003c/i\u003e characters inside a `matrix` should be the same as the `alphabet` it belongs to (horizontally and vertically)\u003c/b\u003e.\n- Every \u003ci\u003ealphanumerical\u003c/i\u003e character, for both `alphabet` and `matrix`, is delimited by whitespaces.\n\n### B. 
[zenode.config.js](/zenode.config.js)\n\nThis is where most of the \u003ci\u003epersonalization\u003c/i\u003e for contract deployment and population takes place.\n\nIn the case of the `substitution-matrices` repository this includes:\n\n- Choosing which alphabets/matrices get inserted or deleted in the [Population](#5-population) phase.\n- Configuring which contract we'll interact with in the [Interaction](#6-interaction) phase.\n- Expanding (or shrinking for that matter) the list of known alphabets and matrices.\n\n## Credits\n\n- Hardhat's infrastructure! (https://hardhat.org/)\n\n\u003cbr\u003e\n\n\u003cp align=\"right\"\u003e— ZEN\u003c/p\u003e\n\u003cp align=\"right\"\u003eCopyright (c) 2022 ZENODE\u003c/p\u003e\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fzenodeapp%2Fsubstitution-matrices","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fzenodeapp%2Fsubstitution-matrices","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fzenodeapp%2Fsubstitution-matrices/lists"}