{"id":16714279,"url":"https://github.com/kinggerm/arachis","last_synced_at":"2025-04-10T06:12:35.481Z","repository":{"id":128922026,"uuid":"87224073","full_name":"Kinggerm/Arachis","owner":"Kinggerm","description":null,"archived":false,"fork":false,"pushed_at":"2020-08-23T05:58:49.000Z","size":105,"stargazers_count":7,"open_issues_count":1,"forks_count":3,"subscribers_count":2,"default_branch":"master","last_synced_at":"2025-03-24T07:22:16.379Z","etag":null,"topics":["circular-permutation","genome-rearrangment","grimm","permutations","plastome","signed-permutation"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Kinggerm.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2017-04-04T18:56:27.000Z","updated_at":"2024-11-25T10:52:33.000Z","dependencies_parsed_at":"2023-05-31T21:00:44.276Z","dependency_job_id":null,"html_url":"https://github.com/Kinggerm/Arachis","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Kinggerm%2FArachis","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Kinggerm%2FArachis/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Kinggerm%2FArachis/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Kinggerm%2FArachis/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Kinggerm","download_url":"https://codeload.github.com/Kin
ggerm/Arachis/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248166925,"owners_count":21058481,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["circular-permutation","genome-rearrangment","grimm","permutations","plastome","signed-permutation"],"created_at":"2024-10-12T21:04:14.622Z","updated_at":"2025-04-10T06:12:35.469Z","avatar_url":"https://github.com/Kinggerm.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Arachis\n\n## Introduction\n\nArachis is a Python library for analyzing genome rearrangements. It allows users to reconstruct ancestral genome \ngene orders and infer pairwise genome differences or events.\n\n\n## Algorithms \u0026 Features\n\nThe algorithm for reconstructing ancestral genome gene orders implemented in the script file `run_pypmag.py`\nis derived from the ancestral gene order reconstruction module of \u003ca href=\"#PMAG\"\u003e`PMAG+`\u003c/a\u003e, with the following modifications:\n\n1. Circular and gap-containing genomes are allowed as inputs. See the modifications to the GRIMM format below.\n2. Equipped with Python multiprocessing.\n3. More flexible in input data format (both tree and GRIMM).\n\nThis library defines a new version of [the classic GRIMM format](http://grimm.ucsd.edu/GRIMM/grimm_instr.html) \nwith the following modifications:\n1. Blocks can be named with characters in \"`-.0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ_abcdefghijklmnopqrstuvwxyz|~[]`\". \n\"`-`\" still means a reverse direction when it appears as the first character.\n2. 
The \"`*`\" block in a sequence means a gap. `1 2 3 * 4 5 * 6 $` is equivalent to `1 2 3 * -5 -4 * 6 $`.\n3. A sequence line without the \"`$`\" block at the end stands for a circular chromosome. A circular sequence\n`a b c d e` is equivalent to `b c d e a`, and also equivalent to `-e -d -c -b -a`. In this case, however, only \none chromosome per sample is allowed, and multiple lines without \"$\" are regarded as one single chromosome \nwritten interleaved. This design is due to a limitation of the TSP solver.\n\nThe functions in Arachis for inferring pairwise genome differences or events are still at an \u003cb\u003einfant\u003c/b\u003e stage. \nIf you find a bug or something to improve, please contact \n[jinjianjun@mail.kib.ac.cn](mailto:jinjianjun@mail.kib.ac.cn). New contributors are welcome! \nAlso, bear in mind that you should not test data with too many breakpoints (e.g. 10+).\nAt this stage, the function `Chromosome.inversion_event_from` uses an \u003cb\u003eexhaustive\u003c/b\u003e scheme to search \nfor one best solution. Currently, I am using it to play with highly rearranged plastome data of legumes.\nIt is also worth trying on other small permutations, such as some plant mitochondrial data.\n\n## Installation\n\nDownload and install Arachis with:\n\n    $ git clone \"https://github.com/Kinggerm/Arachis\"\n    $ cd Arachis\n    $ python setup.py install\n\nTo further use `run_pypmag.py` to reconstruct ancestral genome gene order, you have to install the following dependencies:\n\n* \u003cb\u003eDendroPy\u003c/b\u003e The tree parser in Arachis. Get it [here](https://www.dendropy.org/#installing).\n* \u003cb\u003eRAxML\u003c/b\u003e The reconstruction engine in the algorithm of PMAG+. \nThe single-thread version is preferred. Get it [here](https://github.com/stamatak/standard-RAxML).\n* \u003cb\u003eConcorde\u003c/b\u003e The TSP (Traveling Salesman Problem) solver in the algorithm of PMAG+. 
\nGet it [here](http://www.math.uwaterloo.ca/tsp/concorde/downloads/downloads.htm).\n\n## Example\n\n* To check whether two circular permutations, `-e -d -c -b -a` and `a b c d e`, are equivalent:\n\n    ```py\n        # open python shell\n        \u003e\u003e\u003e from arachis.genomeClass import Chromosome\n        \u003e\u003e\u003e seq1 = Chromosome(\"-e -d -c -b -a\")\n        \u003e\u003e\u003e seq2 = Chromosome(\"a b c d e\")\n        \u003e\u003e\u003e seq1 == seq2\n    ```\n     \u003cpre\u003e    \u003cfont color=grey\u003eTrue\u003c/font\u003e\u003c/pre\u003e\n\n* If you want to see how many flip-flop configurations (isomers) can be induced by several groups of inverted repeats, \nor, in a similar case, to see how many reasonable paths there are in a complicated assembly graph with repeats that\ncannot be unfolded by a short seq-library, try this:\n \n     ```py\n        # open python shell\n        \u003e\u003e\u003e from arachis.genomeClass import Chromosome\n        \u003e\u003e\u003e Picea = Chromosome(\"1 2 12 14 13 2 3 4 10 8 15 14 11 4 5 6 7 8 9 -6\")\n        \u003e\u003e\u003e isomers, changes = Picea.get_isomers()\n        \u003e\u003e\u003e print(len(isomers))\n     ```\n     ```\n        14\n     ```\n     ```py\n        \u003e\u003e\u003e for isomer in isomers:\n        ...     print(isomer)\n     ```\n     ```\n        1 2 12 14 13 2 3 4 10 8 15 14 11 4 5 6 -9 -8 -7 -6\n        1 2 12 14 13 2 3 4 10 8 9 -6 -5 -4 -11 -14 -15 -8 -7 -6\n        1 2 12 14 11 4 5 6 -9 -8 -10 -4 -3 -2 -13 -14 -15 -8 -7 -6\n        1 2 12 14 13 2 3 4 5 6 -9 -8 -10 -4 -11 -14 -15 -8 -7 -6\n        1 2 3 4 10 8 9 -6 -5 -4 -11 -14 -12 -2 -13 -14 -15 -8 -7 -6\n        1 2 12 14 11 4 10 8 9 -6 -5 -4 -3 -2 -13 -14 -15 -8 -7 -6\n        1 2 12 14 11 4 5 6 7 8 15 14 13 2 3 4 10 8 9 -6\n        1 2 12 14 13 2 3 4 5 6 7 8 15 14 11 4 10 8 9 -6\n        1 2 3 4 5 6 -9 -8 -10 -4 -11 -14 -12 -2 -13 -14 -15 -8 -7 -6\n        1 2 3 4 10 8 15 14 13 2 12 14 11 4 5 6 -9 -8 
-7 -6\n        1 2 12 14 11 4 10 8 15 14 13 2 3 4 5 6 -9 -8 -7 -6\n        1 2 3 4 5 6 7 8 15 14 13 2 12 14 11 4 10 8 9 -6\n        1 2 3 4 10 8 15 14 13 2 12 14 11 4 5 6 7 8 9 -6\n        1 2 12 14 11 4 10 8 15 14 13 2 3 4 5 6 7 8 9 -6\n    ```\n\n* Run `run_pypmag.py` to reconstruct the ancestral genome gene order of the test data:\n\n        run_pypmag.py -d test/test_1_grimm.txt -t test/test_1_rooted.tre -o test/test_1_output --seed 12345\n    \n* To see the parsimonious events along the branch from `A1` to `sp2` in the test_1 results above:\n\n     ```py\n        # open python shell\n        \u003e\u003e\u003e from arachis.genomeClass import GenomeList\n        \u003e\u003e\u003e extant_samples = GenomeList(\"test/test_1_grimm.txt\")\n        \u003e\u003e\u003e sp2 = extant_samples[\"sp2\"].chromosomes()[0]\n        \u003e\u003e\u003e ancestors = GenomeList(\"test/test_1_output/OutputGeneOrder\")\n        \u003e\u003e\u003e A1 = ancestors[\"A1\"].chromosomes()[0]\n        \u003e\u003e\u003e events = sp2.event_from(A1)\n     ```\n     ```\n                Breakpoints: 2\n                    Round 1: inherited combinations: 1; inversion sites:  2; time: 0.0002s; memory: 0.01G\n                Inversions: 1 + 0(iso)\n                Total inversion time: 0.0006s\n     ```\n\n## Citation\n\nIf you use Arachis in your research, you can cite it as:\n* Jian-Jun Jin. 2018. Arachis: a Python library for analysing genome rearrangements. https://www.github.org/Kinggerm/ARACHIS\n\nIf you use `run_pypmag.py`, please cite the following papers:\n\u003cdiv id=\"PMAG\"\u003e\u003c/div\u003e\n\n* \u003cb\u003ePMAG+\u003c/b\u003e Hu, F., Zhou, J., Zhou, L., \u0026 Tang, J. 2014. Probabilistic Reconstruction of Ancestral Gene Orders with Insertions and Deletions. IEEE/ACM Transactions on Computational Biology and Bioinformatics, 11(4), 667–672. \u003chttp://doi.org/10.1109/TCBB.2014.2309602\u003e\n* \u003cb\u003eRAxML\u003c/b\u003e Stamatakis, A. 2014. 
RAxML version 8: a tool for phylogenetic analysis and post-analysis of large phylogenies. Bioinformatics, 30(9), 1312–1313. \u003chttp://doi.org/10.1093/bioinformatics/btu033\u003e\n* \u003cb\u003eConcorde\u003c/b\u003e D. Applegate, R. Bixby, V. Chvatal, \u0026 W. Cook. 2011. “Concorde TSP Solver,” http://www.math.uwaterloo.ca/tsp/concorde/\n* \u003cb\u003eDendroPy\u003c/b\u003e Sukumaran, J., \u0026 Holder, M. T. 2010. DendroPy: a Python library for phylogenetic computing. Bioinformatics, 26(12), 1569–1571. \u003chttp://doi.org/10.1093/bioinformatics/btq228\u003e\n\n## Acknowledgement\n\nI thank [Stephen Smith](https://github.com/blackrim), [Joseph Brown](https://github.com/josephwb), and [Caroline Parins-Fukuchi](https://github.com/carolinetomo) for discussions.\n\n## License\nGNU General Public License, version 3","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkinggerm%2Farachis","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fkinggerm%2Farachis","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkinggerm%2Farachis/lists"}