{"id":23505669,"url":"https://github.com/gamcil/synthaser","last_synced_at":"2025-04-16T00:59:48.577Z","repository":{"id":53029651,"uuid":"198734817","full_name":"gamcil/synthaser","owner":"gamcil","description":"Parse NCBI CD-search results to find and visualise the domain architecture of secondary metabolite synthases","archived":false,"fork":false,"pushed_at":"2022-11-09T04:09:28.000Z","size":6716,"stargazers_count":22,"open_issues_count":2,"forks_count":4,"subscribers_count":1,"default_branch":"master","last_synced_at":"2025-04-16T00:59:43.377Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/gamcil.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2019-07-25T01:34:51.000Z","updated_at":"2025-04-10T13:08:43.000Z","dependencies_parsed_at":"2023-01-22T04:04:04.737Z","dependency_job_id":null,"html_url":"https://github.com/gamcil/synthaser","commit_stats":null,"previous_names":[],"tags_count":10,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gamcil%2Fsynthaser","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gamcil%2Fsynthaser/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gamcil%2Fsynthaser/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gamcil%2Fsynthaser/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/gamcil","download_url":"https://codeload.github.com/gamcil/synthaser/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com",
"kind":"github","repositories_count":249178209,"owners_count":21225349,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-12-25T09:38:52.777Z","updated_at":"2025-04-16T00:59:48.548Z","avatar_url":"https://github.com/gamcil.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# synthaser\n[![Coverage Status](https://coveralls.io/repos/github/gamcil/synthaser/badge.svg?branch=master)](https://coveralls.io/github/gamcil/synthaser?branch=master\u0026service=github)\n![Tests passing](https://github.com/gamcil/synthaser/actions/workflows/python-app.yml/badge.svg)\n[![Documentation Status](https://readthedocs.org/projects/synthaser/badge/?version=latest)](https://synthaser.readthedocs.io/en/latest/?badge=latest)\n[![PyPI version](https://badge.fury.io/py/synthaser.svg)](https://badge.fury.io/py/synthaser)\n\n## Process\n`synthaser` parses the results of a batch NCBI conserved domain search and determines\nthe domain architecture of secondary metabolite synthases.\n\n## Installation\nInstall from PyPI using pip:\n\n```sh\n$ pip install --user synthaser\n```\n\nor clone the repo and install locally:\n\n```sh\n$ git clone https://www.github.com/gamcil/synthaser\n$ cd synthaser\n$ pip install .\n```\n\nFinally, configure synthaser with your e-mail address or NCBI API key (used when making requests to NCBI servers), for example:\n\n```sh\n$ synthaser config --email your@email.com\n```\n\n## Dependencies\n`synthaser` is written in pure Python (3.6+), and requires only the following dependencies for\nremote searches:\n- `requests`, 
for interaction with NCBI's CD-Search API\n- `biopython`, for retrieving sequences from NCBI Entrez\n\nIf you want to do local searches, you'll need:\n- `RPS-BLAST`, for performing local domain searches\n- `rpsbproc`, for formatting RPS-BLAST results like CD-Search\n\nThese can be obtained from the [NCBI FTP](ftp://ftp.ncbi.nih.gov/pub/mmdb/cdd/rpsbproc/).\n\n## Usage\nA full `synthaser` search can be performed as simply as:\n\n```sh\n$ synthaser search -qf sequences.fasta\n```\n\nWhere `sequences.fasta` is a FASTA format file containing the protein sequences\nthat you would like to search.\n\nFor a full listing of available arguments, enter:\n\n```sh\n$ synthaser -h\n```\n\n### Visualising your results\n`synthaser` is capable of generating fully-interactive, annotated visualisations\nso you can easily explore your results. All that is required is one\nextra argument:\n\n```sh\n$ synthaser search -qf sequences.fasta -p\n```\n\nThis will generate a figure like so:\n\n\u003cimg src=\"./img/anid_pks.png\"\n\twidth=\"400\"\n\talt=\"Example synthaser output\"\u003e\n\n[Click here](https://synthaser.readthedocs.io/en/latest/_static/anid.html) to play around with the full version of this example.\n\n### Saving your search session\n`synthaser` allows you to save your search results such that they can be easily\nreloaded for further visualisation or exploration without having to fully re-do\nthe search.\n\nTo do this, use the `--json_file` option:\n\n```sh\n$ synthaser search -qf sequences.fasta --json_file sequences.json\n```\n\nThis will save all of your results, in JSON format, to the file\n`sequences.json`. 
Then, loading this session back into `synthaser` is as easy\nas:\n\n```sh\n$ synthaser search --json_file sequences.json ...\n```\n\n### Using your own rules\nThough `synthaser` was originally designed to analyse secondary metabolite synthases,\nit can easily be repurposed to analyse the domain architectures of any type of protein sequence.\n\nUnder the hood, `synthaser` uses a central rule file which contains:\n1. Domain types, containing specific families to save in CD-Search results, corresponding to domain 'islands';\n2. Rules for classifying the sequences based on domain architecture predictions; and\n3. A hierarchy which determines the order of evaluation for the rules.\n\nWe distribute our fungal megasynthase rule file as the default, but providing your own rule file\nis as simple as:\n\n```sh\n$ synthaser search -qf sequences.fasta --rule_file my_rules.json\n```\n\nWe also provide a web application for assembling your own rule files, which can be\n[found here](https://gamcil.github.io/synthaser/).\n\nFor a detailed explanation of how the rule file works, as well as API documentation,\nplease refer to the [documentation](https://synthaser.readthedocs.io/en/latest/).\n\n## Citations\nIf you found `synthaser` helpful, please cite:\n\n```text\nGilchrist, C. L., \u0026 Chooi, Y. H. (2021).\nSynthaser: a CD-Search enabled Python toolkit for analysing domain architecture of fungal secondary metabolite megasynth(et)ases.\nFungal Biology and Biotechnology, 8(1), 1-19.\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgamcil%2Fsynthaser","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fgamcil%2Fsynthaser","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgamcil%2Fsynthaser/lists"}