{"id":26572631,"url":"https://github.com/ablab/viralVerify","last_synced_at":"2025-03-23T00:35:22.823Z","repository":{"id":51050968,"uuid":"212393443","full_name":"ablab/viralVerify","owner":"ablab","description":"viralVerify: viral contig verification tool","archived":false,"fork":false,"pushed_at":"2021-08-22T20:33:02.000Z","size":669,"stargazers_count":60,"open_issues_count":4,"forks_count":11,"subscribers_count":14,"default_branch":"master","last_synced_at":"2024-05-22T00:12:17.209Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ablab.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2019-10-02T16:50:55.000Z","updated_at":"2024-05-07T01:17:23.000Z","dependencies_parsed_at":"2022-09-10T21:23:10.207Z","dependency_job_id":null,"html_url":"https://github.com/ablab/viralVerify","commit_stats":null,"previous_names":[],"tags_count":1,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ablab%2FviralVerify","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ablab%2FviralVerify/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ablab%2FviralVerify/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ablab%2FviralVerify/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/ablab","download_url":"https://codeload.github.com/ablab/viralVerify/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":245040214,"ow
ners_count":20551297,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-03-23T00:34:35.404Z","updated_at":"2025-03-23T00:35:22.809Z","avatar_url":"https://github.com/ablab.png","language":"Python","funding_links":[],"categories":["Genome Analysis"],"sub_categories":["Genome Completeness"],"readme":"# viralVerify: viral contig verification tool\n\n**Version: 1.1**\n\nviralVerify classifies contigs (output of metaviralSPAdes or other assemblers) as viral, non-viral or uncertain, \nbased on gene content. For non-viral contigs, it can also optionally provide plasmid/non-plasmid classification.\n\nviralVerify predicts genes in the contig using Prodigal in the metagenomic mode, runs hmmsearch on the predicted proteins \nand classifies the contig as viral or non-viral by applying a Naive Bayes classifier (NBC). \nFor the set of predicted HMMs, viralVerify uses the trained NBC to classify this set as viral or chromosomal. \n\nTo improve results in the case of metagenomes with possible host contamination, we recommend that users filter out reads that align to the host genome prior to assembly.\nSince viralVerify is based on gene classification, it can be used on contigs of any length, and short viruses can be detected as long as they contain a recognizable virus-specific gene. 
To help analyze the rapidly growing amount of novel data, we have added a script that allows users to construct their own training database from a set of viral, chromosomal and plasmid contigs, as well as a custom HMM database.\n\n### Requirements\n\nviralVerify is a Python script, thus, installation is not required. However, it has the following dependencies:\n\n* Python 3.6+,\n* Prodigal (https://github.com/hyattpd/Prodigal, available via conda),\n* hmmsearch (from the hmmer package, http://hmmer.org/download.html),\n* provided *decompressed* database of virus/chromosome-specific HMMs (https://figshare.com/s/f897d463b31a35ad7bf0)\n\n or \n \n* recent release of the Pfam-A database (ftp://ftp.ebi.ac.uk/pub/databases/Pfam/releases/).\n\nTo work properly, viralVerify requires Prodigal and hmmsearch in your PATH environment variable.\n\n\n### Optional BLAST verification\n\nYou can verify your output with BLAST to check whether you have found novel viruses or plasmids. In this case, you need to have blastn in your $PATH, Biopython installed, and provide a path to the nucleotide database (e.g. a local copy of the NCBI nt database). 
For each contig, we report information (e-value, query coverage, identity and subject title) about its best BLAST hit in the provided database.\n\n\n### Usage \n\n    viralverify \n            -f Input fasta file\n            -o output_directory \n            --hmm HMM  Path to HMM database\n\n            Optional arguments:\n            -h, --help  Show the help message and exit\n            --db DB     Run BLAST on input contigs against provided db\n            -t          Number of threads\n            -thr THR    Sensitivity threshold (minimal absolute score to classify sequence, default = 7)\n            -p          Output predicted plasmidic contigs separately\n\n\nOutput file: comma-separated table *\u003cinput_file\u003e_result_table.csv*\n\nOutput format: contig name, prediction result, log-likelihood ratio, list of predicted HMMs\n  \nFasta files with prediction results can be found in the *Prediction_results_fasta* folder\n  \nTo decrease the number of false positives (at the expense of potential false negatives), you may increase the detection threshold, provided as an optional argument.\n\n### Retraining classifier\n\nYou can retrain the classifier with your custom data using the provided *training_script*. It takes viral, chromosomal and (optionally) plasmid training sequences in fasta format and a set of HMMs, predicts genes and HMM hits, and returns the frequency table. To use the retrained classifier, replace the \"classifier_table.txt\" file in the viralVerify directory with the obtained table.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fablab%2FviralVerify","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fablab%2FviralVerify","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fablab%2FviralVerify/lists"}