{"id":19461674,"url":"https://github.com/plasmapower/intersections","last_synced_at":"2025-04-25T07:34:19.974Z","repository":{"id":57633489,"uuid":"92771084","full_name":"PlasmaPower/intersections","owner":"PlasmaPower","description":"Analyzes intersections between genes and sequences","archived":false,"fork":false,"pushed_at":"2017-09-14T18:30:36.000Z","size":31,"stargazers_count":1,"open_issues_count":0,"forks_count":1,"subscribers_count":2,"default_branch":"master","last_synced_at":"2024-10-12T23:20:41.179Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"Rust","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/PlasmaPower.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2017-05-29T19:57:24.000Z","updated_at":"2021-06-29T20:16:54.000Z","dependencies_parsed_at":"2022-09-01T17:02:15.304Z","dependency_job_id":null,"html_url":"https://github.com/PlasmaPower/intersections","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/PlasmaPower%2Fintersections","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/PlasmaPower%2Fintersections/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/PlasmaPower%2Fintersections/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/PlasmaPower%2Fintersections/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/PlasmaPower","download_url":"https://codeload.github.com/PlasmaPower/intersections/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"
https://github.com","kind":"github","repositories_count":223991344,"owners_count":17237476,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-10T17:43:19.538Z","updated_at":"2024-11-10T17:43:20.194Z","avatar_url":"https://github.com/PlasmaPower.png","language":"Rust","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Intersections\nThis program finds the overlap between sequences and genes using blastn output in format 6 (http://www.metagenomics.wiki/tools/blast/blastn-output-format-6)\n  \n  ```\n  qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore\n  Query_1\taccn|JISN01000002\t100.000\t28\t0\t0\t29\t56\t37930\t37957\t1.32e-08\t52.8\n  ```\n\nand gff3 output (from prokka)\n  \n  ```\n  ##gff-version 3\n  ##sequence-region accn_JISN01000001 1 334949\n  ...\n  accn_JISN01000001\tProdigal:2.6\tCDS\t240\t2849\t.\t+\t0\tID=NKHGEDLF_00001;Name=clpB;gene=clpB;inference=ab initio prediction:Prodigal:2.6,similar to AA sequence:UniProtKB:Q7A6G6;locus_tag=NKHGEDLF_00001;product=Chaperone protein ClpB\n  ...\n  \u003eaccn_JISN01000001\n  AATTAATTATCGACCAAGAAAGTGTTTAAATTGGAAGTTTCCTTATGAAGTTTTAT\n  ...\n  ```\n\nColumns 9 and 10 of the blastn output (sstart and send) are compared to columns 4 and 5 of the gff3 file (feature start and end, section type 2) for overlap. Any number of .bla files can be intersected with an equal number of MATCHING .gff files.\n\n## Prerequisites\nA folder of .bla files and .gff files MATCHED by NAME (e.g. genome1.bla, genome1.gff, genome2.bla, genome2.gff). 
.bla files are created in blastn format 6 by running BLAST with one or more query sequences against the respective genome. Gff3 files are created (for example) by prokka v1.12 (http://www.vicbioinformatics.com/software.prokka.shtml) for the respective genome.\n\n## Installing\nFirst install Rust (instructions from https://rustup.rs/)\n\n```\ncurl https://sh.rustup.rs -sSf | sh\n```\n\nThen install the crate for intersections\n\n```\ncargo +nightly install sequence-intersections\n```\n\nIntersections can then be found in ~/.cargo/bin/.\nIf a previous version of intersections already exists in that directory, use\n\n```\ncargo +nightly install -f sequence-intersections\n```\n\n## Output and Options\n\n| Column | Description |\n| --- | --- |\n| name | Name of the gene according to the gff file. Regions between two genes are denoted Between(GeneNameBefore, GeneNameAfter). Hypothetical proteins are denoted HypotheticalAfter(GeneName) or HypotheticalBefore(GeneName). |\n| product | Product of the gene according to the gff file. Same style as name. |\n| total_overlap | Amount of sequence that intersected this gene. If a sequence of length 31 in the BLAST input file completely overlaps this gene (i.e. the BLAST hit was in ID_1 and spanned 1000-1031, and the gene was in ID_1 and spanned 1000-1500), then the total_overlap for this gene increases by 31. |\n| genome_count | The number of genomes in which at least one sequence overlapped this gene with a total_overlap of at least 1. |\n| start_avg | The average start of this gene according to the gff files. | \n| start_stdev | The standard deviation of the start of this gene. |\n| end_avg | The average end of this gene according to the gff files. | \n| end_stdev | The standard deviation of the end of this gene. |\n| length_avg | The average span of each gene (# of nucleotides long). This depends only on the length of the gene, not on its start or end location. 
| \n\n## Example\nExample BLAST and gff intersections are available at: https://github.com/dUmich/intersections-example\n\n## Errors\nPrefix the command with the following environment variable to print warnings\n\n```\nRUST_LOG=warn \n```\n\n## Built with\n\n## Versioning \n\n## Authors\n* **Lee Bousfield** - *Free-lance code wizard* - [PlasmaPower](https://github.com/PlasmaPower)\n* **Daniel Harris** - *Researcher, Snitkin Lab, University of Michigan* - [dUmich](https://github.com/dUmich)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fplasmapower%2Fintersections","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fplasmapower%2Fintersections","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fplasmapower%2Fintersections/lists"}