{"id":28176305,"url":"https://github.com/vmikk/eutax","last_synced_at":"2025-07-20T00:34:07.492Z","repository":{"id":287907923,"uuid":"966151069","full_name":"vmikk/eutax","owner":"vmikk","description":"Backend for taxonomic annotation UI","archived":false,"fork":false,"pushed_at":"2025-05-23T17:42:40.000Z","size":245,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":2,"default_branch":"main","last_synced_at":"2025-05-23T18:43:51.999Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/vmikk.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-04-14T13:33:51.000Z","updated_at":"2025-05-23T17:42:43.000Z","dependencies_parsed_at":"2025-04-29T10:32:16.704Z","dependency_job_id":"e7490b33-d695-4af9-a5ba-684d7dfb4ea8","html_url":"https://github.com/vmikk/eutax","commit_stats":null,"previous_names":["vmikk/eutax"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/vmikk/eutax","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/vmikk%2Feutax","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/vmikk%2Feutax/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/vmikk%2Feutax/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/vmikk%2Feutax/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/vmikk","download_url":"https://codeload.github.com/vmikk/eutax/tar.gz
/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/vmikk%2Feutax/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":266048704,"owners_count":23868744,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-05-16T00:17:39.255Z","updated_at":"2025-07-20T00:34:07.481Z","avatar_url":"https://github.com/vmikk.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Taxonomic Annotation API\n\nA FastAPI-based server for taxonomic annotation of DNA sequences.\n\n## Features\n\n- Upload FASTA files containing DNA sequences\n- Run taxonomic annotation using BLAST or VSEARCH\n- Track job status and retrieve results\n- RESTful API with auto-generated documentation\n\n## Results Format\n\nThe `results.json` file contains taxonomic annotation results in a structured format.  \nThe exact content depends on the tool used (BLAST or VSEARCH), but follows this general structure:\n\n```json\n{\n  \"results\": [\n    { ... },    # matches for each query sequence\n    { ... },\n    ...\n  ],\n  \"summary\": {\n    \"total_queries\": 4,\n    \"total_hits\": 80\n  },\n  \"metadata\": {\n    \"tool\": \"blast\",\n    \"algorithm\": \"megablast\",\n    \"database\": {\n      \"identifier\": \"eukaryome_its\",\n      \"version\": \"1.9.4\"\n    },\n    \"job_id\": \"job123\"\n  }\n}\n```\n\nFor each query sequence, the results are stored in the `results` array. 
For example:\n\n```json\n{\n  \"results\": [\n    {\n      \"query_id\": \"query_name\",\n      \"query_length\": 553,\n      \"hit_count\": 2,\n      \"hits\": [\n        {\n          \"sseqid\": \"EUK1101818;Fungi;Basidiomycota;Tremellomycetes;Filobasidiales;Piskurozymaceae;Solicoccozyma;aeria\",\n          \"taxonomy\": {\n            \"accession\": \"EUK1101818\",\n            \"kingdom\": \"Fungi\",\n            \"phylum\": \"Basidiomycota\",\n            \"class\": \"Tremellomycetes\",\n            \"order\": \"Filobasidiales\",\n            \"family\": \"Piskurozymaceae\",\n            \"genus\": \"Solicoccozyma\",\n            \"species\": \"aeria\"\n          },\n          \"pident\": 100.0,\n          \"length\": 553,\n          \"mismatch\": 0,\n          \"gapopen\": 0,\n          \"qstart\": 1,\n          \"qend\": 553,\n          \"sstart\": 1235,\n          \"send\": 1787,\n          \"evalue\": 0.0,\n          \"bitscore\": 998.0,\n          \"qcovs\": 100.0,\n          \"sstrand\": \"plus\",\n          \"slen\": 4063,\n          \"alignment\": {\n            \"qseq\":    \"GTGGGATTAAA...\",  # truncated\n            \"midline\": \"|||  |||||...\",   # truncated\n            \"sseq\":    \"GTGAATTAAA...\"    # truncated\n          }\n        },\n        { ... },  # other hits\n      ]\n    },\n    { ... },      # other query sequences\n    { ... }\n  ],\n  \"summary\":  { ... },\n  \"metadata\": { ... 
}\n}\n```\n\nEach hit contains the following information:\n\n- **sseqid**: Identifier of the reference sequence\n- **taxonomy**: Taxonomic classification object with fields (accession, kingdom, phylum, class, order, family, genus, species)\n- **pident**: Sequence identity percentage\n- **length**: Length of the alignment in nucleotides\n- **mismatch**: Number of mismatched positions\n- **gapopen**: Number of gap openings in the alignment\n- **qcovs**: Percentage of the query covered by the alignment\n- **evalue**: E-value (not available for VSEARCH output)\n- **bitscore**: Bit score (for VSEARCH, the raw score is used)\n- **qstart** and **qend**: Start and end positions in the query\n- **sstart** and **send**: Start and end positions in the subject/target\n- **sstrand**: Alignment strand (e.g., \"plus\" or \"minus\")\n- **slen**: Length of the subject sequence\n- **alignment**: Object containing the aligned sequences, with fields:\n  - **qseq**: Aligned portion of the query sequence\n  - **midline**: Match line showing which aligned positions are identical\n  - **sseq**: Aligned portion of the subject sequence\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fvmikk%2Feutax","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fvmikk%2Feutax","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fvmikk%2Feutax/lists"}