{"id":41178867,"url":"https://github.com/sap218/acidoseq","last_synced_at":"2026-01-22T20:02:48.196Z","repository":{"id":57407945,"uuid":"143319375","full_name":"sap218/acidoseq","owner":"sap218","description":"A Python package for studying Acidobacteria","archived":false,"fork":false,"pushed_at":"2018-10-24T11:28:54.000Z","size":2257,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"master","last_synced_at":"2025-12-21T22:59:48.398Z","etag":null,"topics":["bacteria","bioinformatics","nanopore","plotting","python","python3"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/sap218.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2018-08-02T16:25:17.000Z","updated_at":"2021-09-22T14:11:09.000Z","dependencies_parsed_at":"2022-09-26T16:30:57.803Z","dependency_job_id":null,"html_url":"https://github.com/sap218/acidoseq","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/sap218/acidoseq","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sap218%2Facidoseq","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sap218%2Facidoseq/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sap218%2Facidoseq/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sap218%2Facidoseq/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/sap218","download_url":"https://codeload.github.com/sap218/acidoseq/tar.gz/refs/heads/master","sbom_url":"https://repo
s.ecosyste.ms/api/v1/hosts/GitHub/repositories/sap218%2Facidoseq/sbom","scorecard":{"id":800225,"data":{"date":"2025-08-11","repo":{"name":"github.com/sap218/acidoseq","commit":"f6429d425a5436997ecb960b1faaf52756b67b35"},"scorecard":{"version":"v5.2.1-40-gf6ed084d","commit":"f6ed084d17c9236477efd66e5b258b9d4cc7b389"},"score":3,"checks":[{"name":"Code-Review","score":0,"reason":"Found 0/30 approved changesets -- score normalized to 0","details":null,"documentation":{"short":"Determines if the project requires human code review before pull requests (aka merge requests) are merged.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#code-review"}},{"name":"Dangerous-Workflow","score":-1,"reason":"no workflows found","details":null,"documentation":{"short":"Determines if the project's GitHub Action workflows avoid dangerous patterns.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#dangerous-workflow"}},{"name":"Token-Permissions","score":-1,"reason":"No tokens found","details":null,"documentation":{"short":"Determines if the project's workflows follow the principle of least privilege.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#token-permissions"}},{"name":"SAST","score":0,"reason":"no SAST tool detected","details":["Warn: no pull requests merged into dev branch"],"documentation":{"short":"Determines if the project uses static code analysis.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#sast"}},{"name":"Packaging","score":-1,"reason":"packaging workflow not detected","details":["Warn: no GitHub/GitLab publishing workflow detected."],"documentation":{"short":"Determines if the project is published as a package that others can easily download, install, easily update, and 
uninstall.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#packaging"}},{"name":"Maintained","score":0,"reason":"0 commit(s) and 0 issue activity found in the last 90 days -- score normalized to 0","details":null,"documentation":{"short":"Determines if the project is \"actively maintained\".","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#maintained"}},{"name":"Binary-Artifacts","score":10,"reason":"no binaries found in the repo","details":null,"documentation":{"short":"Determines if the project has generated executable (binary) artifacts in the source repository.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#binary-artifacts"}},{"name":"CII-Best-Practices","score":0,"reason":"no effort to earn an OpenSSF best practices badge detected","details":null,"documentation":{"short":"Determines if the project has an OpenSSF (formerly CII) Best Practices Badge.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#cii-best-practices"}},{"name":"Pinned-Dependencies","score":-1,"reason":"no dependencies found","details":null,"documentation":{"short":"Determines if the project has declared and pinned the dependencies of its build process.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#pinned-dependencies"}},{"name":"Security-Policy","score":0,"reason":"security policy file not detected","details":["Warn: no security policy file detected","Warn: no security file to analyze","Warn: no security file to analyze","Warn: no security file to analyze"],"documentation":{"short":"Determines if the project has published a security policy.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#security-policy"}},{"name":"Fuzzing","score":0,"reason":"project is not 
fuzzed","details":["Warn: no fuzzer integrations found"],"documentation":{"short":"Determines if the project uses fuzzing.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#fuzzing"}},{"name":"Vulnerabilities","score":10,"reason":"0 existing vulnerabilities detected","details":null,"documentation":{"short":"Determines if the project has open, known unfixed vulnerabilities.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#vulnerabilities"}},{"name":"License","score":10,"reason":"license file detected","details":["Info: project has a license file: LICENSE:0","Info: FSF or OSI recognized license: MIT License: LICENSE:0"],"documentation":{"short":"Determines if the project has defined a license.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#license"}},{"name":"Signed-Releases","score":-1,"reason":"no releases found","details":null,"documentation":{"short":"Determines if the project cryptographically signs release artifacts.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#signed-releases"}},{"name":"Branch-Protection","score":0,"reason":"branch protection not enabled on development/release branches","details":["Warn: branch protection not enabled for branch 'master'"],"documentation":{"short":"Determines if the default and release branches are protected with GitHub's branch protection 
settings.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#branch-protection"}}]},"last_synced_at":"2025-08-23T10:14:45.697Z","repository_id":57407945,"created_at":"2025-08-23T10:14:45.697Z","updated_at":"2025-08-23T10:14:45.697Z"},"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":28670329,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-01-22T19:36:09.361Z","status":"ssl_error","status_checked_at":"2026-01-22T19:36:05.567Z","response_time":144,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["bacteria","bioinformatics","nanopore","plotting","python","python3"],"created_at":"2026-01-22T20:02:48.103Z","updated_at":"2026-01-22T20:02:48.189Z","avatar_url":"https://github.com/sap218.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# acidoseq\r\n\r\nStudying Acidobacteria reads from a **Nanopore** metagenomic data-set | **Python v3.5** | [PyPI](https://pypi.org/project/acidoseq/) (see version)\r\n\r\nAuthor __Samantha C Pendleton__, Data Science MSc Aberystwyth University, [Twitter](https://twitter.com/sap218) | [GitHub](https://github.com/sap218)\r\n\r\nFollow the Twitter bot I created, [acido_bot](https://twitter.com/acido_bot), that dispenses daily facts about Acidobacteria!\r\n\r\nThe **GC** content of the Acidobacteria genomes are consistent with their placements, e.g. 
species in the same subdivision (above 60\\% for group V fragments and roughly 10\\% lower for group III fragments) are similar, displaying the diversity within the phylum [1].\r\nThe correlation between abundance and pH depends on the subdivision: 1, 2, 3, 12, 13 have a negative relationship as pH increases, whilst 4, 6, 7, 10, 11, 16, 17, 18, 22, 25 are sparse in low pH and have a positive relationship as pH increases [2].\r\n\r\nThis package studies a collection of reads and gathers the ones assigned as Acidobacteria from a Kaiju output. It produces various statistics and GC plots. Furthermore, the group of unclassified Acidobacteria reads is visualised by subdivision based on the pH level of the soil sample.\r\n\r\n## Introduction\r\n[**Kaiju**](http://kaiju.binf.ku.dk) output provides the taxon ID and the corresponding sequence; my package outputs the Acidobacteria species alongside annotation, plots, and information on the unclassified reads.\r\n\r\n###### Prerequisite\r\n* FASTA format of all the reads.\r\n* Kaiju output after extracting the two columns: sequence ID and NCBI taxIDs.\r\n\r\n###### Dependencies\r\n```\r\nimport os\r\nimport csv\r\nimport pysam\r\nimport collections\r\nimport matplotlib.pyplot as plt\r\nimport matplotlib.patches as mpatches\r\nimport random\r\nfrom termcolor import colored\r\nfrom colorama import init\r\nimport click\r\n```\r\n\r\n`$ pip3 install matplotlib`\r\n\r\n## Installation\r\n\r\n**GitClone**\r\n\r\n`$ git clone https://github.com/sap218/acidoseq.git`\r\n\r\n**pip**\r\n\r\n`$ pip install acidoseq`\r\n\r\n**Kaiju**\r\n\r\nI used columns 2 and 3 of the Kaiju output, which contain the sequence references and the NCBI taxon IDs.\r\n\r\n1. Filter the output to keep only classified labels\t`$ awk '$1 == \"C\"' kaiju.out \u003e kaijuC.out`\r\n2. 
Cut the columns\t\t\t\t\t`$ cut -f2,3 kaijuC.out \u003e results.txt`\r\n3. Convert the txt to csv (comma-delimited)\t\t`$ sed 's/\\s\\+/,/g' results.txt \u003e result_seqid_taxon.csv`\r\n\r\n## Map\r\nIf you are unsure of the pH of your soil samples, you may want to use the map script first - default city is Aberystwyth.\r\n\r\nPlease **note**: because the Earth is spherical and maps are 2-dimensional, there will be some distortion when plotting locations.\r\n\r\n`$ acidomap --city Birmingham`\r\n\r\n## Usage\r\nThe CLI **needs** the Kaiju and FASTA files; all other options have defaults, e.g. pH = 5.\r\n\r\nIf no plot style is provided, or it is entered incorrectly, a random one is chosen.\r\n\r\nRun as follows on **Linux** (find how to [run with other operating systems here](https://en.wikibooks.org/wiki/Python_Programming/Creating_Python_Programs)):\r\n\r\n```\r\n$ acidoseq --help\r\nUsage: acidoseq [OPTIONS]\r\n\r\nOptions:\r\n  --taxdumptype TEXT  Study \"ALL\" or only unclassified \"U\"?\r\n  --kaijufile TEXT    Place edited Kaiju (csv) in directory for ease.\r\n  --fastapath TEXT    Place FASTA in directory for ease.\r\n  --style TEXT        ['seaborn-bright', 'seaborn-poster', 'seaborn-white',\r\n                      'bmh', 'seaborn-darkgrid', 'seaborn-pastel',\r\n                      'grayscale', '_classic_test', 'ggplot', 'seaborn-\r\n                      whitegrid', 'seaborn-dark', 'seaborn-muted', 'seaborn-\r\n                      colorblind', 'seaborn-ticks', 'Solarize_Light2',\r\n                      'seaborn-notebook', 'dark_background', 'fast',\r\n                      'seaborn', 'fivethirtyeight', 'seaborn-paper', 'seaborn-\r\n                      dark-palette', 'seaborn-talk', 'classic', 'seaborn-\r\n                      deep']\r\n  --plottype TEXT     \"span\" range of GC means OR \"line\" average mean GC\r\n  --ph TEXT           pH of soil, use map script for assistance.\r\n  --help              Show this message and 
exit.\r\n```\r\n\r\n###### Examples\r\n\r\n`$ acidoseq --kaijufile result_seqid_taxon.csv --fastapath all.fa`\r\n\r\n`$ acidoseq --taxdumptype ALL --kaijufile result_seqid_taxon.csv --fastapath all.fa --style ggplot --plottype span --ph 4.92`\r\n\r\n`$ acidoseq --taxdumptype U --kaijufile result_seqid_taxon.csv --fastapath all.fa --style seaborn --plottype line --ph 7.14`\r\n\r\n**Output**\r\n* FASTA file: a collection of reads which were identified as Acidobacteria\r\n* Plot of AT and GC ratio comparison with means\r\n* In-depth plot of GC ratio with subdivisions labelled (regions with 'span' and means with 'line')\r\n* Separate FASTA files of the unclassified reads assigned into subdivisions based on the pH, e.g. a file of sequences which reside in the subdivision 1 GC span if the pH is low\r\n\r\n## Acknowledgements\r\n* **Amanda Clare**, senior lecturer, MSc supervisor at Aberystwyth University, [Twitter](https://twitter.com/afcaber) | [GitHub](https://github.com/amandaclare) | [Staff Profile](https://www.aber.ac.uk/en/cs/staff-profiles/listing/profile/afc/)\r\n* **Sam Nicholls**, postdoc at University of Birmingham, [Twitter](https://twitter.com/samstudio8) | [GitHub](https://github.com/SamStudio8)\r\n* **Arwyn Edwards**, senior lecturer at Aberystwyth University, provided the data-set, [Twitter](https://twitter.com/arwynedwards) | [Staff Profile](https://www.aber.ac.uk/en/ibers/staff-profiles/listing/profile/aye/)\r\n\r\n## Thank you! :seedling:\r\n\r\nDon't hesitate to create an issue or make a suggestion!\r\n\r\n###### Todo List\r\n- [x] Make available\r\n- [x] Improve descriptions and comments\r\n- [x] Look into command line interface\r\n- [x] Fix code to output unclassified subdivisions based on pH\r\n- [ ] Alter code so the input file can be the original Kaiju output\r\n- [ ] Make available on Conda\r\n\r\n###### References\r\n[1] Quaiser, A., Ochsenreiter, T., Lanz, C., Schuster, S. C., Treusch, A. H., Eck, J., \u0026 Schleper, C. (2003). 
Acidobacteria form a coherent but highly diverse group within the bacterial domain: evidence from environmental genomics. Molecular microbiology, 50(2), 563-575.\r\n\r\n[2] Eichorst, S. A., Breznak, J. A., \u0026 Schmidt, T. M. (2007). Isolation and characterization of soil bacteria that define Terriglobus gen. nov., in the phylum Acidobacteria. Applied and environmental microbiology, 73(8), 2708-2717.\r\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsap218%2Facidoseq","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsap218%2Facidoseq","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsap218%2Facidoseq/lists"}