Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/ptynecki/PHAGES2050
PHAGES2050 - a Comprehensive AI-based Framework for Phage Research & Therapy
https://github.com/ptynecki/PHAGES2050
Last synced: about 2 months ago
JSON representation
PHAGES2050 - a Comprehensive AI-based Framework for Phage Research & Therapy
- Host: GitHub
- URL: https://github.com/ptynecki/PHAGES2050
- Owner: ptynecki
- License: mit
- Created: 2020-08-02T17:29:37.000Z (over 4 years ago)
- Default Branch: master
- Last Pushed: 2023-05-23T00:11:46.000Z (over 1 year ago)
- Last Synced: 2024-11-05T23:43:23.262Z (2 months ago)
- Language: Python
- Size: 96.7 KB
- Stars: 36
- Watchers: 4
- Forks: 5
- Open Issues: 8
-
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- License: LICENSE
- Code of conduct: CODE_OF_CONDUCT.md
Awesome Lists containing this project
- awesome-phages - PHAGES2050 - a comprehensive AI-based Framework for Phage Research & Therapy (Miscellaneous / Gene prediction-annotation)
README
> "Keep calm, use AI for phages and stop AMR"
PHAGES2050 is a novel [Python 3.8+ programming language](https://python.org) framework to boost bacteriophage research & therapy and infrastructure in order to achieve the full potential to fight against antimicrobial resistant bacteria within **Natural Language Processing (NLP)** and **Deep Learning**.
Our project is about developing a AI-based framework for microbiologists and bioinformaticians who hunt, explore and classify phages. Applying the framework will shorten the duration of computational methods required to match phages with bacteria for specific patient cases. Having such organised framework at hand and freely-available will help develop personalized phage therapy and make it accessible to people worldwide.
Watch the [PHAVES #3](https://www.youtube.com/watch?v=gh_Q135t9ps) talk to learn more.
[![Travis CI](https://travis-ci.com/ptynecki/PHAGES2050.svg?branch=master)](https://travis-ci.com/github/ptynecki/PHAGES2050)
[![codecov](https://codecov.io/gh/ptynecki/PHAGES2050/branch/master/graph/badge.svg)](https://codecov.io/gh/ptynecki/PHAGES2050)
[![Documentation Status](https://readthedocs.org/projects/phages2050/badge/?version=stable)](https://phages2050.readthedocs.io/en/stable/?badge=stable)
[![PyPI version](https://img.shields.io/pypi/v/phages2050.svg)](https://pypi.org/project/phages2050/)
[![PyPI license](https://img.shields.io/pypi/l/phages2050.svg)](https://pypi.python.org/pypi/)
[![PyPI pyversions](https://img.shields.io/pypi/pyversions/phages2050.svg)](https://pypi.python.org/pypi/phages2050/)
[![Code style](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)
[![Downloads](https://static.pepy.tech/badge/phages2050)](https://pepy.tech/project/phages2050)## Table of Contents
[Framework modules](https://github.com/ptynecki/PHAGES2050#framework-modules) | [Usage](https://github.com/ptynecki/PHAGES2050#usage) | [Documentation](https://github.com/ptynecki/PHAGES2050#documentation) | [Installation](https://github.com/ptynecki/PHAGES2050#installation) | [Community and Contributions](https://github.com/ptynecki/PHAGES2050#community-and-contributions) | [Have a question?](https://github.com/ptynecki/PHAGES2050#have-a-question) | [Found a bug?](https://github.com/ptynecki/PHAGES2050#found-a-bug) | [Team](https://github.com/ptynecki/PHAGES2050#team) | [Change log](https://github.com/ptynecki/PHAGES2050#change-log) | [Code of Conduct](https://github.com/ptynecki/PHAGES2050#code-of-conduct) | [License](https://github.com/ptynecki/PHAGES2050#license)
## Framework modules
`crawlers` - set of functions responsible for bacteriophages data scraping from different sources (MillardLab, NCBI)
`features` - set of functions responsible for nucleotides and proteins feature extraction for Machine Learning classification and deeper analysis
`embeddings` - set of pre-trained Embedding models for nucleotides and proteins vectorization
`classifiers` - set of pre-trained Machine Learning models dedicated for bacteriophage research
`explore` - set of data visualization techniques in 2D or 3D dedicated for deeper bacteriophages exploration## Usage
The repository includes numerous examples of using the framework in Jupyter Notebook format (*.ipynb). The most expected ones by the community are listed below:
##### Crawlers
* [MillardLab bacteriophage crawler](https://github.com/ptynecki/PHAGES2050/blob/master/examples/crawlers/MillardLab-bacteriophage-crawler.ipynb)
* NCBI bacteriophages crawlers (planned):
* taxonomy, host and other expected meta-data;
* complete genome sequences in FASTA format;
* set of genes and proteins in FASTA format;##### Embeddings
* [Bacteriophage proteins embedding](https://github.com/ptynecki/PHAGES2050/blob/master/examples/embeddings/Bacteriophage-proteins-embedding.ipynb)
* [Bacteriophage DNA embedding](https://github.com/ptynecki/PHAGES2050/blob/master/examples/embeddings/Bacteriophage-nucleotides-embedding.ipynb)
* Bacteriophage sequence-based biological and biochemical features extraction (planned)##### Classifiers
* [Bacteriophage structural protein classifier with 95% of accuracy](https://github.com/ptynecki/PHAGES2050/blob/master/examples/classifiers/Bacteriophage-structural-protein-classifier.ipynb)
* Bacteriophage lifecycle classifier including chronic infection (planned)
* Bacteriophage taxonomy classifier (planned)
* Bacteriophage prophage detector and extractor (planned)
* Lysis zones multi-level-classification (in progress)##### Explore
* Bacteriophages in 3D space based on:
* DNA embedding (planned)
* proteins embedding (planned)
* biological and biochemical features (planned)
* custom user features (planned)## Documentation
The official documentation is hosted on ReadTheDocs: https://phages2050.readthedocs.io
## Installation
_PHAGES2050_ can be installed by running:
```
pip install phages2050
```It requires Python 3.8.0+ to run. You can also use Conda:
```
conda install -c conda-forge phages2050
```#### Install from GitHub
If you can't wait for the latest hotness and want to install from GitHub, use:
```
pip install git+git://github.com/ptynecki/PHAGES2050
```#### Proteins' embedding
If you want to use Bacteriophage proteins vectorizers then remember to install extra package for proteins embedding:
```
pip install -U "bio-embeddings[all] @ git+https://github.com/sacdallago/bio_embeddings.git"
pip install git+https://github.com/facebookresearch/esm.git
```## Community and Contributions
Happy to see you willing to make the PHAGES2050 better. Development on the latest stable version of Python 3+ is preferred. As of this writing it's 3.8. You can use any operating system.
If you're fixing a bug or adding a new feature, add a test with *[pytest](https://github.com/pytest-dev/pytest)* and check the code with *[Black](https://github.com/psf/black/)* and *[mypy](https://github.com/python/mypy)*. Before adding any large feature, first open an issue for us to discuss the idea with the core devs and community.
## Have a question?
Obviously if you have a private question or want to cooperate with us, you can always reach out to us directly via our [Phage Directory Slack](https://phage.directory/slack) (channel **#PHAGES2050**).
## Found a bug?
Feel free to add a new issue with a respective title and description on the [the PHAGES2050 repository](https://github.com/ptynecki/PHAGES2050/issues). If you already found a solution to your problem, we would be happy to review your pull request.
## Team
Core Developers, Domain Experts, Community Managers and Educators who contributing to PHAGES2050:
* Piotr Tynecki
* Yana Minina
* Iwona Świętochowska
* Przemysław Mitura
* Joanna Kazimierczak
* Arkadiusz Guziński
* Bogusław Zimnoch
* Jessica Sacher, PhD
* Shawna McCallin, PhD
* Marie-Agnes Petit, PhD
* Jan Zheng## Change log
The log's will become rather long. It moved to its own file.
See [CHANGELOG.md](https://github.com/ptynecki/PHAGES2050/blob/master/CHANGELOG.md).
## Code of Conduct
Everyone interacting in the PHAGES2050 project's development, issue trackers and Slack discussion is expected to follow the [Code of Conduct](https://github.com/ptynecki/PHAGES2050/blob/master/CODE_OF_CONDUCT.md).
## License
The PHAGES2050 package and pre-trained models are released under the under terms of [the MIT License](https://github.com/ptynecki/PHAGES2050/blob/master/LICENSE).