Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/midnighter/conifer-analysis
Post-process conifer output for downstream statistical analysis.
https://github.com/midnighter/conifer-analysis
Last synced: 11 days ago
JSON representation
Post-process conifer output for downstream statistical analysis.
- Host: GitHub
- URL: https://github.com/midnighter/conifer-analysis
- Owner: Midnighter
- License: apache-2.0
- Created: 2022-03-04T20:19:56.000Z (over 2 years ago)
- Default Branch: main
- Last Pushed: 2022-03-04T21:05:17.000Z (over 2 years ago)
- Last Synced: 2024-10-31T16:51:05.221Z (20 days ago)
- Language: Python
- Size: 35.2 KB
- Stars: 2
- Watchers: 3
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.rst
- Changelog: CHANGELOG.rst
- Contributing: .github/CONTRIBUTING.rst
- License: LICENSE
- Code of conduct: .github/CODE_OF_CONDUCT.md
Awesome Lists containing this project
README
=============================
Conifer Analysis
=============================.. image:: https://img.shields.io/pypi/v/conifer-analysis.svg
:target: https://pypi.org/project/conifer-analysis/
:alt: Current PyPI Version.. image:: https://img.shields.io/pypi/pyversions/conifer-analysis.svg
:target: https://pypi.org/project/conifer-analysis/
:alt: Supported Python Versions.. image:: https://img.shields.io/pypi/l/conifer-analysis.svg
:target: https://www.apache.org/licenses/LICENSE-2.0
:alt: Apache Software License Version 2.0.. image:: https://img.shields.io/badge/Contributor%20Covenant-v2.0%20adopted-ff69b4.svg
:target: .github/CODE_OF_CONDUCT.md
:alt: Code of Conduct.. image:: https://github.com/Midnighter/conifer-analysis/workflows/CI-CD/badge.svg
:target: https://github.com/Midnighter/conifer-analysis/workflows/CI-CD
:alt: GitHub Actions.. image:: https://codecov.io/gh/Midnighter/conifer-analysis/branch/master/graph/badge.svg
:target: https://codecov.io/gh/Midnighter/conifer-analysis
:alt: Codecov.. image:: https://img.shields.io/badge/code%20style-black-000000.svg
:target: https://github.com/ambv/black
:alt: Code Style Black.. image:: https://readthedocs.org/projects/conifer-analysis/badge/?version=latest
:target: https://conifer-analysis.readthedocs.io/en/latest/?badge=latest
:alt: Documentation Status.. summary-start
Post-process conifer output for downstream statistical analysis.
``conifer-analysis`` uses `dask `_ in order to analyze
`conifer `_ results in a distributed and
out-of-memory fashion. This can be helpful when processing many such results.Example
=======Say that you have a bunch of ``conifer`` results in a directory. You can
generate a histogram of the confidence values per file (sample) and per taxa
using the provided pipeline ``confidence_hist``. Even when you work locally, it
can be helpful to explicitly create a distributed client controlling the number
of workers... code-block:: python
from dask.distributed import Client
from conifer_analysis import confidence_histclient = Client(n_workers=8)
You can then visit the `default dashboard `_ in
your browser to observe tasks live. Next, we run the pipeline which returns a
``pandas.DataFrame``... code-block:: python
hist = confidence_hist("data/*.tsv")
hist.info()As an example of the returned shape:
.. code-block:: console
RangeIndex: 7700 entries, 0 to 7699
Data columns (total 8 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 path 7700 non-null category
1 name 7700 non-null category
2 taxonomy_id 7700 non-null category
3 bin 7700 non-null interval[float64, right]
4 midpoints 7700 non-null float64
5 read1_hist 7700 non-null int64
6 read2_hist 7700 non-null int64
7 avg_hist 7700 non-null int64
dtypes: category(3), float64(1), int64(3), interval(1)
memory usage: 385.3 KBInstall
=======It's as simple as:
.. code-block:: console
pip install conifer-analysis
If you want to observe tasks in the dask dashboard, you will need additional
dependencies... code-block:: console
pip install conifer-analysis[dashboard]
Copyright
=========* Copyright © 2022, Moritz E. Beber.
* Free software distributed under the `Apache Software License 2.0
`_... summary-end