https://github.com/tmsincomb/seqpandas

Import genomic data to get a Pandas & Biopython hybrid with fancy shortcuts to make Machine Learning preprocessing easy!

Host: GitHub
URL: https://github.com/tmsincomb/seqpandas
Owner: tmsincomb
License: mit
Created: 2020-05-08T18:51:05.000Z (about 5 years ago)
Default Branch: master
Last Pushed: 2022-12-26T21:11:28.000Z (over 2 years ago)
Last Synced: 2025-01-05T03:05:11.795Z (6 months ago)
Language: Jupyter Notebook
Homepage:
Size: 10.8 MB
Stars: 3
Watchers: 2
Forks: 0
Open Issues: 7
Metadata Files:
- Readme: README.rst
- Changelog: HISTORY.rst
- Contributing: CONTRIBUTING.rst
- License: LICENSE

=========
SeqPandas
=========

Import genomic data to get a custom Pandas & Biopython hybrid class object with fancy shortcuts to make Machine Learning preprocessing easy!

* Free software: MIT license

* Documentation: https://seqpandas.readthedocs.io.

Installation
------------

.. code:: bash

    pip install seqpandas

Usage
-----

.. code:: python

    import seqpandas as spd

    # Direct File Path
    df = spd.read_seq('file.fasta', format='fasta')
    df = spd.read_seq('file.sam', format='sam')
    df = spd.read_vcf('file.vcf', format='vcf')
    df = spd.read_bed('file.bed', format='bed')

    # Just need BioPython Seqs? No problem!
    seqrecords = spd.read('file.fasta', format='fasta')

    # Already Opened BioPython Handle
    from Bio import SeqIO
    seqrecords = SeqIO.parse('file.fasta', format='fasta')
    df = spd.BioDataFrame.from_seqrecords(seqrecords)
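
To make the idea of a table of sequence records concrete, here is a minimal, dependency-free sketch that reads FASTA text into one dictionary per record. It is not how SeqPandas is implemented (the package builds on Biopython and pandas); the helper names ``fasta_rows`` and ``_row`` and the column names ``id``, ``description``, ``seq`` and ``length`` are assumptions chosen for illustration.

.. code:: python

    from io import StringIO


    def _row(header, chunks):
        """Build one table row (hypothetical column names) from a FASTA record."""
        # The first whitespace-delimited token of the header is the record id.
        parts = header.split(None, 1)
        seq = "".join(chunks)
        return {
            "id": parts[0] if parts else "",
            "description": header,
            "seq": seq,
            "length": len(seq),
        }


    def fasta_rows(handle):
        """Yield one dict per FASTA record read from an open text handle."""
        header, chunks = None, []
        for line in handle:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            if line.startswith(">"):
                # A new header closes the previous record, if there was one.
                if header is not None:
                    yield _row(header, chunks)
                header, chunks = line[1:], []
            elif header is not None:
                # Sequence data may be wrapped over several lines.
                chunks.append(line)
        if header is not None:
            yield _row(header, chunks)


    # In-memory FASTA text stands in for a real file here.
    example = StringIO(">seq1 first record\nACGT\nAC\n>seq2\nGGTTA\n")
    rows = list(fasta_rows(example))

Passing ``rows`` to ``pandas.DataFrame(rows)`` would give a plain DataFrame with one row per record; the columns SeqPandas itself produces may differ.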

Tutorial
--------

For a complete walkthrough, including how to use SeqPandas in a machine learning pipeline, please follow the tutorial notebook.

Credits
-------

This package was created with Cookiecutter and the audreyr/cookiecutter-pypackage project template.
