https://github.com/tmsincomb/seqpandas
Import genomic data to get a Pandas & Biopython hybrid with fancy shortcuts to make Machine Learning preprocessing easy!
- Host: GitHub
- URL: https://github.com/tmsincomb/seqpandas
- Owner: tmsincomb
- License: mit
- Created: 2020-05-08T18:51:05.000Z (about 5 years ago)
- Default Branch: master
- Last Pushed: 2022-12-26T21:11:28.000Z (over 2 years ago)
- Last Synced: 2025-01-05T03:05:11.795Z (5 months ago)
- Language: Jupyter Notebook
- Homepage:
- Size: 10.8 MB
- Stars: 3
- Watchers: 2
- Forks: 0
- Open Issues: 7
Metadata Files:
- Readme: README.rst
- Changelog: HISTORY.rst
- Contributing: CONTRIBUTING.rst
- License: LICENSE
README
=========
SeqPandas
=========
Import genomic data to get a custom Pandas & Biopython hybrid class object with fancy shortcuts to make Machine Learning preprocessing easy!

* Free software: MIT license
* Documentation: https://seqpandas.readthedocs.io

Installation
------------

.. code:: bash

    pip install seqpandas

Usage
-----

.. code:: python

    import seqpandas as spd

    # Direct File Path
    df = spd.read_seq('file.fasta', format='fasta')
    df = spd.read_seq('file.sam', format='sam')
    df = spd.read_vcf('file.vcf', format='vcf')
    df = spd.read_bed('file.bed', format='bed')

    # Just need BioPython Seqs? No problem!
    seqrecords = spd.read('file.fasta', format='fasta')

    # Already Opened BioPython Handle
    from Bio import SeqIO
    seqrecords = SeqIO.parse('file.fasta', format='fasta')
    df = spd.BioDataFrame.from_seqrecords(seqrecords)

Tutorial
--------
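As a rough picture of the kind of preprocessing this package is meant to shorten, here is a minimal standard-library sketch that turns FASTA text into one row per record and one-hot encodes each sequence. It is not SeqPandas code and does not reflect its internals: the helper names ``parse_fasta`` and ``one_hot``, the ``ACGT`` alphabet, and the row layout are assumptions made only for illustration.

```python
from io import StringIO

ALPHABET = "ACGT"  # assumed DNA alphabet; anything else (e.g. N) encodes as all zeros


def parse_fasta(handle):
    """Yield (record_id, sequence) pairs from a FASTA-formatted text handle."""
    record_id, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if record_id is not None:
                yield record_id, "".join(chunks)
            header = line[1:].split()
            # The record ID is the first whitespace-delimited token after '>'.
            record_id, chunks = (header[0] if header else ""), []
        else:
            chunks.append(line.upper())
    if record_id is not None:
        yield record_id, "".join(chunks)


def one_hot(sequence, alphabet=ALPHABET):
    """Encode each base as a 0/1 vector with one position per alphabet letter."""
    return [[int(base == letter) for letter in alphabet] for base in sequence]


# Hypothetical two-record input; the first sequence spans two lines.
fasta_text = ">seq1 example\nACGT\nNA\n>seq2\nTTG\n"

rows = [
    {"id": record_id, "length": len(seq), "encoded": one_hot(seq)}
    for record_id, seq in parse_fasta(StringIO(fasta_text))
]

for row in rows:
    print(row["id"], row["length"], row["encoded"])
```

Each entry in ``rows`` is a plain dictionary, so the list could be handed to a table library as-is. The readers shown under Usage are intended to replace this kind of hand-written parsing.
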
For a complete walkthrough and to use it for a machine learning pipeline, please follow the `tutorial notebook `_.

Credits
-------

This package was created with Cookiecutter and the audreyr/cookiecutter-pypackage project template.