https://github.com/berraylvc/bioseq-analyzer

DNA FASTA sequence analyzer built with Biopython

bioinformatics biopython cli fasta python sequence-analysis

Last synced: about 1 month ago

Host: GitHub
URL: https://github.com/berraylvc/bioseq-analyzer
Owner: berraylvc
License: mit
Created: 2025-12-21T23:51:06.000Z (6 months ago)
Default Branch: main
Last Pushed: 2025-12-23T08:34:30.000Z (6 months ago)
Last Synced: 2025-12-23T11:20:37.334Z (6 months ago)
Topics: bioinformatics, biopython, cli, fasta, python, sequence-analysis
Language: Python
Homepage:
Size: 20.5 KB
Stars: 0
Watchers: 0
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- Contributing: CONTRIBUTING.md
- License: LICENSE
- Code of conduct: CODE_OF_CONDUCT.md

![CI](https://github.com/berraylvc/bioseq-analyzer/actions/workflows/ci.yml/badge.svg)

# BioSeq Analyzer

A small DNA FASTA sequence analyzer I built while learning basic bioinformatics workflows with **Biopython**.
It reads FASTA files and prints a short report (length, GC%, base counts, transcription/translation previews).
It can also export results to **JSON** and **CSV**.

## What it does

- Parse DNA FASTA files (Biopython `SeqIO`)
- Base counts: A/C/G/T/N (+ common IUPAC ambiguity codes if present)
- GC%, AT%, N%
- mRNA preview (T → U)
- Reverse complement preview
- Protein translation previews (frames 1/2/3)
- CLI tool: `bioseq`
- Unit tests + GitHub Actions CI
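
To make the preview features above concrete, here is a minimal sketch of the mRNA, reverse-complement, and three-frame translation steps. It uses only the standard library rather than Biopython, so it runs without extra installs; the function names are hypothetical and are not the package's API. In Biopython the analogous operations are the `Seq` methods `transcribe()`, `reverse_complement()`, and `translate()`. The input is the 39 bp example sequence from the Biopython tutorial, chosen here purely for illustration.

```python
from itertools import product

# Standard genetic code (NCBI table 1), listed in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def transcribe(dna: str) -> str:
    """Return the mRNA-style string for a coding strand (T -> U)."""
    return dna.upper().replace("T", "U")


def reverse_complement(dna: str) -> str:
    """Complement each base, then reverse the strand."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def translate(dna: str, frame: int = 1) -> str:
    """Translate reading frame 1, 2 or 3; a trailing partial codon is dropped."""
    seq = dna.upper()[frame - 1:]
    usable = len(seq) - len(seq) % 3
    # Codons containing ambiguous bases (e.g. N) are reported as "X".
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, usable, 3)
    )


dna = "ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG"
print(transcribe(dna))
print(reverse_complement(dna))
for frame in (1, 2, 3):
    print(frame, translate(dna, frame))
```

Dropping the trailing partial codon is one design choice among several; it avoids the partial-codon situation for frames 2 and 3, where the remaining length is rarely a multiple of 3.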

## Quickstart (Windows)

    python -m venv .venv
    .\.venv\Scripts\activate
    pip install -e .
    pip install pytest

Run on the example file:

    bioseq --all --json report.json --csv report.csv

Run tests:

    pytest -q

## Quickstart (Linux / macOS)

    python3 -m venv .venv
    source .venv/bin/activate
    pip install -e .
    pip install pytest

## Usage

Analyze a FASTA file:

    bioseq path\to\your.fasta --all

See all options:

    bioseq --help
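
For readers curious how a command line like this can be wired up, the following is a rough `argparse` sketch. Only the flag names `--all`, `--json`, and `--csv` come from this README; the help strings, the optional positional argument, and the `build_parser` name are assumptions, and the real `bioseq` entry point may be organized quite differently.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser that mirrors the flags shown in this README (sketch only)."""
    parser = argparse.ArgumentParser(
        prog="bioseq", description="Analyze a DNA FASTA file."
    )
    # Assumption: the input path is optional, since the Quickstart command omits it.
    parser.add_argument("fasta", nargs="?", help="path to a FASTA file")
    # Assumption: --all switches on every report section.
    parser.add_argument("--all", action="store_true", help="print every report section")
    parser.add_argument("--json", metavar="PATH", help="write the report as JSON")
    parser.add_argument("--csv", metavar="PATH", help="write the report as CSV")
    return parser


args = build_parser().parse_args(["your.fasta", "--all", "--json", "report.json"])
print(args.fasta, args.all, args.json, args.csv)
```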

## Output example

    --- BioSeq Analyzer Report ---
    Record ID : Example_DNA_1
    Length : 39 bp
    GC% : 56.41
    Counts : {'A': 9, 'C': 8, 'G': 14, 'T': 8, 'N': 0}
    Protein f1 (20) : MAIVMGR*KGAR*
    -----------------------------
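
The numbers in this report can be reproduced by hand. The sketch below computes length, base counts, and GC% with the standard library, then serializes the result to JSON and CSV text. Two assumptions are worth flagging: the sequence is the 39 bp example from the Biopython tutorial, whose composition matches the report but which may not be the repository's actual sample file, and the field names are hypothetical because the real export schema is not shown here.

```python
import csv
import io
import json
from collections import Counter

# Assumption: Biopython tutorial sequence; it matches the report's numbers,
# but it is not confirmed to be the repository's sample file.
record_id = "Example_DNA_1"
dna = "ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG"

tally = Counter(dna)
counts = {base: tally[base] for base in "ACGTN"}  # missing bases count as 0
gc_percent = round(100 * (counts["G"] + counts["C"]) / len(dna), 2)

# Hypothetical field names for one flat row per record.
row = {"id": record_id, "length": len(dna), "gc_percent": gc_percent, **counts}

# JSON: a list with one object per record.
json_text = json.dumps([row], indent=2)

# CSV: header row followed by one line per record.
csv_buffer = io.StringIO()
writer = csv.DictWriter(csv_buffer, fieldnames=list(row))
writer.writeheader()
writer.writerow(row)
csv_text = csv_buffer.getvalue()

print(json_text)
print(csv_text)
```

GC% here is (G + C) / length × 100, i.e. 22 / 39 × 100, which rounds to 56.41.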

## Notes / Limitations

- This is a learning project (not a full bioinformatics pipeline).
- Large FASTA files may be slow to process and memory-hungry, because records are currently loaded into memory rather than streamed.
- If a sequence's length is not a multiple of 3, Biopython may warn about a partial codon during translation.
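
On the memory point, one common way around it is to process records one at a time instead of collecting them first. The generator below is a standard-library sketch of that idea, not the project's code; `iter_fasta` is a hypothetical name. Biopython's `SeqIO.parse` also returns an iterator, so looping over it directly, without first building a list, would be the analogous approach.

```python
from typing import Iterable, Iterator, Tuple


def iter_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (record_id, sequence) pairs one at a time from FASTA text lines."""
    record_id = None
    chunks = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if record_id is not None:
                yield record_id, "".join(chunks)
            # The ID is the first whitespace-delimited token after ">".
            header = line[1:].split()
            record_id = header[0] if header else ""
            chunks = []
        elif record_id is not None:
            chunks.append(line.upper())
    if record_id is not None:
        yield record_id, "".join(chunks)


fasta_text = """>seq1 demo record
ATGGCC
ATTGTA
>seq2
GGGNNN
"""

# Any iterable of lines works, including an open file handle.
for rid, seq in iter_fasta(fasta_text.splitlines()):
    gc = 100 * sum(seq.count(b) for b in "GC") / len(seq)
    print(rid, len(seq), round(gc, 2))
```

Only the current record is held in memory at any time, so peak usage depends on the longest single sequence rather than on the file size.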

## Project structure

    src/bioseq_analyzer/   # package code
    tests/                 # pytest tests
    data/                  # sample fasta

## Docs

- API notes: `docs/API.md`
- Troubleshooting: `docs/TROUBLESHOOTING.md`

## License

MIT — see `LICENSE`.

## Author

Berra Özyalvaç (GitHub: `@berraylvc`)
