https://github.com/berraylvc/bioseq-analyzer
DNA FASTA sequence analyzer built with Biopython
- Host: GitHub
- URL: https://github.com/berraylvc/bioseq-analyzer
- Owner: berraylvc
- License: mit
- Created: 2025-12-21T23:51:06.000Z (5 months ago)
- Default Branch: main
- Last Pushed: 2025-12-23T08:34:30.000Z (5 months ago)
- Last Synced: 2025-12-23T11:20:37.334Z (5 months ago)
- Topics: bioinformatics, biopython, cli, fasta, python, sequence-analysis
- Language: Python
- Homepage:
- Size: 20.5 KB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- Contributing: CONTRIBUTING.md
- License: LICENSE
- Code of conduct: CODE_OF_CONDUCT.md
README

# BioSeq Analyzer
A small DNA FASTA sequence analyzer I built while learning basic bioinformatics workflows with **Biopython**.
It reads FASTA files and prints a short report (length, GC%, base counts, transcription/translation previews).
It can also export results to **JSON** and **CSV**.
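The export step could look roughly like the sketch below. It is a stdlib-only illustration, not the project's code: the function names `to_json` and `to_csv` and the field names `id`, `length`, and `gc_pct` are hypothetical, and the sample values are copied from the output example in this README.

```python
import csv
import io
import json

# Hypothetical flat report rows; the real tool may use different keys.
reports = [{"id": "Example_DNA_1", "length": 39, "gc_pct": 56.41}]


def to_json(rows):
    """Serialize the list of report dicts as pretty-printed JSON text."""
    return json.dumps(rows, indent=2)


def to_csv(rows, fieldnames=("id", "length", "gc_pct")):
    """Serialize the report dicts as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(fieldnames), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


json_text = to_json(reports)
csv_text = to_csv(reports)
```

Both helpers return strings, so writing them to `report.json` and `report.csv` is a plain `open(..., "w")` away. Nested values such as per-base counts would need flattening before they fit a CSV row.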
## What it does
- Parse DNA FASTA files (Biopython `SeqIO`)
- Base counts: A/C/G/T/N (+ common IUPAC ambiguity codes if present)
- GC%, AT%, N%
- mRNA preview (T → U)
- Reverse complement preview
- Protein translation previews (frames 1/2/3)
- CLI tool: `bioseq`
- Unit tests + GitHub Actions CI
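The calculations in the list above can be sketched in plain Python. This is an illustration that does not use Biopython, so it is not how the project itself computes them; the helper names (`base_stats`, `reverse_complement`, `transcribe`, `translate_frame`) are hypothetical. The input is the 39 bp example sequence from the Biopython tutorial, used here as an assumed input because its counts match the sample report in this README.

```python
from collections import Counter

# Standard genetic code, built from the usual TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def base_stats(seq):
    """Return length, A/C/G/T/N counts, and GC%, AT%, N% rounded to 2 places."""
    seq = seq.upper()
    counts = Counter(seq)
    total = len(seq)

    def pct(letters):
        return round(100 * sum(counts[b] for b in letters) / total, 2) if total else 0.0

    return {
        "length": total,
        "counts": {b: counts[b] for b in "ACGTN"},
        "gc_pct": pct("GC"),
        "at_pct": pct("AT"),
        "n_pct": pct("N"),
    }


def reverse_complement(seq):
    """Complement each base, then reverse the string."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def transcribe(seq):
    """mRNA preview of the coding strand: replace T with U."""
    return seq.upper().replace("T", "U")


def translate_frame(seq, frame):
    """Translate reading frame 1, 2, or 3; trailing partial codons are dropped."""
    seq = seq.upper()[frame - 1:]
    usable = len(seq) - len(seq) % 3
    # Codons containing anything other than A/C/G/T become "X" in this sketch.
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, usable, 3))


dna = "ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG"  # assumed input
stats = base_stats(dna)
protein_f1 = translate_frame(dna, 1)
```

With this input, `stats["gc_pct"]` is 56.41 and `protein_f1` is `MAIVMGR*KGAR*`, matching the sample report. The sketch trims incomplete trailing codons before translating, which is one way to sidestep partial-codon handling.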
## Quickstart (Windows)
    python -m venv .venv
    .\.venv\Scripts\activate
    pip install -e .
    pip install pytest

Run on the example file:

    bioseq --all --json report.json --csv report.csv

Run tests:

    pytest -q
## Quickstart (Linux / macOS)
    python3 -m venv .venv
    source .venv/bin/activate
    pip install -e .
    pip install pytest
## Usage
Analyze a FASTA file:

    bioseq path\to\your.fasta --all

See all options:

    bioseq --help
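For readers curious how a command line like this might be wired up, here is a minimal `argparse` sketch. Only the flag names `--all`, `--json`, and `--csv` come from this README; the positional name `fasta`, the help texts, and the choice to make the path optional (the quickstart command passes none) are assumptions, and the real `bioseq` entry point may be built differently.

```python
import argparse


def build_parser():
    """Build a parser that accepts the flags shown in this README (sketch only)."""
    parser = argparse.ArgumentParser(
        prog="bioseq",
        description="Analyze DNA sequences in a FASTA file.",
    )
    # Optional here because the quickstart command is shown without a path (assumption).
    parser.add_argument("fasta", nargs="?", help="path to a FASTA file")
    parser.add_argument("--all", action="store_true",
                        help="show every report section (assumed meaning)")
    parser.add_argument("--json", metavar="PATH", help="write results to a JSON file")
    parser.add_argument("--csv", metavar="PATH", help="write results to a CSV file")
    return parser


args = build_parser().parse_args(["seqs.fasta", "--all", "--json", "report.json"])
```

After parsing, `args.fasta` holds the path, `args.all` is a boolean, and `args.json` / `args.csv` are either a path string or `None`.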
## Output example
    --- BioSeq Analyzer Report ---
    Record ID : Example_DNA_1
    Length : 39 bp
    GC% : 56.41
    Counts : {'A': 9, 'C': 8, 'G': 14, 'T': 8, 'N': 0}
    Protein f1 (20) : MAIVMGR*KGAR*
    -----------------------------
## Notes / Limitations
- This is a learning project (not a full bioinformatics pipeline).
- Large FASTA files may be slow to process and use a lot of memory, because records are currently loaded into memory.
- If a sequence's length is not a multiple of 3, Biopython may warn about partial codons.
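One way to ease the memory limitation is to process records one at a time instead of collecting them first. Biopython's `SeqIO.parse` already returns an iterator, so looping over it directly avoids building a full list. The stdlib-only generator below sketches the same idea; `iter_fasta` is a hypothetical helper, not part of the project or of Biopython.

```python
import io


def iter_fasta(handle):
    """Yield (header, sequence) pairs one record at a time from a text handle."""
    header = None
    chunks = []
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                # A new header closes the previous record.
                yield header, "".join(chunks)
            header = line[1:].strip()
            chunks = []
        elif header is not None:
            # Sequence lines may be wrapped; collect and join them later.
            chunks.append(line.upper())
    if header is not None:
        yield header, "".join(chunks)


sample = io.StringIO(">rec1 first record\nACGT\nac\n\n>rec2\nGG\n")
records = list(iter_fasta(sample))  # list() only for the demo
```

In real use the loop body would analyze each record and discard it, so only one sequence is held in memory at a time.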
## Project structure
    src/bioseq_analyzer/   # package code
    tests/                 # pytest tests
    data/                  # sample FASTA
## Docs
- API notes: `docs/API.md`
- Troubleshooting: `docs/TROUBLESHOOTING.md`
## License
MIT — see `LICENSE`.
## Author
Berra Özyalvaç (GitHub: `@berraylvc`)