Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/Euphrasiologist/nu_plugin_bio
Bioinformatics plugin for nushell.
- Host: GitHub
- URL: https://github.com/Euphrasiologist/nu_plugin_bio
- Owner: Euphrasiologist
- Created: 2022-10-25T16:30:45.000Z (about 2 years ago)
- Default Branch: main
- Last Pushed: 2023-10-21T12:00:00.000Z (about 1 year ago)
- Last Synced: 2024-08-02T10:27:23.495Z (6 months ago)
- Language: Rust
- Size: 17.3 MB
- Stars: 23
- Watchers: 1
- Forks: 2
- Open Issues: 5
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
- awesome-nu - nu_plugin_bio
README
# Nushell bio
A bioinformatics plugin for nushell. This plugin parses most common bioinformatics formats into structured data so you can use them with nushell more effectively.
# Quick setup
Go and get nushell (it's great), then come back! I'm assuming you have the Rust toolchain installed.
```nu
# clone this repo
git clone https://github.com/Euphrasiologist/nu_plugin_bio
# change into the repo directory
cd nu_plugin_bio
# build
# it's quite a long compile time...
cargo build --release
# register the plugin
# (the path is relative to the repo directory you just changed into)
register ./target/release/nu_plugin_bio

# see the file formats currently supported below
# now you can just use open, and the file extension will be auto-detected.

# there are some test files in the tests/ dir.
open ./tests/test.fasta
| get id

# if you want to add flags you have to explicitly use from
# e.g. if you want descriptions in fasta files to be parsed.
open --raw ./tests/test.fasta
| from fasta -d
| first
```

The backend is a `noodles` wrapper, an excellent, all-Rust bioinformatics I/O library.
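To make "structured data" concrete, here is a minimal, std-only sketch of turning FASTA text into records. It is not the plugin's code (that delegates to `noodles`); the `FastaRecord` struct, its field names other than `id`, and the `parse_fasta` helper are hypothetical and only illustrate the idea of one row per sequence.

```rust
/// One FASTA entry. Field names are illustrative assumptions,
/// not the plugin's actual column schema.
#[derive(Debug, PartialEq)]
struct FastaRecord {
    id: String,
    description: Option<String>,
    sequence: String,
}

/// Parse FASTA text into records (hypothetical helper, std only).
fn parse_fasta(text: &str) -> Vec<FastaRecord> {
    let mut records: Vec<FastaRecord> = Vec::new();
    for line in text.lines() {
        let line = line.trim_end();
        if let Some(header) = line.strip_prefix('>') {
            // The id runs up to the first whitespace; the rest is the description.
            let mut parts = header.splitn(2, char::is_whitespace);
            let id = parts.next().unwrap_or("").to_string();
            let description = parts
                .next()
                .map(str::trim)
                .filter(|d| !d.is_empty())
                .map(str::to_string);
            records.push(FastaRecord {
                id,
                description,
                sequence: String::new(),
            });
        } else if let Some(current) = records.last_mut() {
            // Sequence lines may wrap, so append to the record that is open.
            current.sequence.push_str(line.trim());
        }
    }
    records
}
```

Calling `parse_fasta` on a string with two `>` headers gives a two-element `Vec`, which maps naturally onto a table with one row per record.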
## Aims
Aim to support the following:
- [x] BAM 1.6
- [x] BCF 2.2
- [x] bcf.gz
- [x] VCF 4.3
- [x] vcf.gz
- [x] BED (BED3 only right now)
- [x] CRAM 3.0
- [x] FASTA
- [x] fa.gz
- [x] FASTQ
- [x] fq.gz
- [x] GFF3
- [ ] GTF 2.2
- [x] SAM 1.6
- [x] GFA 1.0
- [x] gfa.gz

Note that performance will not be optimal with the current state of `nu_plugin`, as we cannot access the engine state of nushell, and therefore need to load entire data structures into memory. Testing still needs to be done on large files.
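The memory cost mentioned above comes down to eager versus streaming processing. The sketch below shows that general trade-off with std only; it is not the plugin's code and says nothing about the nushell plugin protocol itself. The function names `ids_eager`, `ids_streaming`, and `header_id` are hypothetical.

```rust
use std::io::{self, BufRead, Read};

/// Extract the id from a FASTA header line, if the line is a header.
fn header_id(line: &str) -> Option<String> {
    let header = line.strip_prefix('>')?;
    Some(header.split_whitespace().next().unwrap_or("").to_string())
}

/// Eager: the whole input is buffered in memory before any id is produced.
fn ids_eager<R: Read>(mut reader: R) -> io::Result<Vec<String>> {
    let mut text = String::new();
    reader.read_to_string(&mut text)?;
    Ok(text.lines().filter_map(header_id).collect())
}

/// Streaming: only one line is held at a time, and ids are yielded lazily.
fn ids_streaming<R: BufRead>(reader: R) -> impl Iterator<Item = io::Result<String>> {
    reader.lines().filter_map(|line| match line {
        Ok(l) => header_id(&l).map(Ok),
        Err(e) => Some(Err(e)),
    })
}
```

Both functions return the same ids; the eager one needs memory proportional to the file size, which is the situation the note describes for large inputs.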
## More?
If there's a bioinformatics format you want supported, let me know, or open a PR.