https://github.com/csbiology/biofsharp
Open source bioinformatics and computational biology toolbox written in F#.
- Host: GitHub
- URL: https://github.com/csbiology/biofsharp
- Owner: CSBiology
- License: mit
- Created: 2016-07-14T08:52:06.000Z (over 9 years ago)
- Default Branch: developer
- Last Pushed: 2024-05-14T07:38:35.000Z (over 1 year ago)
- Last Synced: 2025-03-29T00:08:07.704Z (7 months ago)
- Topics: amino-acids, biocontainers, bioinformatics, bioinformatics-containers, biology, biostatistics, dataprocessing, datascience, docker, fsharp, nucleotides, sequence-analysis
- Language: F#
- Homepage: https://csbiology.github.io/BioFSharp/
- Size: 311 MB
- Stars: 112
- Watchers: 14
- Forks: 33
- Open Issues: 36
Metadata Files:
- Readme: README.md
- Contributing: .github/CONTRIBUTING.md
- License: LICENSE
- Code of conduct: CODE_OF_CONDUCT.md
README

BioFSharp is an open source bioinformatics and computational biology toolbox written in F#.

[BioFSharp on NuGet](https://www.nuget.org/packages/BioFSharp/) | [fsharp.org](https://fsharp.org/) | [Gitter chat](https://gitter.im/CSBiology/BioFSharp?utm_source=badge&utm_medium=badge&utm_campaign=pr-badge)

Builds run on Ubuntu and Windows; test coverage is reported on [Codecov](https://codecov.io/gh/CSBiology/BioFSharp).

Core functionality
------------------
In its core namespace, BioFSharp contains the basic data structures for common biological objects and their modification. Our type model starts at chemical elements, combines those into formulas, and finally describes molecules of high biological relevance such as amino acids and nucleotides. Sequences of these molecules are modelled by BioCollections, which provide extensive functionality for investigating their real-life counterparts.

Additionally, core algorithms for biological sequences, such as alignment and pattern matching, are implemented.
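
To make the layering described above concrete, here is a minimal sketch in plain F#. It does not use the BioFSharp API: the types `Element`, `Formula` and `AminoAcid` and all function names are hypothetical and only illustrate how elements can be composed into formulas, formulas into residues, and residues into a sequence whose mass can be computed. The masses are rounded monoisotopic values.

```fsharp
// Hypothetical mini-model for illustration only, NOT the BioFSharp types.
type Element = H | C | N | O | S

/// Monoisotopic masses in Da (rounded literature values).
let monoisotopicMass element =
    match element with
    | H -> 1.007825
    | C -> 12.000000
    | N -> 14.003074
    | O -> 15.994915
    | S -> 31.972071

/// A formula maps each element to the number of atoms of that element.
type Formula = Map<Element, int>

/// Sum of element masses weighted by their counts.
let formulaMass (formula: Formula) =
    formula
    |> Map.fold (fun acc element count -> acc + monoisotopicMass element * float count) 0.0

/// Element-wise addition of two formulas.
let addFormulas (a: Formula) (b: Formula) : Formula =
    let addElement (acc: Formula) element count =
        let existing = acc |> Map.tryFind element |> Option.defaultValue 0
        acc |> Map.add element (existing + count)
    Map.fold addElement a b

type AminoAcid = Gly | Ala | Cys

/// Residue formulas (free amino acid minus one water molecule).
let residueFormula aminoAcid : Formula =
    match aminoAcid with
    | Gly -> Map.ofList [ C, 2; H, 3; N, 1; O, 1 ]
    | Ala -> Map.ofList [ C, 3; H, 5; N, 1; O, 1 ]
    | Cys -> Map.ofList [ C, 3; H, 5; N, 1; O, 1; S, 1 ]

let water : Formula = Map.ofList [ H, 2; O, 1 ]

/// A peptide is the sum of its residues plus one water for the termini.
let peptideFormula (peptide: AminoAcid list) : Formula =
    peptide |> List.map residueFormula |> List.fold addFormulas water

let peptideMass peptide = peptide |> peptideFormula |> formulaMass
```

With these definitions, `peptideMass [ Gly; Gly ]` evaluates the formula C4H8N2O3 of diglycine. Modelling a formula as a map keeps composition a simple element-wise sum, which is one reason a layered design like this is convenient.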
Besides the core functionality, BioFSharp has several namespaces as sub-projects with different scopes:

IO functionality
----------------
The IO namespace aims to make data available and ease further processing. It contains read/write functions for a diverse set of biological file formats such as [Fasta](https://blast.ncbi.nlm.nih.gov/Blast.cgi?CMD=Web&PAGE_TYPE=BlastDocs&DOC_TYPE=BlastHelp), [FastQ](https://www.ncbi.nlm.nih.gov/sra/docs/submitformats/#fastq-files), [GenBank](https://www.ncbi.nlm.nih.gov/Sitemap/samplerecord.html) or [GFF](https://github.com/The-Sequence-Ontology/Specifications/blob/master/gff3.md), as well as helper functions for searching or transforming the input data. Wrappers for commonly used command line tools like [NCBI's Blast](https://www.ncbi.nlm.nih.gov/books/NBK153387/) ensure interoperability with an array of existing bioinformatics workflows.
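
As an illustration of what reading one of these formats involves, the following sketch parses FASTA-formatted lines in plain F#. It is not the BioFSharp.IO reader: `FastaRecord` and `parseFasta` are hypothetical names, and the sketch assumes the simple convention that a line starting with `>` opens a new record and all following non-empty lines belong to its sequence.

```fsharp
// Hypothetical sketch, NOT the BioFSharp.IO API.
type FastaRecord = { Header: string; Sequence: string }

/// Parses FASTA lines into records. Sequence lines that appear
/// before the first header are ignored.
let parseFasta (lines: seq<string>) : FastaRecord list =
    // Turns the currently collected header and chunks into a record, if any.
    let flush header (chunks: string list) records =
        match header with
        | Some h -> { Header = h; Sequence = String.concat "" (List.rev chunks) } :: records
        | None -> records
    // State: current header, sequence chunks (reversed), finished records (reversed).
    let step (header, chunks, records) (rawLine: string) =
        let line = rawLine.Trim()
        if line.StartsWith(">") then
            (Some(line.Substring(1)), [], flush header chunks records)
        elif line = "" then
            (header, chunks, records)
        else
            (header, line :: chunks, records)
    let (header, chunks, records) = Seq.fold step (None, [], []) lines
    flush header chunks records |> List.rev

// Usage with in-memory lines; System.IO.File.ReadLines would work for a file.
let records =
    parseFasta [ ">seq1 example"; "ACGT"; "TTGA"; ""; ">seq2"; "GGCC" ]
```

Here `records` contains two entries, with the two sequence lines of `seq1 example` joined into one string. A real reader additionally has to convert the raw string into typed amino acids or nucleotides, which is where the core data structures come in.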

BioDB functionality
-------------------
The BioDB namespace offers API access to popular databases like [GEO](https://www.ncbi.nlm.nih.gov/geo/) and [EBI (including SwissProt/Expasy)](https://www.ebi.ac.uk/). We additionally provide API access to [FATool](http://iomiqsweb1.bio.uni-kl.de/), a web service by our workgroup for querying functional annotations of proteins.

This project is .NET Framework only and has a new home here: https://github.com/CSBiology/BioFSharp.BioDB

BioContainers functionality
---------------------------
The BioContainers namespace is our newest BioFSharp project and we are very excited about it! It is all about making common bioinformatics tools programmatically accessible from F#.
This is realized by making the containerized tool accessible via the Docker daemon. We wrap some functionality from
[Docker.DotNet](https://github.com/microsoft/Docker.DotNet) to communicate with the Docker API while providing extensive, type-safe bindings for 9 tools already, including Blast, ClustalO, and TMHMM.

ML functionality
----------------
Make your workflow ML ready with BioFSharp.ML. It currently contains helper functions for [CNTK](https://docs.microsoft.com/en-us/cognitive-toolkit/) and a pre-trained model we used in our [publication about predicting peptide observability](https://www.frontiersin.org/articles/10.3389/fpls.2018.01559/full).

Stats functionality
-------------------
The Stats namespace contains statistical functions with a clear biological focus such as functions for calculating Gene Ontology Enrichments.
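
For readers unfamiliar with enrichment analysis, the sketch below shows one common way to score it: a one-sided hypergeometric test. This is plain F# and an assumption about the general technique, not a description of how BioFSharp.Stats implements it; all function and parameter names are hypothetical. The question it answers is: given a population of `populationSize` genes of which `annotated` carry a GO term, how likely is it to see at least `hits` annotated genes in a sample of `sampleSize` genes by chance?

```fsharp
// Hypothetical sketch of a hypergeometric enrichment test,
// NOT the BioFSharp.Stats API.

/// log of the binomial coefficient C(n, k); -infinity if k is out of range.
let logChoose (n: int) (k: int) =
    if k < 0 || k > n then
        -infinity
    else
        Seq.sum (seq { for i in 1 .. k -> log (float (n - k + i) / float i) })

/// Probability of drawing exactly `hits` annotated genes.
let hypergeometricPmf populationSize annotated sampleSize hits =
    exp (
        logChoose annotated hits
        + logChoose (populationSize - annotated) (sampleSize - hits)
        - logChoose populationSize sampleSize
    )

/// One-sided p-value: probability of at least `hits` annotated genes.
let enrichmentPValue populationSize annotated sampleSize hits =
    let maxHits = min annotated sampleSize
    seq { for k in hits .. maxHits -> hypergeometricPmf populationSize annotated sampleSize k }
    |> Seq.sum

// Example: 10 genes in total, 4 annotated, sample of 3 genes, 2 of them annotated.
let pValue = enrichmentPValue 10 4 3 2
```

In the example, the p-value is (C(4,2)·C(6,1) + C(4,3)·C(6,0)) / C(10,3) = 40/120. Working with logarithms of the binomial coefficients avoids overflow for realistic genome-sized populations. In practice, p-values for many GO terms are usually corrected for multiple testing, which this sketch leaves out.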

Documentation
-------------
Functions, types and classes contained in BioFSharp come with short explanatory descriptions, which can be found in the [API Reference](https://csbiology.github.io/BioFSharp/reference/index.html).
More in-depth explanations, tutorials and general information about the project can be found [here](http://csbiology.github.io/BioFSharp).
The documentation and tutorials for this library are automatically generated (using F# Formatting) from `*.fsx` and `*.md` files in the docs folder. If you find a typo, please submit a pull request!

Contributing
------------
Please refer to the [Contribution guidelines](.github/CONTRIBUTING.md).

Community/Social
----------------
Want to get in touch with us? We recently joined the Twitter crowd: [@biofsharp](https://twitter.com/biofsharp) and [@cs_biology](https://twitter.com/cs_biology).