https://github.com/camilogarciabotero/biosimplex.jl
Representing BioSequences as Simplex numerical matrix
bioinformatics biojulia biology julia machine-learning
- Host: GitHub
- URL: https://github.com/camilogarciabotero/biosimplex.jl
- Owner: camilogarciabotero
- License: mit
- Created: 2024-02-11T21:53:44.000Z (over 2 years ago)
- Default Branch: main
- Last Pushed: 2024-03-23T22:44:54.000Z (over 2 years ago)
- Last Synced: 2024-04-25T00:03:45.742Z (about 2 years ago)
- Topics: bioinformatics, biojulia, biology, julia, machine-learning
- Language: Julia
- Homepage: https://camilogarciabotero.github.io/BioSimplex.jl/
- Size: 145 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- License: LICENSE.md

DOI: [10.5281/zenodo.10775955](https://doi.org/10.5281/zenodo.10775955)
# BioSimplex
> Representing DNA sequences as regular tetrahedra (Simplex)
This package has a single public function, `biosimplex`, that takes a `BioSequence` and returns its *Simplex* representation. The *Simplex* representation is a 3D numerical encoding of the *BioSequence* in which each base is represented as a unit vector pointing to one vertex of a regular tetrahedron (Silverman & Linsker, 1986; Coward, 1997).
## Installation
BioSimplex is a Julia Language package. To install BioSimplex, open Julia's interactive session (known as the REPL), press the `]` key to enter package mode, then type the following command:
```julia
pkg> add BioSimplex
```
## Usage
```julia
using BioSequences, BioSimplex

# Create a BioSequence
seq = dna"ATCG"

# Convert the BioSequence to its Simplex representation (one column per base)
biosimplex(seq)
# 3×4 Matrix{Float64}:
#  0.0   0.942809  -0.471405  -0.471405
#  0.0   0.0        0.816497  -0.816497
#  1.0  -0.333333  -0.333333  -0.333333
```
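The numbers in the output above are consistent with the four vertices of a regular tetrahedron inscribed in the unit sphere. The following is a minimal sketch of that geometry, not the package's implementation: `TETRAHEDRON` and `simplex_sketch` are hypothetical names, and the base-to-vertex assignment is an assumption read off the example output rather than taken from the package source.

```julia
using LinearAlgebra

# Closed-form vertices of a regular tetrahedron inscribed in the unit sphere.
# Assumption: the base-to-vertex assignment is inferred from the example
# output above, not from the BioSimplex source code.
const TETRAHEDRON = Dict(
    'A' => [0.0, 0.0, 1.0],
    'T' => [2sqrt(2) / 3, 0.0, -1 / 3],
    'C' => [-sqrt(2) / 3, sqrt(6) / 3, -1 / 3],
    'G' => [-sqrt(2) / 3, -sqrt(6) / 3, -1 / 3],
)

# Hypothetical stand-in for `biosimplex`: one column per base.
simplex_sketch(seq::AbstractString) = reduce(hcat, [TETRAHEDRON[b] for b in seq])

M = simplex_sketch("ATCG")

# Every column is a unit vector.
unit_length = all(norm(col) ≈ 1 for col in eachcol(M))

# Any two distinct vertices have dot product -1/3, so all bases are equidistant.
equidistant = all(dot(M[:, i], M[:, j]) ≈ -1 / 3 for i in 1:4 for j in i+1:4)
```

Because every pair of vertices is separated by the same angle, this encoding treats all four bases symmetrically, unlike an encoding such as A=1, C=2, G=3, T=4 that would impose an artificial ordering.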
## Applications
The *Simplex* representation turns a sequence into a numerical matrix, so that it can be used as input to machine learning models.
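As one possible illustration, equal-length sequences could be flattened into fixed-size feature vectors and stacked into a feature matrix. This is only a sketch: `VERTEX`, `encode`, and `featurize` are hypothetical names, and `encode` is a stand-in for `biosimplex` (with the vertex assignment assumed from the Usage output) so that the example runs without the package installed.

```julia
# Hypothetical stand-in for `biosimplex`, so the sketch runs without the package.
# Assumption: the vertex assignment is copied from the Usage output.
const VERTEX = Dict(
    'A' => [0.0, 0.0, 1.0],
    'T' => [2sqrt(2) / 3, 0.0, -1 / 3],
    'C' => [-sqrt(2) / 3, sqrt(6) / 3, -1 / 3],
    'G' => [-sqrt(2) / 3, -sqrt(6) / 3, -1 / 3],
)
encode(seq) = reduce(hcat, [VERTEX[b] for b in seq])

# Flatten each 3×L matrix column by column into a vector of length 3L,
# then stack those vectors as the columns of a feature matrix.
featurize(seqs) = reduce(hcat, [vec(encode(s)) for s in seqs])

X = featurize(["ATCG", "GGCA", "TTAA"])
size(X)  # (12, 3): 12 features per sequence, one column per sequence
```

Flattening only yields a rectangular feature matrix when all sequences have the same length; sequences of varying length would need padding, windowing, or some other summary first.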
## References
Coward, E. (1997). Equivalence of two Fourier methods for biological sequences. Journal of Mathematical Biology, 36(1), 64–70. https://doi.org/10.1007/s002850050090
Silverman, B. D., & Linsker, R. (1986). A measure of DNA periodicity. Journal of Theoretical Biology, 118(3), 295–300. https://doi.org/10.1016/S0022-5193(86)80060-1