Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/smaegol/biostrext
Extension to Biostrings R package
https://github.com/smaegol/biostrext
Last synced: 10 days ago
JSON representation
Extension to Biostrings R package
- Host: GitHub
- URL: https://github.com/smaegol/biostrext
- Owner: smaegol
- License: mit
- Created: 2017-12-20T07:35:07.000Z (about 7 years ago)
- Default Branch: master
- Last Pushed: 2017-12-21T08:39:06.000Z (about 7 years ago)
- Last Synced: 2023-10-26T00:20:06.566Z (about 1 year ago)
- Language: R
- Size: 249 KB
- Stars: 0
- Watchers: 2
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
README
# biostrext
Simple extension to [Biostrings](https://bioconductor.org/packages/release/bioc/html/Biostrings.html) R package[![Build Status](https://travis-ci.org/smaegol/biostrext.svg?branch=master)](https://travis-ci.org/smaegol/biostrext)
[![codecov](https://codecov.io/gh/smaegol/biostrext/branch/master/graph/badge.svg)](https://codecov.io/gh/smaegol/biostrext)## Description
Provide simple functions easing %GC calculation and plotting. %GC can be calculated for single DNAString or DNAStringSet objects. For the DNAString objects it is possible to calculate %GC using sliding window approach and plot %GC distribution with ggplot.
Another functionality allows to find the most frequent pattern of given length (k) in the provided sequence. It also allows plotting the positions of given pattern in the sequence.
## Functions
#### split_in_windows(x,window,overlap)
Splits a DNAString object using sliding window approach. Window parameter defines the size of the window whereas overlap defines the overlap between consecuting windows.
#### calc_GC_single(x)
Returns %GC for provided DNAString object (x)
#### calc_GC(x)
Returns %GC for provided DNAString or DNAStringSet object (x). In the case of DNAStringSet the vector containing %GC for each sequence in the set is returned
#### calc_windowed_GC(x,window,overlap)
Calculates %GC of provided DNAString object (x) using sliding window approach. Window parameter defines the size of the window whereas overlap defines the overlap between consecuting windows.
Return numeric vector containing %GC for each created chunk#### plot_GC_distribution(x,window,overlap)
Calculates %GC of provided DNAString object (x) using sliding window approach and creates the plot. Window parameter defines the size of the window whereas overlap defines the overlap between consecuting windows.
#### get_max_kmer(x,k)
Finds the most frequently occuring pattern of size k in the DNAString object (x)