https://github.com/biogenies/tidysq

tidy processing of biological sequences in R

bioconductor bioinformatics biological-sequences fasta r rstats s3 sequences tibble tidy tidyverse vctrs

Host: GitHub
URL: https://github.com/biogenies/tidysq
Owner: BioGenies
Created: 2019-07-03T14:01:19.000Z (about 7 years ago)
Default Branch: master
Last Pushed: 2025-01-08T18:53:55.000Z (over 1 year ago)
Last Synced: 2025-04-04T12:53:43.420Z (over 1 year ago)
Topics: bioconductor, bioinformatics, biological-sequences, fasta, r, rstats, s3, sequences, tibble, tidy, tidyverse, vctrs
Language: C++
Homepage: https://BioGenies.github.io/tidysq/
Size: 10.9 MB
Stars: 41
Watchers: 4
Forks: 4
Open Issues: 52
Metadata Files:
- Readme: README.Rmd
- Changelog: NEWS.md

---
output: github_document
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(
  echo = TRUE,
  collapse = TRUE,
  comment = "#>",
  fig.path = "man/figures/README-",
  out.width = "100%"
)
```

# tidysq 

[![CRAN_Status_Badge](http://www.r-pkg.org/badges/version/tidysq)](https://cran.r-project.org/package=tidysq)
[![Github Actions Build Status](https://github.com/BioGenies/tidysq/workflows/R-CMD-check-bioc/badge.svg)](https://github.com/BioGenies/tidysq/actions)
[![Lifecycle: experimental](https://img.shields.io/badge/lifecycle-experimental-orange.svg)](https://lifecycle.r-lib.org/articles/stages.html#experimental)

## Overview

`tidysq` provides tools for analyzing and manipulating biological sequences, including amino acid and nucleic acid (e.g. RNA, DNA) sequences. Its two major features are:

- efficient compression of sequence data, which makes it possible to fit larger datasets in **R**,
- compatibility with most of the `tidyverse`, especially the `dplyr` and `vctrs` packages, making analyses *tidier*.
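
To give an intuition for why compression helps, here is a minimal base R sketch of fixed-width bit packing: with only four possible letters, each DNA letter fits in 2 bits, so four letters fit in one byte. The helpers `pack_dna()` and `unpack_dna()` are hypothetical and written for illustration only; they are not part of `tidysq`, whose actual internal encoding may differ.

```r
# Hypothetical helpers for illustration only; not part of tidysq.

# Pack a DNA string into raw bytes, 2 bits per letter (4 letters per byte).
pack_dna <- function(s) {
  letters_dna <- c("A", "C", "G", "T")
  chars <- strsplit(s, "")[[1]]
  codes <- match(chars, letters_dna) - 1L        # A = 0, C = 1, G = 2, T = 3
  stopifnot(!anyNA(codes))                       # only A, C, G, T are supported
  # Pad with zeros so that the number of codes is a multiple of 4
  padded <- c(codes, integer((-length(codes)) %% 4))
  m <- matrix(padded, nrow = 4)                  # one column per output byte
  bytes <- as.raw(colSums(m * c(1L, 4L, 16L, 64L)))
  list(bytes = bytes, n = length(codes))
}

# Restore the original string from the packed representation.
unpack_dna <- function(packed) {
  letters_dna <- c("A", "C", "G", "T")
  b <- as.integer(packed$bytes)
  # Extract the four 2-bit codes stored in each byte, lowest bits first
  codes <- rbind(b %% 4L, (b %/% 4L) %% 4L, (b %/% 16L) %% 4L, b %/% 64L)
  paste(letters_dna[as.vector(codes)[seq_len(packed$n)] + 1L], collapse = "")
}

set.seed(1)
dna <- paste(sample(c("A", "C", "G", "T"), 1000, replace = TRUE), collapse = "")
packed <- pack_dna(dna)

identical(unpack_dna(packed), dna)  # the round trip preserves the sequence
length(packed$bytes)                # 250 bytes hold 1000 letters
```

In this sketch the packed form needs a quarter of the bytes of the plain character representation; larger alphabets would need more bits per letter.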

## Getting started

[Try our quick start vignette](http://biogenies.info/tidysq/articles/quick-start.html) or [our exhaustive documentation](http://biogenies.info/tidysq/reference/index.html).

## Installation

The easiest way to install the `tidysq` package is to download its latest version from the CRAN repository:

```{r, eval=FALSE}
install.packages("tidysq")
```

Alternatively, you can install the development version directly from the GitHub repository:

```{r, eval=FALSE}
# install.packages("devtools")
devtools::install_github("BioGenies/tidysq")
```

## Example usage

```{r, message=FALSE}
library(tidysq)
```

```{r}
file <- system.file("examples", "example_aa.fasta", package = "tidysq")
sqibble <- read_fasta(file)
sqibble

sq_ami <- sqibble$sq
sq_ami

# Subsequences can be extracted with bite()
bite(sq_ami, 5:10)

# There are also more traditional functions
reverse(sq_ami)

# find_motifs() returns a whole tibble of useful information
find_motifs(sqibble, "^VHX")
```
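
Sequences do not have to come from a FASTA file. The following is a minimal sketch, assuming that the `sq()` constructor and the `"dna_bsc"` alphabet name behave as described in the package reference; it builds an `sq` object from a character vector and passes it to the same functions as above. The sequences are made up for illustration.

```r
library(tidysq)

# Made-up DNA sequences, for illustration only
sq_dna <- sq(c("ACGTACGT", "TTGACA"), alphabet = "dna_bsc")

get_sq_lengths(sq_dna)  # number of letters in each sequence
bite(sq_dna, 1:3)       # first three letters of each sequence
reverse(sq_dna)         # each sequence reversed
```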

An example of `dplyr` integration:

```{r, message=FALSE}
library(dplyr)

# tidysq integrates well with dplyr verbs
sqibble %>%
  filter(sq %has% "VFF") %>%
  mutate(length = get_sq_lengths(sq))
```

## Citation

For citation information, type:

```{r, eval=FALSE}
citation("tidysq")
```

or use:

Michal Burdukiewicz, Dominik Rafacz, Laura Bakala, Jadwiga Slowik, Weronika Puchala, Filip Pietluch, Katarzyna Sidorczuk, Stefan Roediger and Leon Eyrich Jessen (2021). tidysq: Tidy Processing and Analysis of Biological Sequences. R package version 1.1.3.
