https://github.com/whitead/peplib

Peptide library methods

Peptide Library Analysis Methods
=========================

This package provides a variety of methods for dealing with analysis
of peptide library data, including clustering, motif finding, and QSAR
model fitting. It is for the R programming language and is described
in a [recent paper](http://pubs.acs.org/doi/full/10.1021/ci300484q).

Installing From Source (preferred)
----------

To install the latest version from the source here, use:

    wget https://github.com/whitead/peplib/archive/master.zip
    unzip master.zip && rm master.zip
    R CMD build peplib-master
    sudo R CMD INSTALL peplib_*.tar.gz

Installing From CRAN
------------------
To install from CRAN, run the following command in an R session:

    install.packages("peplib")

Documentation
--------------------
See the `tutorial.pdf` file. It contains detailed tutorials and covers
much more than the brief overview below.

Loading Sequences
--------------------
The easiest way to load sequences is to use the `read.sequences` function:

    seq <- read.sequences("seqfile.txt")

where `seqfile.txt` looks like:

    FDDSDF
    FDSA
    GGHIT

For most of the methods, it's recommended to have the same length for all sequences.
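
Since equal-length sequences are recommended, it can help to check a file before loading it. The following is a minimal sketch in base R only; it does not use peplib, and the temporary file path and variable names are assumptions made for illustration.

```r
# Write a small example file; the path and contents are illustrative only.
path <- file.path(tempdir(), "seqfile.txt")
writeLines(c("FDDSDF", "FDSA", "GGHIT"), path)

# Read the raw lines and compare their lengths.
raw <- readLines(path)
lens <- nchar(raw)
same_length <- length(unique(lens)) == 1

if (!same_length) {
  message("Sequence lengths differ: ",
          paste(sort(unique(lens)), collapse = ", "))
}
```

For the example file above the lengths differ, so the message is printed. Whether to pad, trim, or drop the odd sequences depends on the analysis.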

Calculating Peptide Descriptors
----------------------------

To calculate descriptors on your sequences, use:

    seq.desc <- simpleDescriptors(seq)

That will calculate about 10 descriptors. To calculate a few hundred, use:

    seq.desc <- descriptors(seq)

These descriptors are all relative to glycine. So, for example,
molecular weight is not the actual molecular weight but the difference
between a given amino acid and glycine.
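
To make the glycine-relative convention concrete, here is a small sketch in base R. The molecular weights are approximate textbook values for the free amino acids, not the table peplib uses internally, and the variable names are hypothetical.

```r
# Approximate molecular weights in g/mol (illustrative values, not peplib's data).
mw <- c(G = 75.07, A = 89.09, F = 165.19)

# Express each weight relative to glycine, so glycine itself becomes 0.
mw_rel <- mw - mw[["G"]]
print(mw_rel)
```

Under this convention alanine comes out at roughly 14 and phenylalanine at roughly 90, rather than their absolute weights.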

Plotting Sequences
--------------------

One nice feature of peplib is the ability to plot sequences by
computing the substitution distance between each pair of sequences
and then projecting the resulting distance matrix onto 2 dimensions.
This may be done like so:

    plot(seq)

This method also clusters your sequences, assuming by default that there are 3 clusters. To change the number of clusters, pass it as the second argument:

    plot(seq, 1)
    plot(seq, 5)
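
The general idea behind this kind of plot (pairwise distances, a 2-dimensional projection, then clustering) can be sketched in base R. This is not peplib's implementation. As stated assumptions, it substitutes a plain Hamming distance for the substitution distance, classical multidimensional scaling (`cmdscale`) for the projection, and hierarchical clustering (`hclust` with `cutree`) for the clustering step. The example sequences are made up.

```r
# Made-up, equal-length example sequences.
seqs <- c("FDDSDF", "FDDSAF", "FDESDF", "GGHITK", "GGHLTK", "AGHITK")
chars <- strsplit(seqs, "")
n <- length(seqs)

# Hamming distance (number of differing positions) as a simple stand-in
# for a substitution-matrix-based distance.
d <- matrix(0, n, n, dimnames = list(seqs, seqs))
for (i in seq_len(n)) {
  for (j in seq_len(n)) {
    d[i, j] <- sum(chars[[i]] != chars[[j]])
  }
}

# Project the distance matrix to 2 dimensions with classical MDS.
coords <- cmdscale(as.dist(d), k = 2)

# Cluster into a chosen number of groups (2 here) and colour points by cluster.
clusters <- cutree(hclust(as.dist(d)), k = 2)
plot(coords, col = clusters, pch = 19,
     xlab = "Dimension 1", ylab = "Dimension 2")
text(coords, labels = seqs, pos = 3)
```

Changing `k` in `cutree` plays the same role as the second argument to `plot` above: it sets how many clusters the points are divided into.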