# SEDA [![license](https://img.shields.io/github/license/sing-group/seda)](https://github.com/sing-group/seda) [![release](https://img.shields.io/github/release/sing-group/seda.svg)](http://www.sing-group.org/seda/download.html)
SEDA (*SEquence DAtaset builder*) is an open source application for processing FASTA files containing DNA and protein sequences. Please visit the [official web page](http://www.sing-group.org/seda) of the project for downloads, a [complete online manual](http://www.sing-group.org/seda/manual), and support.

![SEDA Screenshot](seda-screenshot.png)

## Main features
Among other functions, SEDA allows you to:
- Filter sequences based on different criteria (including text patterns).
- Translate nucleic acid sequences into amino acid sequences.
- Edit sequence headers in different ways.
- Remove duplicated sequences.
- Remove isoforms.
- Sort, merge, split, or reformat FASTA files.
- Use [BLAST](https://blast.ncbi.nlm.nih.gov/Blast.cgi?CMD=Web&PAGE_TYPE=BlastDocs&DOC_TYPE=Download) to perform different types of queries.
- Use [Clustal Omega](http://www.clustal.org/omega/) to perform multiple sequence alignments.
- Perform gene annotation using different tools: Splign/Compart, ProSplign/ProCompart, Augustus (as implemented in SAPP), or the [Conserved Genome Annotation (CGA) Pipeline](https://github.com/pegi3s/cga).
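
To give a rough idea of what operations such as header filtering and duplicate removal involve, here is a minimal sketch in plain Java. It does not use the SEDA API: the class `FastaSketch` and its methods are hypothetical names made up for this illustration, and the sketch assumes that headers are unique and that two sequences count as duplicates when they are equal ignoring case.

```java
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.regex.Pattern;

/** Illustrative only: plain Java, not the SEDA API. */
class FastaSketch {

    /**
     * Parses FASTA text into an ordered map of header (without '>') to sequence.
     * Sequence lines belonging to one record are concatenated.
     * Assumption: headers are unique; a repeated header overwrites the earlier record.
     */
    static Map<String, String> parse(String fasta) {
        Map<String, String> records = new LinkedHashMap<>();
        String header = null;
        StringBuilder sequence = new StringBuilder();
        for (String rawLine : fasta.split("\\R")) {
            String line = rawLine.trim();
            if (line.isEmpty()) {
                continue;
            }
            if (line.startsWith(">")) {
                if (header != null) {
                    records.put(header, sequence.toString());
                }
                header = line.substring(1);
                sequence.setLength(0);
            } else if (header != null) {
                sequence.append(line);
            }
        }
        if (header != null) {
            records.put(header, sequence.toString());
        }
        return records;
    }

    /** Keeps only the records whose header contains a match for the given regex. */
    static Map<String, String> filterByHeader(Map<String, String> records, String regex) {
        Pattern pattern = Pattern.compile(regex);
        Map<String, String> kept = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : records.entrySet()) {
            if (pattern.matcher(entry.getKey()).find()) {
                kept.put(entry.getKey(), entry.getValue());
            }
        }
        return kept;
    }

    /** Keeps the first record for each distinct sequence, comparing case-insensitively. */
    static Map<String, String> removeDuplicates(Map<String, String> records) {
        Set<String> seen = new HashSet<>();
        Map<String, String> kept = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : records.entrySet()) {
            // Set.add returns false when the normalized sequence was already seen.
            if (seen.add(entry.getValue().toUpperCase(Locale.ROOT))) {
                kept.put(entry.getKey(), entry.getValue());
            }
        }
        return kept;
    }
}
```

For example, `FastaSketch.removeDuplicates(FastaSketch.parse(text))` returns the records of `text` with repeated sequences dropped, keeping the first occurrence of each. SEDA's own operations offer more options than this sketch; see the manual for details.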

## Debugging
If you need to see the commands that SEDA executes to run third-party software, run SEDA with `-Dseda.execution.showcommands=true`.
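
The `-D` prefix is the standard way of passing a system property to the Java virtual machine. As a hedged illustration of how a Java application can react to such a flag, the sketch below reads the property with `Boolean.getBoolean` and only formats a command for logging when it is set. The class `ShowCommandsSketch` and its methods are hypothetical; this is an assumption about the general mechanism, not SEDA's actual implementation.

```java
import java.util.Arrays;
import java.util.List;

/** Illustrative only: shows how a -D system property can be read; not SEDA code. */
class ShowCommandsSketch {

    /** Property name taken from the README; how SEDA reads it internally is an assumption. */
    static final String SHOW_COMMANDS_PROPERTY = "seda.execution.showcommands";

    /** Returns true only when the property is set to "true" (case-insensitive). */
    static boolean showCommands() {
        return Boolean.getBoolean(SHOW_COMMANDS_PROPERTY);
    }

    /** Returns the command as a single line if logging is enabled, or an empty string otherwise. */
    static String describe(List<String> command) {
        return showCommands() ? String.join(" ", command) : "";
    }

    public static void main(String[] args) {
        // Hypothetical third-party command, used only as sample input.
        List<String> command = Arrays.asList("blastn", "-query", "input.fasta");
        String line = describe(command);
        if (!line.isEmpty()) {
            System.err.println(line);
        }
    }
}
```

Running this class with `java -Dseda.execution.showcommands=true ShowCommandsSketch` prints the sample command to standard error, while running it without the flag prints nothing.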

## For programmers
Programmers can use the SEDA core to develop new operations for processing FASTA files. In addition, SEDA has a plugin-based architecture, so new functions can be added through plugins. See the [developers section of the manual](https://www.sing-group.org/seda/manual/developers.html) for detailed information.

## Citing
Please cite the following publication if you use SEDA:
- H. López-Fernández; P. Duque; N. Vázquez; F. Fdez-Riverola; M. Reboiro-Jato; C. P. Vieira; J. Vieira (2022) **SEDA: a Desktop Tool Suite for FASTA Files Processing**. *IEEE/ACM Transactions on Computational Biology and Bioinformatics*. Volume 19(3), pp. 1850-1860. [![DOI](https://img.shields.io/badge/doi-10.1109%2FTCBB.2020.3040383-blue)](https://doi.org/10.1109/TCBB.2020.3040383)

## Works using SEDA
- H. López-Fernández; P. Duque; S. Henriques; N. Vázquez; F. Fdez-Riverola; C.P. Vieira; M. Reboiro-Jato; J. Vieira (2018) **A bioinformatics protocol for quickly creating large-scale phylogenetic trees**. *12th International Conference on Practical Applications of Computational Biology & Bioinformatics: PACBB 2018*. Toledo, Spain. 20 - June [![DOI](https://img.shields.io/badge/doi-10.1007%2F978--3--319--98702--6__11-green.svg)](https://doi.org/10.1007/978-3-319-98702-6_11)
- H. López-Fernández; P. Duque; S. Henriques; N. Vázquez; F. Fdez-Riverola; C.P. Vieira; M. Reboiro-Jato; J. Vieira (2018) **Bioinformatics Protocols for Quickly Obtaining Large-Scale Data Sets for Phylogenetic Inferences**. *Interdisciplinary Sciences: Computational Life Sciences* [![DOI](https://img.shields.io/badge/doi-10.1007%2Fs12539--018--0312--5-green.svg)](http://doi.org/10.1007/s12539-018-0312-5)
- H. López-Fernández; P. Duque; N. Vázquez; F. Fdez-Riverola; M. Reboiro-Jato; C.P. Vieira; J. Vieira (2019) **Inferring Positive Selection in Large Viral Datasets**. *13th International Conference on Practical Applications of Computational Biology & Bioinformatics: PACBB 2019*. Ávila, Spain. 26 - June [![DOI](https://img.shields.io/badge/doi-10.1007%2F978--3--030--23873--5__8-green)](https://doi.org/10.1007/978-3-030-23873-5_8)

## Credits

The Command-Line Interface (CLI), available from SEDA v1.6.0, was developed by David Vila Fernández as a Master's project.