https://github.com/sing-group/seda
SEquence DAtaset builder
- Host: GitHub
- URL: https://github.com/sing-group/seda
- Owner: sing-group
- License: gpl-3.0
- Created: 2018-03-23T13:21:31.000Z (about 8 years ago)
- Default Branch: master
- Last Pushed: 2025-02-03T11:00:47.000Z (about 1 year ago)
- Last Synced: 2025-02-03T12:19:48.270Z (about 1 year ago)
- Topics: bioinformatics, fasta, fasta-sequences, java, sequence-dataset-builder, sequences
- Language: Java
- Homepage: http://www.sing-group.org/seda/
- Size: 7.99 MB
- Stars: 5
- Watchers: 6
- Forks: 2
- Open Issues: 4
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# SEDA
SEDA (*SEquence DAtaset builder*) is an open-source application for processing FASTA files containing DNA and protein sequences. Please visit the [official web page](http://www.sing-group.org/seda) of the project for downloads, a [complete online manual](http://www.sing-group.org/seda/manual), and support.

## Main features
Among other functions, SEDA allows you to:
- Filter sequences based on different criteria (including text patterns).
- Translate nucleic acid sequences into amino acid sequences.
- Edit sequence headers in different ways.
- Remove duplicated sequences.
- Remove isoforms.
- Sort, merge, split, or reformat FASTA files.
- Use [BLAST](https://blast.ncbi.nlm.nih.gov/Blast.cgi?CMD=Web&PAGE_TYPE=BlastDocs&DOC_TYPE=Download) to perform different types of queries.
- Use [Clustal Omega](http://www.clustal.org/omega/) to perform multiple sequence alignments.
- Perform gene annotation using different tools: Splign/Compart, ProSplign/ProCompart, Augustus (as implemented in SAPP), or the [Conserved Genome Annotation (CGA) Pipeline](https://github.com/pegi3s/cga).
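
To make one of the operations above concrete, here is a minimal plain-Java sketch of what "remove duplicated sequences" can mean for a FASTA file. It is not SEDA's implementation and does not use the SEDA API: the class and method names are hypothetical, and it assumes that two records are duplicates when their sequences match exactly (case-sensitive) after wrapped lines are joined, keeping the first header seen.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative only: hypothetical class, not part of SEDA. */
final class FastaDedupSketch {

    /**
     * Returns FASTA text in which each distinct sequence appears once,
     * under the first header it was seen with. Sequences are written
     * back on a single line.
     */
    static String removeDuplicates(String fasta) {
        // Insertion-ordered map: sequence -> first header that carried it.
        Map<String, String> firstHeaderBySequence = new LinkedHashMap<>();
        String header = null;
        StringBuilder sequence = new StringBuilder();

        for (String line : fasta.split("\\R")) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue;
            }
            if (trimmed.startsWith(">")) {
                // A new record starts: store the previous one, if any.
                if (header != null) {
                    firstHeaderBySequence.putIfAbsent(sequence.toString(), header);
                }
                header = trimmed;
                sequence.setLength(0);
            } else {
                // Sequence lines may be wrapped, so join them.
                sequence.append(trimmed);
            }
        }
        if (header != null) {
            firstHeaderBySequence.putIfAbsent(sequence.toString(), header);
        }

        StringBuilder out = new StringBuilder();
        for (Map.Entry<String, String> entry : firstHeaderBySequence.entrySet()) {
            out.append(entry.getValue()).append('\n')
               .append(entry.getKey()).append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Records "a" and "b" hold the same sequence; "b" is wrapped over two lines.
        String input = ">a\nACGT\n>b\nAC\nGT\n>c\nTTTT\n";
        System.out.print(removeDuplicates(input));
    }
}
```

With the sample input in `main`, record `b` is dropped because its joined sequence equals that of `a`. SEDA itself offers more options for this operation; see the manual for what it actually supports.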
## Debugging
If you need to see the commands that SEDA executes to run third-party software, run SEDA with `-Dseda.execution.showcommands=true`.
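
The `-D<name>=<value>` syntax is the standard way to set a Java system property when launching the JVM. The sketch below shows how such a property can be read from Java code. Only the property name comes from this README; the class, the helper method, and the way SEDA reads the flag internally are assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.List;

/** Illustrative only: hypothetical class, not part of SEDA. */
final class ShowCommandsFlagSketch {

    // Property name taken from the README above.
    static final String PROPERTY = "seda.execution.showcommands";

    /** True only when the JVM was started with -Dseda.execution.showcommands=true. */
    static boolean showCommands() {
        // Boolean.getBoolean reads the system property with the given name.
        return Boolean.getBoolean(PROPERTY);
    }

    /** Hypothetical helper: describes a command line if the flag is enabled. */
    static String describe(List<String> command) {
        return showCommands() ? "Executing: " + String.join(" ", command) : "";
    }

    public static void main(String[] args) {
        // The command is a placeholder for any third-party invocation.
        System.out.println(describe(Arrays.asList("blastn", "-version")));
    }
}
```

Running this class with `-Dseda.execution.showcommands=true` on the `java` command line prints the command; without the flag it prints an empty line.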
## For programmers
Programmers can use the SEDA core to develop new operations for processing FASTA files. In addition, SEDA has a plugin-based architecture, so new functions can be added through plugins. See the [manual](https://www.sing-group.org/seda/manual/developers.html) for detailed information.
## Citing
Please cite the following publication if you use SEDA:
- H. López-Fernández; P. Duque; N. Vázquez; F. Fdez-Riverola; M. Reboiro-Jato; C. P. Vieira; J. Vieira (2022) **SEDA: a Desktop Tool Suite for FASTA Files Processing**. *IEEE/ACM Transactions on Computational Biology and Bioinformatics*. Volume 19(3), pp. 1850-1860. [doi:10.1109/TCBB.2020.3040383](https://doi.org/10.1109/TCBB.2020.3040383)
## Works using SEDA
- H. López-Fernández; P. Duque; S. Henriques; N. Vázquez; F. Fdez-Riverola; C.P. Vieira; M. Reboiro-Jato; J. Vieira (2018) **A bioinformatics protocol for quickly creating large-scale phylogenetic trees**. *12th International Conference on Practical Applications of Computational Biology & Bioinformatics: PACBB 2018*. Toledo, Spain. 20 - June [doi:10.1007/978-3-319-98702-6_11](https://doi.org/10.1007/978-3-319-98702-6_11)
- H. López-Fernández; P. Duque; S. Henriques; N. Vázquez; F. Fdez-Riverola; C.P. Vieira; M. Reboiro-Jato; J. Vieira (2018) **Bioinformatics Protocols for Quickly Obtaining Large-Scale Data Sets for Phylogenetic Inferences**. *Interdisciplinary Sciences: Computational Life Sciences* [doi:10.1007/s12539-018-0312-5](http://doi.org/10.1007/s12539-018-0312-5)
- H. López-Fernández; P. Duque; N. Vázquez; F. Fdez-Riverola; M. Reboiro-Jato; C.P. Vieira; J. Vieira (2019) **Inferring Positive Selection in Large Viral Datasets**. *13th International Conference on Practical Applications of Computational Biology & Bioinformatics: PACBB 2019*. Ávila, Spain. 26 - June [doi:10.1007/978-3-030-23873-5_8](https://doi.org/10.1007/978-3-030-23873-5_8)
## Credits
The Command-Line Interface (CLI), available from SEDA v1.6.0, was developed by David Vila Fernández as a Master's project.