{"id":29224565,"url":"https://github.com/sing-group/seda","last_synced_at":"2025-07-03T06:07:20.325Z","repository":{"id":47568426,"uuid":"126488193","full_name":"sing-group/seda","owner":"sing-group","description":"SEquence DAtaset builder","archived":false,"fork":false,"pushed_at":"2025-02-03T11:00:47.000Z","size":8377,"stargazers_count":5,"open_issues_count":4,"forks_count":2,"subscribers_count":6,"default_branch":"master","last_synced_at":"2025-02-03T12:19:48.270Z","etag":null,"topics":["bioinformatics","fasta","fasta-sequences","java","sequence-dataset-builder","sequences"],"latest_commit_sha":null,"homepage":"http://www.sing-group.org/seda/","language":"Java","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/sing-group.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2018-03-23T13:21:31.000Z","updated_at":"2025-02-03T11:00:52.000Z","dependencies_parsed_at":"2022-09-07T02:22:27.279Z","dependency_job_id":"f72daae7-e9eb-4bf2-847a-e00a1f5f25b6","html_url":"https://github.com/sing-group/seda","commit_stats":null,"previous_names":[],"tags_count":38,"template":false,"template_full_name":null,"purl":"pkg:github/sing-group/seda","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sing-group%2Fseda","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sing-group%2Fseda/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sing-group%2Fseda/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sing-group%2Fseda/manifests","owner_url":"h
ttps://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/sing-group","download_url":"https://codeload.github.com/sing-group/seda/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sing-group%2Fseda/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":263271501,"owners_count":23440396,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["bioinformatics","fasta","fasta-sequences","java","sequence-dataset-builder","sequences"],"created_at":"2025-07-03T06:07:19.536Z","updated_at":"2025-07-03T06:07:20.315Z","avatar_url":"https://github.com/sing-group.png","language":"Java","readme":"# SEDA [![license](https://img.shields.io/github/license/sing-group/seda)](https://github.com/sing-group/seda) [![release](https://img.shields.io/github/release/sing-group/seda.svg)](http://www.sing-group.org/seda/download.html)\nSEDA (*SEquence DAtaset builder*) is an open source application for processing FASTA files containing DNA and protein sequences. 
Please visit the [official web page](http://www.sing-group.org/seda) of the project for downloads, a [complete online manual](http://www.sing-group.org/seda/manual) and support.\n\n![SEDA Screenshot](seda-screenshot.png)\n\n## Main features\nAmong other functions, SEDA allows you to:\n- Filter sequences based on different criteria (including text patterns).\n- Translate nucleic acid sequences into amino acid sequences.\n- Edit sequence headers in different ways.\n- Remove duplicated sequences.\n- Remove isoforms.\n- Sort, merge, split, or reformat FASTA files.\n- Use [BLAST](https://blast.ncbi.nlm.nih.gov/Blast.cgi?CMD=Web\u0026PAGE_TYPE=BlastDocs\u0026DOC_TYPE=Download) to perform different types of queries.\n- Use [Clustal Omega](http://www.clustal.org/omega/) to perform multiple sequence alignments.\n- Perform gene annotation using different tools: Splign/Compart, ProSplign/ProCompart, Augustus (as implemented in SAPP), or the [Conserved Genome Annotation (CGA) Pipeline](https://github.com/pegi3s/cga).\n\n## Debugging\nTo see the commands that SEDA executes when running third-party software, run SEDA with `-Dseda.execution.showcommands=true`.\n\n## For programmers\nProgrammers can take advantage of the SEDA core to develop new operations to process FASTA files. In addition, SEDA has a plugin-based architecture, so new functions can be added to SEDA through plugins. Take a look at the [manual](https://www.sing-group.org/seda/manual/developers.html) for detailed information about this.\n\n## Citing\nPlease cite the following publication if you use SEDA:\n- H. López-Fernández; P. Duque; N. Vázquez; F. Fdez-Riverola; M. Reboiro-Jato; C. P. Vieira; J. Vieira (2022) **SEDA: a Desktop Tool Suite for FASTA Files Processing**. *IEEE/ACM Transactions on Computational Biology and Bioinformatics*. Volume 19(3), pp. 1850-1860. 
[![DOI](https://img.shields.io/badge/doi-10.1109%2FTCBB.2020.3040383-blue)](https://doi.org/10.1109/TCBB.2020.3040383)\n\n## Works using SEDA\n- H. López-Fernández; P. Duque; S. Henriques; N. Vázquez; F. Fdez-Riverola; C.P. Vieira; M. Reboiro-Jato; J. Vieira (2018) **A bioinformatics protocol for quickly creating large-scale phylogenetic trees**. *12th International Conference on Practical Applications of Computational Biology \u0026 Bioinformatics: PACBB 2018*. Toledo, Spain. 20 - June [![DOI](https://img.shields.io/badge/doi-10.1007%2F978--3--319--98702--6__11-green.svg)](https://doi.org/10.1007/978-3-319-98702-6_11)\n- H. López-Fernández; P. Duque; S. Henriques; N. Vázquez; F. Fdez-Riverola; C.P. Vieira; M. Reboiro-Jato; J. Vieira (2018) **Bioinformatics Protocols for Quickly Obtaining Large-Scale Data Sets for Phylogenetic Inferences**. *Interdisciplinary Sciences: Computational Life Sciences* [![DOI](https://img.shields.io/badge/doi-10.1007%2Fs12539--018--0312--5-green.svg)](http://doi.org/10.1007/s12539-018-0312-5)\n- H. López-Fernández; P. Duque; N. Vázquez; F. Fdez-Riverola; M. Reboiro-Jato; C.P. Vieira; J. Vieira (2019) **Inferring Positive Selection in Large Viral Datasets**. *13th International Conference on Practical Applications of Computational Biology \u0026 Bioinformatics: PACBB 2019*. Ávila, Spain. 26 - June [![DOI](https://img.shields.io/badge/doi-10.1007%2F978--3--030--23873--5__8-green)](https://doi.org/10.1007/978-3-030-23873-5_8)\n\n## Credits\n\nThe Command-Line Interface (CLI) available from SEDA v1.6.0 was developed by David Vila Fernández as a Master's project.\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsing-group%2Fseda","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsing-group%2Fseda","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsing-group%2Fseda/lists"}