{"id":25977596,"url":"https://github.com/nylander/translate_fasta_headers","last_synced_at":"2025-03-05T04:38:40.098Z","repository":{"id":7445121,"uuid":"8786689","full_name":"nylander/translate_fasta_headers","owner":"nylander","description":"Translate long fasta headers to short - and back!","archived":false,"fork":false,"pushed_at":"2024-04-17T13:49:02.000Z","size":340,"stargazers_count":4,"open_issues_count":0,"forks_count":3,"subscribers_count":3,"default_branch":"main","last_synced_at":"2024-04-17T20:08:33.682Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Perl","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/nylander.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2013-03-14T22:18:50.000Z","updated_at":"2023-01-30T13:31:51.000Z","dependencies_parsed_at":"2023-01-11T18:45:43.948Z","dependency_job_id":null,"html_url":"https://github.com/nylander/translate_fasta_headers","commit_stats":null,"previous_names":[],"tags_count":1,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nylander%2Ftranslate_fasta_headers","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nylander%2Ftranslate_fasta_headers/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nylander%2Ftranslate_fasta_headers/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nylander%2Ftranslate_fasta_headers/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/nylander","download_url":"https://codeload.github.com/nylander/translate_fasta_headers/tar.gz/refs/
heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":241966989,"owners_count":20050324,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-03-05T04:38:39.489Z","updated_at":"2025-03-05T04:38:40.092Z","avatar_url":"https://github.com/nylander.png","language":"Perl","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Translate fasta headers\n\nTranslate long fasta headers to short - and back!\n\nYour alignment program X doesn't allow strings longer than n characters, but\nall your info is in the fasta headers of your file. What to do?\n\nUse `translate_fasta_headers.pl` on your fasta file to create short labels and\na translation table. Run your program X, and then back-translate your fasta\nheaders by running `translate_fasta_headers.pl` again!\n\nAnd if you created a tree with the short (or long) labels, try to\nback-translate using `replace_taxon_labels_in_newick.pl`!\n\nIf you only wish to transform your long fasta headers to short, without keeping\nthe information about how they were translated, the quick solution might be to\nuse `awk`:\n\n    $ awk '/\u003e/{$0=\"\u003eSeq_\"++n}1' long.fas\n\nBut, if you want to be able to back-translate, read on!\n\n## Description\n\nReplace fasta headers with headers taken from a tab-delimited file. 
If no tab\nfile is given, the (potentially long) fasta headers are replaced by short\nlabels \"Seq\\_1\", \"Seq\\_2\", etc, and the short and original headers are printed\nto a translation file.\n\nIf you wish, you may choose your own prefix (instead of `Seq_`). This could be\nhandy if, for example, you wish to concatenate files.\n\nThe script for translating labels in Newick trees is somewhat limited in\ncapacity due to the restrictions and/or peculiarities of the Newick tree\nformat. Use with caution.\n\n## Usage\n\n    $ translate_fasta_headers.pl [options] \u003cfasta file\u003e\n    $ replace_taxon_labels_in_newick.pl [options] \u003cnewick file\u003e\n\n## Examples\n\nFrom long to short labels:\n\n    $ translate_fasta_headers.pl --out=short.fas long.fas\n\nAnd back, using a translation table:\n\n    $ translate_fasta_headers.pl --tabfile=short.fas.translation.tab short.fas\n\nSlightly shorter version (see note about the `--out` option below):\n\n    $ translate_fasta_headers.pl long.fas \u003e short.fas\n    $ translate_fasta_headers.pl -t long.fas.translation.tab short.fas\n\nUse your own prefix:\n\n    $ translate_fasta_headers.pl --prefix='Own_' long.fas\n\nTranslate short seq labels in Newick tree to long:\n\n    $ replace_taxon_labels_in_newick.pl -t long.fas.translation.tab short.fas.phy\n\nPrint seq labels in Newick tree:\n\n    $ replace_taxon_labels_in_newick.pl -l short.fas.phy\n\n## Options\n\n### Script `translate_fasta_headers.pl`\n\n- `-t, --tabfile=\u003cfilename\u003e` --  Specify tab-separated translation file with\n  unique \"short\" labels to the left, and \"long\" names to the right. Translation\n  will be from left to right.\n- `-o, --out=\u003cfilename\u003e` --  Specify output file for the fasta sequences.\n  **Note**: If `--out=\u003cfilename\u003e` is specified, the translation file will be\n  named `\u003cfilename\u003e.translation.tab`. This simplifies back translation.  
If, on\n  the other hand, `--out` is not used, the translation file will be named after\n  the infile!\n- `-i, --in=\u003cfilename\u003e` --  Specify name of fasta file. Can be skipped as\n  script reads files from STDIN.\n- `-n, --notab` --  Do not create a translation file.\n- `-p, --prefix=\u003cstring\u003e` --  Use your own prefix (default is `Seq_`). A\n  number will be added to the labels (e.g. `Own_1`, `Own_2`, ...)\n- `-v, --version` --  Print version number and quit.\n- `-h, --help` --  Show this help text and quit.\n\n### Script `replace_taxon_labels_in_newick.pl`\n\n- `-t, --tabfile=\u003ctranslation.tab\u003e` --  File with table describing what will be\n  translated with what.\n- `-l,-p, --labels` -- Print taxon labels in tree. Option does not require a\n  translation table.\n- `--no-quotemeta` -- Turn off escaping of special symbols in the replacements.\n- `-o, --out=\u003cout.file\u003e` --  Print to outfile `out.file`, else to STDOUT.\n- `-v, --version` --  Print version number and quit.\n- `-h, --help` --  Help text.\n\n## Author\n\nJohan.Nylander\n\n## Files\n\n- [`translate_fasta_headers.pl`](translate_fasta_headers.pl) -- Perl script\n- [`replace_taxon_labels_in_newick.pl`](replace_taxon_labels_in_newick.pl) --  Perl script\n- [`data/long.fas`](data/long.fas) --  Example file with long fasta headers\n- [`data/short.fas.translation.tab`](data/short.fas.translation.tab) --  Example translation table\n- [`data/short.fas`](data/short.fas) --  Example output with short fasta headers\n- [`data/short.fas.phy`](data/short.fas.phy) --  Example Newick tree with short labels\n- [`README.md`](README.md) --  Documentation, markdown format\n- [`README.pdf`](README.pdf) --  Documentation, PDF format\n\n## License and Copyright\n\nCopyright (c) 2013-2024 Johan 
Nylander\n\n[LICENSE](LICENSE)\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fnylander%2Ftranslate_fasta_headers","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fnylander%2Ftranslate_fasta_headers","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fnylander%2Ftranslate_fasta_headers/lists"}