{"id":31913452,"url":"https://github.com/alejandrogzi/bed2gff","last_synced_at":"2025-10-13T18:50:04.354Z","repository":{"id":197869858,"uuid":"699531686","full_name":"alejandrogzi/bed2gff","owner":"alejandrogzi","description":"cool BED-to-GFF3 converter that runs in parallel","archived":false,"fork":false,"pushed_at":"2025-04-17T08:39:11.000Z","size":113,"stargazers_count":9,"open_issues_count":2,"forks_count":2,"subscribers_count":1,"default_branch":"master","last_synced_at":"2025-09-17T13:28:12.522Z","etag":null,"topics":["bed","bioinformatics","gene-annotation","genome-annotation","gff3"],"latest_commit_sha":null,"homepage":"","language":"Rust","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/alejandrogzi.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2023-10-02T20:21:24.000Z","updated_at":"2025-04-17T08:39:17.000Z","dependencies_parsed_at":"2023-10-03T05:30:25.355Z","dependency_job_id":"98f4c925-e7a5-4a0c-927e-464a6f4f43a9","html_url":"https://github.com/alejandrogzi/bed2gff","commit_stats":{"total_commits":33,"total_committers":1,"mean_commits":33.0,"dds":0.0,"last_synced_commit":"164b52abf3078ada069714cecc8bdc4c3019e248"},"previous_names":["alejandrogzi/bed2gff3"],"tags_count":3,"template":false,"template_full_name":null,"purl":"pkg:github/alejandrogzi/bed2gff","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/alejandrogzi%2Fbed2gff","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositor
ies/alejandrogzi%2Fbed2gff/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/alejandrogzi%2Fbed2gff/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/alejandrogzi%2Fbed2gff/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/alejandrogzi","download_url":"https://codeload.github.com/alejandrogzi/bed2gff/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/alejandrogzi%2Fbed2gff/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":279016622,"owners_count":26085853,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-10-13T02:00:06.723Z","response_time":61,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["bed","bioinformatics","gene-annotation","genome-annotation","gff3"],"created_at":"2025-10-13T18:49:27.805Z","updated_at":"2025-10-13T18:50:04.339Z","avatar_url":"https://github.com/alejandrogzi.png","language":"Rust","funding_links":[],"categories":[],"sub_categories":[],"readme":"![Crates.io](https://img.shields.io/crates/v/bed2gff?color=green)\n![GitHub](https://img.shields.io/github/license/alejandrogzi/bed2gff?color=blue)\n![Crates.io Total Downloads](https://img.shields.io/crates/d/bed2gff)\n![Conda Platform](https://img.shields.io/conda/pn/bioconda/bed2gff)\n\n\n# **bed2gff**\n\nA Rust BED-to-GFF3 parallel 
translator.\n\n\ntranslates\n```\nchr7 56766360 56805692 ENST00000581852.25 1000 + 56766360 56805692 0,0,200 3 3,135,81, 0,496,39251,\n```\ninto\n```\nchr7 bed2gff gene 56399404 56805892 . + . ID=ENSG00000166960;gene_id=ENSG00000166960\n\nchr7 bed2gff transcript 56766361 56805692 . + . ID=ENST00000581852.25;Parent=ENSG00000166960;gene_id=ENSG00000166960;transcript_id=ENST00000581852.25\n\nchr7 bed2gff exon 56766361 56766363 . + . ID=exon:ENST00000581852.25.1;Parent=ENST00000581852.25;gene_id=ENSG00000166960;transcript_id=ENST00000581852.25,exon_number=1\n\nchr7 bed2gff CDS 56766361 56766363 . + 0 ID=CDS:ENST00000581852.25.1;Parent=ENST00000581852.25;gene_id=ENSG00000166960;transcript_id=ENST00000581852.25,exon_number=1\n\n...\n\nchr7 bed2gff start_codon 56766361 56766363 . + 0 ID=start_codon:ENST00000581852.25.1;Parent=ENST00000581852.25;gene_id=ENSG00000166960;transcript_id=ENST00000581852.25,exon_number=1\n\nchr7 bed2gff stop_codon 56805690 56805692 . + 0 ID=stop_codon:ENST00000581852.25.3;Parent=ENST00000581852.25;gene_id=ENSG00000166960;transcript_id=ENST00000581852.25,exon_number=3\n\n...\n```\n\nin a few seconds.\n\nConverts\n- *Homo sapiens* GRCh38 GENCODE 44 (252,835 transcripts) in 4.16 seconds.\n- *Mus musculus* GRCm39 GENCODE 44 (149,547 transcripts) in 2.15 seconds.\n- *Canis lupus familiaris* ROS_Cfam_1.0 Ensembl 110 (55,335 transcripts) in 1.30 seconds.\n- *Gallus gallus* bGalGal1 Ensembl 110 (72,689 transcripts) in 1.51 seconds.\n\n\u003e What's new in v.0.1.5\n\u003e\n\u003e - Adds `--no-gene` flag to only perform conversion without isoforms!\n\u003e - Modifies `-i` to be required unless `--no-gene` mode is present.\n\u003e - Refactors BedRecord.\n\n## Usage\n``` text\nUsage: \n    a) bed2gff[EXE] --bed \u003cBED\u003e --isoforms \u003cISOFORMS\u003e --output \u003cOUTPUT\u003e\n    b) bed2gff[EXE] --bed \u003cBED\u003e --output \u003cOUTPUT\u003e --no-gene\n\nArguments:\n    -b, --bed \u003cBED\u003e: a .bed file\n    -i, --isoforms 
\u003cISOFORMS\u003e: a tab-delimited file\n    -o, --output \u003cOUTPUT\u003e: path to output file\n    -n, --no-gene \u003cFLAG\u003e: Flag to disable gene_id feature [default: false]\n\nOptions:\n    --help: print help\n    --version: print version\n    --threads/-t: number of threads (default: max cpus)\n    --gz: compress output .gtf\n```\n\n\u003e[!WARNING] \n\u003e\n\u003eAll the transcripts in the .bed file should appear in the isoforms file.\n#### crate: [https://crates.io/crates/bed2gff](https://crates.io/crates/bed2gff)\n\n\u003cdetails\u003e\n\u003csummary\u003eclick for detailed formats\u003c/summary\u003e\n\u003cp\u003e\nbed2gff just needs two files:\n\n1. a .bed file\n\n    tab-delimited files with 3 required and 9 optional fields:\n\n    ```\n    chrom   chromStart  chromEnd      name    ...\n      |         |           |           |\n    chr20   50222035    50222038    ENST00000595977    ...\n    ```\n\n    see [BED format](https://genome.ucsc.edu/FAQ/FAQformat.html#format1) for more information\n\n2. a tab-delimited .txt/.tsv/.csv/... file with genes/isoforms (all the transcripts in the .bed file should appear in the isoforms file):\n\n    ```\n    \u003e cat isoforms.txt\n\n    ENSG00000198888 ENST00000361390\n    ENSG00000198763 ENST00000361453\n    ENSG00000198804 ENST00000361624\n    ENSG00000188868 ENST00000595977\n    ```\n\n    you can build a custom file for your preferred species using [Ensembl BioMart](https://www.ensembl.org/biomart/martview). \n\n\u003c/p\u003e\n\u003c/details\u003e\n\n## Installation\nto install bed2gff on your system follow these steps:\n1. get rust: `curl https://sh.rustup.rs -sSf | sh` on unix, or go [here](https://www.rust-lang.org/tools/install) for other options\n2. run `cargo install bed2gff` (make sure `~/.cargo/bin` is in your `$PATH` before running it)\n3. use `bed2gff` with the required arguments\n4. enjoy!\n\n## Build\nto build bed2gff from this repo, do:\n\n1. get rust (as described above)\n2. 
run `git clone https://github.com/alejandrogzi/bed2gff.git \u0026\u0026 cd bed2gff`\n3. run `cargo run --release -- -b \u003cBED\u003e -i \u003cISOFORMS\u003e -o \u003cOUTPUT\u003e`\n\n## Container image\nto build the development container image:\n1. run `git clone https://github.com/alejandrogzi/bed2gff.git \u0026\u0026 cd bed2gff`\n2. initialize docker with `start docker` or `systemctl start docker`\n3. build the image `docker image build --tag bed2gff .`\n4. run `docker run --rm -v \"[dir_where_your_bed_is]:/dir\" bed2gff -b /dir/\u003cBED\u003e -i /dir/\u003cISOFORMS\u003e -o /dir/\u003cOUTPUT\u003e`\n\n## Conda\nto use bed2gff through Conda just:\n1. `conda install bed2gff -c bioconda` or `conda create -n bed2gff -c bioconda bed2gff`\n\n## Output\n\nbed2gff writes the output to the path you pass to `--output`, which can be the same directory as the .bed file:\n\n```\nbed2gff -b annotation.bed -i isoforms.txt -o output.gff3\n\n.\n├── ...\n├── isoforms.txt\n├── annotation.bed\n└── output.gff3\n```\nwhere `output.gff3` is the result.\n\n## FAQ\n### Why?\n\nConverting formats is a daily practice in bioinformatics. This is especially common when working with gene annotations, as tools differ in input/output layouts. GTF, GFF, and BED are the most widely used formats for storing gene-related annotations, yet the conversion needs are not well covered by available software. \n\nMany genomic tools accept only GTF/GFF3 files, forcing BED users to convert their files into other formats. While some of these issues have already been addressed for GTF files (e.g. [bed2gtf](https://github.com/alejandrogzi/bed2gtf)), the GFF3 layout lacks stable conversion tools (1, 2).\n\nbed2gff is presented as a straightforward option to convert BED files into ready-to-use GFF3 files, closing that gap.  
\n\n\n### How?\nbed2gff builds on the code base of [bed2gtf](https://github.com/alejandrogzi/bed2gtf), which is essentially a reimplementation of UCSC's C binaries merged into one step (bedToGenePred + genePredToGtf). The tool evaluates the position of exons and other features (CDS, start/stop codons, UTRs), preserving reading frames and adjusting the coordinate indexing (BED starts are 0-based, GFF3 starts are 1-based). The main approach now is a parallel algorithm that significantly reduces computation times. \n\nFollowing the rationale of [bed2gtf](https://github.com/alejandrogzi/bed2gtf), bed2gff produces a ready-to-use GFF3 file by using an isoforms file, which works like the refTable in the C binaries to map each transcript to its respective gene. \n\n\n## References\n\n1. https://bioinformatics.stackexchange.com/questions/2242/how-to-convert-bed-to-gff3\n2. https://www.biostars.org/p/2/\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Falejandrogzi%2Fbed2gff","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Falejandrogzi%2Fbed2gff","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Falejandrogzi%2Fbed2gff/lists"}