{"id":18260395,"url":"https://github.com/carrascomj/ripkmer","last_synced_at":"2025-04-08T23:57:02.661Z","repository":{"id":105856492,"uuid":"242580066","full_name":"carrascomj/ripkmer","owner":"carrascomj","description":"Playing around with kmers very fast!","archived":false,"fork":false,"pushed_at":"2020-04-15T07:30:57.000Z","size":21,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"master","last_synced_at":"2025-02-14T18:38:10.570Z","etag":null,"topics":["bioinformatics","kmer","rust","rust-bio"],"latest_commit_sha":null,"homepage":null,"language":"Rust","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/carrascomj.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2020-02-23T19:46:18.000Z","updated_at":"2020-10-08T17:58:10.000Z","dependencies_parsed_at":null,"dependency_job_id":"f8b2f256-193a-454e-abb1-8e8b24eb6953","html_url":"https://github.com/carrascomj/ripkmer","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/carrascomj%2Fripkmer","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/carrascomj%2Fripkmer/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/carrascomj%2Fripkmer/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/carrascomj%2Fripkmer/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/carrascomj","download_url":"https://codeload.github.com
/carrascomj/ripkmer/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247947842,"owners_count":21023065,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["bioinformatics","kmer","rust","rust-bio"],"created_at":"2024-11-05T10:45:07.866Z","updated_at":"2025-04-08T23:57:02.645Z","avatar_url":"https://github.com/carrascomj.png","language":"Rust","funding_links":[],"categories":[],"sub_categories":[],"readme":"# ripkmer\n[![Build Status](https://travis-ci.com/carrascomj/ripkmer.svg?branch=master)](https://travis-ci.com/carrascomj/ripkmer)  \nThere are two ways of viewing this:\n\n- Some k-mer algorithms using [Rust-Bio](https://github.com/rust-bio/rust-bio/) [[1]](#koster2016).\n- My first project in Rust just to get confident with it.\n\n## Features\nThe first idea is to reproduce in Rust the [KmerFinder](https://bitbucket.org/genomicepidemiology/kmerfinder/src/master/) [[2]](#kmerfinder2014)\n(in Python, but also in [JavaScript](https://github.com/yosoyubik/kmerfinderjs-docker)).\n\n* [x] K-mer count on FASTQ.\n* [x] Filter by prefix.\n* [ ] Make it work for FASTA and BED files.\n* [x] Compare k-mer distribution of two inputs.\n* [ ] Move towards a KMA implementation.\n\n## CLI Example\nFor this example, the first two FASTQ files of \n[SRR396636](https://trace.ncbi.nlm.nih.gov/Traces/sra/?run=SRR396636), corresponding\nto reads from _Pseudomonas aeruginosa MPAO1/P1_, with 1909263 sequences of ~100 bp each, were downloaded.\n\nHaving **ripkmer** installed and in the `$PATH`:\n```bash\nripkmer SRR396636.sra_1.fastq 
SRR396636.sra_2.fastq\n```\nwhere `k` and the `prefix` are left at their defaults, which is equivalent\nto:\n```bash\nripkmer SRR396636.sra_1.fastq SRR396636.sra_2.fastq 16 ATGAC\n```\n\n\nThe output is in tabular format and can be redirected to standard output; the run should not take much more than 4 s.\n\n    (16-mers)\tUnique\tRedundant\tIntersection_unique\tIntersection\n    SRR396636.sra_1.fastq\t23196\t97871\t34.19%\t58.81%\n    SRR396636.sra_2.fastq\t30698\t89107\t25.83%\t64.59%\n\nwhere\n- **Unique** is the number of unique k-mers found in the file;\n- **Redundant** is the total number of k-mers found (with repetitions);\n- **Intersection_unique** is the percentage of the file's unique k-mers that are common to both files;\n- and **Intersection** is the percentage of the file's total k-mers that are common to both files.\n\n## References\n\n[\u003ca name=\"koster2016\"\u003e1\u003c/a\u003e] Köster, J. (2016). Rust-Bio: a fast and safe bioinformatics library. Bioinformatics, 32(3), 444-446.  \n[\u003ca name=\"kmerfinder2014\"\u003e2\u003c/a\u003e] Benchmarking of Methods for Genomic Taxonomy. Larsen MV, Cosentino S, Lukjancenko O, Saputra D, Rasmussen S, Hasman H, Sicheritz-Pontén T, Aarestrup FM, Ussery DW, Lund O. J Clin Microbiol. 2014 Feb 26","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcarrascomj%2Fripkmer","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fcarrascomj%2Fripkmer","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcarrascomj%2Fripkmer/lists"}