{"id":33926344,"url":"https://github.com/rustcodepro/sequenceprofiler","last_synced_at":"2025-12-12T10:10:55.578Z","repository":{"id":276206028,"uuid":"928517908","full_name":"rustcodepro/sequenceprofiler","owner":"rustcodepro","description":"kmer based sequence profiler for reads and genomes","archived":false,"fork":false,"pushed_at":"2025-09-29T11:56:26.000Z","size":6892,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2025-11-29T20:17:22.840Z","etag":null,"topics":["bioinformatics","biological-data-analysis","biological-sequences","genome-analysis","graph-algorithms","graphs","kmers"],"latest_commit_sha":null,"homepage":"","language":"Rust","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/rustcodepro.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2025-02-06T19:03:12.000Z","updated_at":"2025-09-29T11:56:29.000Z","dependencies_parsed_at":"2025-03-11T23:15:42.089Z","dependency_job_id":"2df7e36e-65b7-458c-8f4a-3d360d1da811","html_url":"https://github.com/rustcodepro/sequenceprofiler","commit_stats":null,"previous_names":["sciencegenome/graph-clusters","sciencegenome/graph-kmer","ibchgenomic/sequenceprofiler","sciencegenome/sequenceprofiler","genomicssport/sequenceprofiler","omicscode/sequenceprofiler","rustcodepro/sequenceprofiler"],"tags_count":1,"template":false,"template_full_name":null,"purl":"pkg:github/rustcodepro/sequenceprofiler","repository_url":"https:
//repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rustcodepro%2Fsequenceprofiler","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rustcodepro%2Fsequenceprofiler/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rustcodepro%2Fsequenceprofiler/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rustcodepro%2Fsequenceprofiler/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/rustcodepro","download_url":"https://codeload.github.com/rustcodepro/sequenceprofiler/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rustcodepro%2Fsequenceprofiler/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":27680590,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-12-12T02:00:06.775Z","response_time":129,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["bioinformatics","biological-data-analysis","biological-sequences","genome-analysis","graph-algorithms","graphs","kmers"],"created_at":"2025-12-12T10:10:51.881Z","updated_at":"2025-12-12T10:10:55.572Z","avatar_url":"https://github.com/rustcodepro.png","language":"Rust","readme":"# sequenceprofiler\n\n - This crate has the following features: fasta file should be a linear fasta and not a multi line fasta just like long-read.\n - 
Sequence, which computes sequence similarity based on the shared unique kmers and also allows filtering of the sequences so that you can build a native index graph faster.\n - SequenceSeq, which compares each sequence with the next sequence in the iteration (1-1 comparison).\n - longread: finds the origin of the kmers (see Back to sequences: Find the origin of 𝑘-mers, DOI: 10.21105/joss.07066). Outputs a table for direct ingestion into any graph. Also outputs a SAM-type file with the distinct count of the kmers, which can be used for the jellyfish count. Supports both genome and long-read FASTA files.\n - Jellyfish: a Rust implementation of jellyfish for kmer counts. Outputs both the unique counts and all counts. It produces allkmers, uniquekmers, and countkmers.\n\n\n ```\n cargo build\n\n ```\n ```\n ___    ___    __ _   _   _    ___   _ __     ___    ___   _ __    _ __    ___    / _| (_) | |   ___   _ __\n/ __|  / _ \\  / _` | | | | |  / _ \\ | '_ \\   / __|  / _ \\ | '_ \\  | '__|  / _ \\  | |_  | | | |  / _ \\ | '__|\n\\__ \\ |  __/ | (_| | | |_| | |  __/ | | | | | (__  |  __/ | |_) | | |    | (_) | |  _| | | | | |  __/ | |\n|___/  \\___|  \\__, |  \\__,_|  \\___| |_| |_|  \\___|  \\___| | .__/  |_|     \\___/  |_|   |_| |_|  \\___| |_|\n                 |_|                                      |_|\n\nsequenceprofiler\n   ************************************************\n   Author Gaurav Sablok,\n   Email: codeprog@icloud.com\n   ************************************************\n\nUsage: sequenceprofiler \u003cCOMMAND\u003e\n\nCommands:\n sequence      identity kmer similarity index\n filter        identity kmer filter\n sequence-seq  compare seq to other seq 1-1 iteration\n jellyfish     jellyfish counter for the long reads\n origin-kmer   finding the origin of kmers\n help          Print this message or the help of the given subcommand(s)\n\nOptions:\n -h, --help     Print help\n -V, --version  Print version\n ```\n - To run the compiled binary:\n\n ```\nsequenceprofiler 
sequence ./samplefile/sequence-sample-files/sample.fasta 4 4\nsequenceprofiler filter ./samplefile/sequence-sample-files/sample.fasta 4 10 4\nsequenceprofiler origin-kmer ./samplefile/longread-sample-files/fastafile.fasta 4 4\nsequenceprofiler jellyfish ./samplefile/jellyfish-sample-files/test.fastq 4 4\n```\n\n- To install the Windows version:\n\n```\nrustup component add llvm-tools\nrustup target add x86_64-pc-windows-msvc\ngit clone https://github.com/rustcodepro/sequenceprofiler.git\ncd sequenceprofiler\ncargo xwin build --target x86_64-pc-windows-msvc\n```\n\nGaurav Sablok \\\ncodeprog@icloud.com\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frustcodepro%2Fsequenceprofiler","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Frustcodepro%2Fsequenceprofiler","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frustcodepro%2Fsequenceprofiler/lists"}