{"id":18448420,"url":"https://github.com/nextomics/grandstr","last_synced_at":"2025-06-15T03:12:09.473Z","repository":{"id":140454442,"uuid":"372712567","full_name":"Nextomics/GrandSTR","owner":"Nextomics","description":null,"archived":false,"fork":false,"pushed_at":"2021-06-22T05:40:57.000Z","size":1931,"stargazers_count":4,"open_issues_count":0,"forks_count":0,"subscribers_count":4,"default_branch":"master","last_synced_at":"2025-04-16T01:51:01.640Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Shell","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Nextomics.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2021-06-01T05:35:03.000Z","updated_at":"2024-05-23T13:23:37.000Z","dependencies_parsed_at":null,"dependency_job_id":"c8a80f26-2db1-40f0-9918-ac41c454345f","html_url":"https://github.com/Nextomics/GrandSTR","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/Nextomics/GrandSTR","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Nextomics%2FGrandSTR","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Nextomics%2FGrandSTR/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Nextomics%2FGrandSTR/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Nextomics%2FGrandSTR/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Nextomics","download_url":"https://codeload.github.com/Nextomics/GrandSTR/tar.gz/refs/heads/mast
er","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Nextomics%2FGrandSTR/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":259914929,"owners_count":22931331,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-06T07:15:50.493Z","updated_at":"2025-06-15T03:12:09.436Z","avatar_url":"https://github.com/Nextomics.png","language":"Shell","readme":"# GrandSTR\n\nEstimation of repeat counts of short tandem repeats (STRs) from long-read sequencing data. Get genotypes for known STRs.\n\n\n## Dependencies\n\nPython packages:\n- python: 3.6 or higher\n- pysam: 0.16.0 or higher\n- sklearn: 0.24.1 or higher\n- hmmlearn: 0.2.5 or higher\n- edlib: 1.2.6 or higher; the edlib.so file is needed for the Python bindings.\n\nDependencies for install:\n- cython: 0.29.21 or higher\n\n\n## Install\n\nTo build align.so, GrandSTR_lib.so, utils_lib.so, and deal_str.so, run:\n\n```bash\npython setup.py build_ext -i\n```\n\n\n## Usage \u0026 Examples\n\n### Input files\n- The input bam file is an alignment file of read sequences aligned to reference sequences, typically generated by minimap2 (version 2.17) with parameters \"-ax asm10 --MD -Y -L --secondary=no\" for HiFi reads, or \"-ax map-ont --MD -Y -L --secondary=no\" for ONT reads. 
For example:\n```bash\nminimap2 -t 16 -ax asm10 --MD -Y -L --secondary=no hg19.fasta hifi.fastq 2\u003e align.log | samtools view -Sb - | samtools sort - -o hifi.sorted.bam\n```\n\n- The input pa file is a comma-separated information file that lists the coordinates of STR regions in the reference and the repeat unit sequence. The required columns are STR name, chromosome, start coordinate, end coordinate, and repeat unit sequence. The remaining columns are optional. For example:\n```\nSTR000009,1,691243,691307,CACCC,0,,downstream,LOC100288069\n```\n\n- The input fasta file is the reference genome fasta file.\n\n### Commands\n- For a small number of input STRs provided in the pa file, add the \"-em 0\" parameter to the GrandSTR program. For example:\n```bash\ncd test/\nsamtools index hifi.sorted.bam\nsamtools faidx hg19.fasta\n../GrandSTR test1.pa out1 -rf hg19.fasta -bf hifi.sorted.bam -em 0 -rt hifi\n```\n\n- For a large number of input STRs provided in the pa file, add the \"-em 1\" parameter to the GrandSTR program. For example:\n```bash\ncd test/\nsamtools index hifi.sorted.bam\n../GrandSTR test1.pa out1 -rf hg19.fasta -bf hifi.sorted.bam -em 1 -rt hifi\n```\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fnextomics%2Fgrandstr","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fnextomics%2Fgrandstr","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fnextomics%2Fgrandstr/lists"}