{"id":15580606,"url":"https://github.com/sammyjava/pangenomics","last_synced_at":"2025-02-26T16:23:58.428Z","repository":{"id":81863256,"uuid":"267865038","full_name":"sammyjava/pangenomics","owner":"sammyjava","description":"Java code writtten for pangenomics work","archived":false,"fork":false,"pushed_at":"2022-01-05T13:53:14.000Z","size":5783,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":3,"default_branch":"master","last_synced_at":"2025-01-09T08:53:48.199Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Java","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/sammyjava.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2020-05-29T13:30:03.000Z","updated_at":"2022-01-05T13:53:17.000Z","dependencies_parsed_at":null,"dependency_job_id":"be7c5b04-7613-4f61-b191-e3e0af0032c6","html_url":"https://github.com/sammyjava/pangenomics","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sammyjava%2Fpangenomics","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sammyjava%2Fpangenomics/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sammyjava%2Fpangenomics/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sammyjava%2Fpangenomics/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/sammyjava","download_url":"https://codeload.github.com/sammyjava/pangenom
ics/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":240889272,"owners_count":19873805,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-10-02T19:31:41.093Z","updated_at":"2025-02-26T16:23:58.402Z","avatar_url":"https://github.com/sammyjava.png","language":"Java","funding_links":[],"categories":[],"sub_categories":[],"readme":"This directory contains classes for working with pan-genomic graphs and frequented regions, based on the paper\n```\nCleary, et al., \"Exploring Frequented Regions in Pan-Genomic Graphs\", IEEE/ACM Trans Comput Biol Bioinform. 2018 Aug 9. PMID:30106690 DOI:10.1109/TCBB.2018.2864564\n``` \nThis work was funded in part by the National Center for Genome Resources, Santa Fe, NM.\n\n## Building\nThe project is set up with dependencies managed with the [Gradle build tool](https://gradle.org/). 
To build the distribution, run\n```\n$ ./gradlew installDist\n```\nThis creates a distribution under `build/install`, which the various run scripts use.\n\n### org.ncgr.pangenomics\nThis contains two packages with similarly-named classes:\n\n**org.ncgr.pangenomics.allele** contains classes for working with allele-based sequence graphs.\n\n**org.ncgr.pangenomics.genotype** contains classes for working with genotype graphs.\n\nBasic graph-related classes, not particularly specific to frequented regions:\n\n`PangenomicGraph` extends `org.jgrapht.graph.DirectedAcyclicGraph` and stores a graph, with methods for reading it from files and for writing various outputs.\nIt has a `main` method for creating a graph from input data such as a GFA or VCF file.\n\n`Node` encapsulates a node in a Graph: its ID (a long) and, for sequence graphs, its sequence.\n\n`NodeSet` encapsulates a set of nodes in a Graph. NodeSet implements Comparable. There is a method `merge()` for merging two NodeSets.\n(These are called \"node clusters\" in the paper above, but since I've implemented them as an extension of TreeSet, I've used \"Set\").\n\n`Path` encapsulates a path through a Graph, along with its full sequence in the case of sequence graphs.\n\n### org.ncgr.pangenomics.[allele/genotype].fr\nCode related to frequented regions.\n\n`FrequentedRegion` stores a frequented region: a NodeSet along with the supporting subpaths of the full set of Paths in a Graph, plus many supporting methods.\n\n`FRFinder` contains a `main()` method for finding FRs based on a set of parameters.\n\n`FRPair` is a utility class that contains two FRs and the result of merging them, and is used in the search loop in `FRFinder.findFRs()`.\n\n### org.ncgr.svm\nLIBSVM-based Support Vector Machine classes.\n\n### org.ncgr.weka\nWeka-based supervised classifier 
classes.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsammyjava%2Fpangenomics","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsammyjava%2Fpangenomics","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsammyjava%2Fpangenomics/lists"}