{"id":21773489,"url":"https://github.com/urbanslug/junctions","last_synced_at":"2025-10-30T06:01:11.674Z","repository":{"id":74592480,"uuid":"539316173","full_name":"urbanslug/junctions","owner":"urbanslug","description":"Compare pangenomes using ED-Strings","archived":false,"fork":false,"pushed_at":"2024-07-17T06:35:02.000Z","size":3571,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":3,"default_branch":"master","last_synced_at":"2024-07-17T08:37:24.834Z","etag":null,"topics":["elastic-degenerate-string","pangenomics"],"latest_commit_sha":null,"homepage":"","language":"C++","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/urbanslug.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2022-09-21T05:15:15.000Z","updated_at":"2024-07-17T06:35:05.000Z","dependencies_parsed_at":null,"dependency_job_id":"e4f7a318-bcbb-467e-b342-ac65552af4dd","html_url":"https://github.com/urbanslug/junctions","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/urbanslug%2Fjunctions","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/urbanslug%2Fjunctions/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/urbanslug%2Fjunctions/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/urbanslug%2Fjunctions/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/urbanslug","download_url":"https://codeload.githu
b.com/urbanslug/junctions/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":226584451,"owners_count":17655036,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["elastic-degenerate-string","pangenomics"],"created_at":"2024-11-26T17:01:22.186Z","updated_at":"2025-10-30T06:01:11.526Z","avatar_url":"https://github.com/urbanslug.png","language":"C++","readme":"# Pangenome comparison via ED strings\n\nThis is a software suite for pangenome comparison via elastic-degenerate (ED) strings.\n\nAn ED string is a sequence of sets of strings. Our software currently supports only the DNA alphabet `{A, T, C, G}`, the letter `N` for the indeterminate base, as well as the empty string `Ɛ`. For example, the ED string x below has 3 sets. 
The first set has two strings (`AGT` and `A`); the second set has three strings (`A`, `T`, and the empty string `Ɛ`), and the third set has one string (`ACGTN`).\n\n```\n$ cat x.eds\n{AGT,A}{A,T,}{ACGTN}\n```\n\n## Download the source code\n\nUsing `git`\n```sh\n$ git clone https://github.com/urbanslug/junctions\n$ cd junctions\n```\n\nor using `curl` and `unzip`:\n```\n$ curl -LO https://github.com/urbanslug/junctions/archive/refs/heads/master.zip\n$ unzip master.zip\n$ cd junctions-master\n```\n\n## Compile\n\nCompile with `make`, in one of the following ways.\n\nTo create a dynamically linked binary (recommended):\n```\n$ make\n```\n\nIf you need a statically linked binary:\n```\n$ make static\n```\n\nWhen compiled with MSA support, junctions can internally convert MSA \nfiles in RC-MSA format to ED strings; however, this is only supported on newer \nx86 processors.\n\nTo compile a dynamically linked binary with MSA support:\n```\n$ make WITH_MSA=true\n```\n\nor, for a statically linked binary:\n```\n$ make static WITH_MSA=true\n```\n\n## Usage and Documentation\nRun the following to print the help text:\n\n```\n$ ./bin/junctions\n```\n\nFurther documentation can be found in the [wiki](https://github.com/urbanslug/junctions/wiki).\n\n\n## Example 1: ED string intersection\nConsider two ED strings x and y encoded in the corresponding files below:\n\n```\n$ cat x.eds \n{A,AC,TGCT}{CA,}\n```\n\n```\n$ cat y.eds \n{,T}{GCA,AC}\n```\n\nWe can determine whether x and y have a nonempty intersection by running the following:\n\n```\n$ ./bin/junctions intersect x.eds y.eds \nINFO intersection exists\n```\nIndeed, x and y share the string `AC`.\n\n## Example 2: ED Matching Statistics\nConsider two ED strings x and y encoded in the corresponding files below:\n\n```\n$ cat x.eds \n{A,AC,TGCT}{CA,}\n```\n\n```\n$ cat y.eds \n{,T}{GCA,AC}\n```\nWe can compute their matching statistics by running the following:\n\n```\n$ ./bin/junctions graph -c 0 -s x.eds y.eds \nSimilarity measure is: 
5\n```\n\nIn particular, the option `-c 0` denotes that no constraint is imposed on the\nmatching statistics; the option `-s` denotes that a similarity measure will be\ncomputed from the matching statistics.\n\n## Example 3: Breakpoint Matching Statistics\nIn this flavour, matching statistics are considered only between breakpoints.\nConsider two ED strings x and y encoded in the corresponding files below:\n\n```\n$ cat x.eds \n{A,AC,TGCT}{CA,}\n```\n\n```\n$ cat y.eds \n{,T}{GCA,AC}\n```\nWe can compute their breakpoint matching statistics by running the following:\n\n```\n$ ./bin/junctions graph -c 1 -s x.eds y.eds \nSimilarity measure is: 3.5\n```\n\nIn particular, the option `-c 1` denotes that a constraint related to breakpoints \nis imposed on the matching statistics; the option `-s` denotes that a similarity \nmeasure will be computed from the matching statistics.\n\n## Citations\n\n```\nEstéban Gabory, Njagi Moses Mwaniki, Nadia Pisanti, Solon P. Pissis, Jakub Radoszewski, Michelle Sweering, Wiktor Zuba:\nComparing Elastic-Degenerate Strings: Algorithms, Lower Bounds, and Applications. CPM 2023.\n\n\nEstéban Gabory, Njagi Moses Mwaniki, Nadia Pisanti, Solon P. Pissis, Jakub Radoszewski, Michelle Sweering, Wiktor Zuba:\nPangenome Comparison via ED Strings. [Front. Bioinform.](https://doi.org/10.3389/fbinf.2024.1397036)\n\n```\n","funding_links":[],"categories":["A list of software capable of analyzing mainly **eukaryotic** genomes for pangenomics."],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Furbanslug%2Fjunctions","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Furbanslug%2Fjunctions","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Furbanslug%2Fjunctions/lists"}