{"id":34180286,"url":"https://github.com/gorpipe/gor","last_synced_at":"2026-04-02T15:39:43.861Z","repository":{"id":37244685,"uuid":"241892835","full_name":"gorpipe/gor","owner":"gorpipe","description":"GORpipe is a tool based on a genomic ordered relational architecture and allows analysis of large sets of genomic and phenotypic tabular data using declarative query language, in a parallel execution engine.","archived":false,"fork":false,"pushed_at":"2026-03-26T18:03:53.000Z","size":17504,"stargazers_count":45,"open_issues_count":6,"forks_count":14,"subscribers_count":4,"default_branch":"main","last_synced_at":"2026-03-26T18:25:58.814Z","etag":null,"topics":["bioinformatics","genetics","genomics","gor","gorpipe","gwas","java","open-source","python","scala","software","spark","vcf"],"latest_commit_sha":null,"homepage":"","language":"Java","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"agpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/gorpipe.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2020-02-20T13:36:52.000Z","updated_at":"2026-03-25T16:43:51.000Z","dependencies_parsed_at":"2023-12-20T13:53:59.157Z","dependency_job_id":"fec5a3fa-7eb3-456d-8e88-b6888bad2932","html_url":"https://github.com/gorpipe/gor","commit_stats":null,"previous_names":[],"tags_count":226,"template":false,"template_full_name":null,"purl":"pkg:github/gorpipe/gor","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gorpipe%2Fgor","tags_url":"https://repos.ecosyste.ms/api/v1
/hosts/GitHub/repositories/gorpipe%2Fgor/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gorpipe%2Fgor/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gorpipe%2Fgor/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/gorpipe","download_url":"https://codeload.github.com/gorpipe/gor/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gorpipe%2Fgor/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":31309163,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-02T12:59:32.332Z","status":"ssl_error","status_checked_at":"2026-04-02T12:54:48.875Z","response_time":89,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.5:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["bioinformatics","genetics","genomics","gor","gorpipe","gwas","java","open-source","python","scala","software","spark","vcf"],"created_at":"2025-12-15T13:56:02.784Z","updated_at":"2026-04-02T15:39:43.838Z","avatar_url":"https://github.com/gorpipe.png","language":"Java","funding_links":[],"categories":[],"sub_categories":[],"readme":"## Introduction\n\nThe GORpipe analysis tool is developed and released by GeneDx. 
It originates from the pioneers of population-based genomic analysis, deCODE genetics, headquartered in Reykjavik, Iceland.\n        \nGORpipe is a tool based on a genomic ordered relational architecture and allows analysis of large sets of genomic and phenotypic tabular data using a declarative query language, in a parallel execution engine. It is efficient across a wide range of use cases, including genome-wide batch analysis, range queries, genomic table joins of variants and segments, filtering, and aggregation. The query language combines ideas from SQL and Unix shell pipe syntax, supporting seekable nested queries, materialized views, and a rich set of commands and functions. For more information, see the paper in Bioinformatics (https://dx.doi.org/10.1093%2Fbioinformatics%2Fbtw199).\n\n\n## Prerequisites\n\nBefore setting up GORpipe, you need a Java JDK or JRE, version 11 or higher, installed on your computer, \nfor example OpenJDK (https://openjdk.java.net/install/). To check your Java version, open a terminal and enter:\n\n    java -version\n\nAlternatively, an Oracle distribution can be used (https://www.oracle.com/java/technologies/javase-downloads.html).\n\n## Getting started with GORpipe\n\nDownload the latest release of GORpipe from https://github.com/gorpipe/gor/releases.\n\nExtract the package (gor-scripts-\\\u003cversion\\\u003e.zip). \n\nGOR can then be run without setting up test data by generating GOR rows and querying them. For example:\n\n    ./gor-scripts-\u003cversion\u003e/bin/gorpipe \"gor \u003c(gorrows -p chr1:1000-20000 -segment 100 -step 50 | multimap -cartesian \u003c(norrows 100 | group -lis -sc #1))\"  \n\nNote: Substitute \\\u003cversion\\\u003e with the actual latest version number (e.g. gor-scripts-2.9.0). 
\n\nOptional: A version of GORpipe with Apache Spark integration (GORspark) can be set up by downloading the latest \nrelease from https://github.com/gorpipe/gor-spark/releases. This release is much larger (~270 MB) than the regular GORpipe release since it contains the Apache Spark libraries. \n        \n## Setting up test data (Optional)\n\nDownload the latest released GORpipe test data from https://github.com/gorpipe/gor-test-data/releases.\n\nExtract the package (gor-test-data.zip). Assuming it is located in the same folder as the latest release of GORpipe, \ngor queries can be run against the test data via:\n\n     ./gor-scripts-\u003cversion\u003e/bin/gorpipe \"nor gor-test-data/gor/dbsnp_test.gor | top 10\" \n\nThe results should be as follows:\n\n    Chrom   POS     reference       allele  differentrsIDs\n    chr1    10179   C       CC      rs367896724\n    chr1    10250   A       C       rs199706086\n    chr10   60803   T       G       rs536478188\n    chr10   61023   C       G       rs370414480\n    chr11   61248   G       A       rs367559610\n    chr11   66295   C       A       rs61869613\n    chr12   60162   C       G       rs544101329\n    chr12   60545   A       T       rs570991495\n    chr13   19020013        C       T       rs181615907\n    chr13   19020145        G       T       rs28970552\n\nIf the GORspark version was set up, the following query should also work and return the same results as above:\n\n    ./gor-scripts-\u003cversion\u003e/bin/gorpipe \"select * from gor-test-data/gor/dbsnp_test.gor limit 10\" \n\n## Setting up an interactive shell for GORpipe\n\nGORpipe can be invoked and used through an interactive shell session in a terminal, a sort of REPL for GOR called GORshell. 
Start a GORshell by executing:\n\n    ./gor-scripts-\u003cversion\u003e/bin/gorshell \n    \nThis will start an interactive shell session where queries can be executed:\n\n    gor gor-test-data/gor/dbsnp_test.gor | top 10\n        \nFor a list of GOR input sources, pipe commands and other details, simply type `help` within the GOR shell.    \n\n## Setting up reference data (Optional)\n\nGo into the gor-scripts folder (gor-scripts-\\\u003cversion\\\u003e/) and download the reference data found at:\n\n    https://s3.amazonaws.com/wuxinextcode-sm-public/data/standalone-project-data.tar.gz\n    \nIf using a command line, this can be accomplished using `wget`:\n\n    wget https://s3.amazonaws.com/wuxinextcode-sm-public/data/standalone-project-data.tar.gz\n\nSince this is a large dataset (~9 GB), the download could take a few minutes. After that, extract the package via:\n\n    tar -xvf standalone-project-data.tar.gz\n\nA folder called `ref` should then be created.\n\nNote that the file standalone-project-data.tar.gz will remain in the folder. You may want to delete it afterwards. 
\n    \nTo test the reference data using aliases, run the following query from within the gor-scripts folder:\n\n    ./bin/gorpipe \"gor #genes# | top 10\" -aliases config/gor_aliases.txt\n   \nThe results should be as follows:\n\n    Chrom\tgene_start\tgene_end\tGene_Symbol\n    chr1\t11868\t14412\tDDX11L1\n    chr1\t14362\t29806\tWASH7P\n    chr1\t29553\t31109\tMIR1302-10\n    chr1\t34553\t36081\tFAM138A\n    chr1\t52472\t54936\tOR4G4P\n    chr1\t62947\t63887\tOR4G11P\n    chr1\t69090\t70008\tOR4F5\n    chr1\t89294\t133566\tRP11-34P13.7\n    chr1\t89550\t91105\tRP11-34P13.8\n    chr1\t131024\t134836\tCICP27\n\n## Setting up Phecode GWAS data (Optional)\n\nGo into the gor-scripts folder (gor-scripts-\\\u003cversion\\\u003e/) and download the Phecode GWAS data found at:\n\n    https://s3.amazonaws.com/wuxinextcode-sm-public/data/standalone-project-phecode-gwas-data.tar.gz\n    \nIf using a command line, this can be accomplished using `wget`:\n\n    wget https://s3.amazonaws.com/wuxinextcode-sm-public/data/standalone-project-phecode-gwas-data.tar.gz\n\nSince this is a large dataset (~10 GB), the download could take a few minutes. After that, extract the package via:\n\n    tar -xvf standalone-project-phecode-gwas-data.tar.gz\n\nA folder called `phecode_gwas` should then be created.\n\nNote that the file standalone-project-phecode-gwas-data.tar.gz will remain in the folder. 
You may want to delete it afterwards.\n\nTo test this data, query its GOR dictionary, `Phecode_adjust_f2.gord`, for example:\n\n    ./bin/gorpipe \"gor phecode_gwas/Phecode_adjust_f2.gord | top 10\"\n    \nThe results should be as follows:\n\n    CHROM\tPOS\tREF\tALT\tpVal_mm\tOR_mm\tCASE_info\tGC\tQQ\tBONF\tHOLM\tSource\n    chr1\t11008\tC\tG\t7.0e-09\t1.3071212318082575\t11/520/7230\t0.065424\t0.24246\t0.11973\t0.090699\t218.1\n    chr1\t11008\tC\tG\t6.6e-22\t4.444107087017232\t4/60/248\t0.23978\t0.35578\t1.1289e-14\t7.2724e-15\t282.5\n    chr1\t11012\tC\tG\t7.0e-09\t1.3071212318082575\t11/520/7230\t0.065424\t0.24246\t0.11973\t0.090699\t218.1\n    chr1\t11012\tC\tG\t6.6e-22\t4.444107087017232\t4/60/248\t0.23978\t0.35578\t1.1289e-14\t7.2724e-15\t282.5\n    chr1\t13116\tT\tG\t3.7e-10\t0.1992753733291237\t0/9/272\t0.44383\t0.46853\t0.0063285\t0.0033634\t282.5\n    chr1\t13118\tA\tG\t3.7e-10\t0.1992753733291237\t0/9/272\t0.44383\t0.46853\t0.0063285\t0.0033634\t282.5\n    chr1\t13273\tG\tC\t4.9e-08\t0.1156812360571759\t0/3/261\t0.50503\t0.50293\t0.83810\t0.41659\t282.5\n    chr1\t14464\tA\tT\t8.1e-06\t4.3201606068833875\t2/12/23\t1.2298e-10\t1.2424e-05\t1.0000\t1.0000\t362.8\n    chr1\t14464\tA\tT\t7.8e-08\t0.22233506547729875\t0/8/278\t0.51155\t0.50681\t1.0000\t0.65797\t282.5\n    chr1\t15211\tT\tG\t3.9e-15\t0.2678814996572006\t29/38/16\t0.33692\t0.41018\t6.6706e-08\t3.9344e-08\t282.5\n    \n## Setting environment variables\n\nFor convenience, GORpipe and GORshell can be added to your PATH. 
For example, on a Mac, edit /etc/paths: \n\n    sudo vim /etc/paths \n    \nand add the following line: \n\n    \u003cPATH_TO_GOR_SCRIPTS\u003e/bin\n\nThen GORpipe and GORshell can be started via `gorpipe` and `gorshell` from any location.\n    \n## Build GORpipe from source\n\nFor developers, to get started with GORpipe, first clone the repo via:\n\n    git clone https://github.com/gorpipe/gor\n\nTest data for GOR is then obtained by cloning the GOR test data repository (https://github.com/gorpipe/gor-test-data) as a submodule into the `tests/data` folder:\n\n    git submodule update --init --recursive\n\nThe code is built via:\n\n    make build\n         \nNow gor queries can be run against the test data. For example:\n\n    ./gortools/build/install/gor-scripts/bin/gorpipe \"gor tests/data/gor/dbsnp_test.gor | top 10\"\n\nThe results should be as follows:\n\n    Chrom   POS     reference       allele  differentrsIDs\n    chr1    10179   C       CC      rs367896724\n    chr1    10250   A       C       rs199706086\n    chr10   60803   T       G       rs536478188\n    chr10   61023   C       G       rs370414480\n    chr11   61248   G       A       rs367559610\n    chr11   66295   C       A       rs61869613\n    chr12   60162   C       G       rs544101329\n    chr12   60545   A       T       rs570991495\n    chr13   19020013        C       T       rs181615907\n    chr13   19020145        G       T       rs28970552\n\nGORshell can also be started via:\n\n    ./gortools/build/install/gor-scripts/bin/gorshell\n\n## Find older packages\n\nBefore the migration to the GitLab package registry, GOR artifacts were stored in JFrog Artifactory. The contents of our `libs-release-local` repo can be found in `s3://wxnc-build-artifacts`, and credentials for it are in secret server [#3610](https://secretserver.wuxinextcode.com/SecretServer/SecretView.aspx?secretid=3610).\n\n## How to get help?\n\nDocumentation for GORpipe can be found at: http://docs.gorpipe.org/. 
Additionally, help can be found while using GORpipe\nby executing `gorpipe help` or just `help` within the GOR shell.\n\n## Citations\n        \nIf you make use of GORpipe in your research, we would appreciate a citation to the following paper:\n        \n    GORpipe: a query tool for working with sequence data based on a Genomic Ordered Relational (GOR) architecture\n    Bioinformatics, Volume 32, Issue 20, 15 October 2016, Pages 3081–3088,\n    https://dx.doi.org/10.1093%2Fbioinformatics%2Fbtw199\n\n## License\n\n    GORpipe is free software: you can redistribute it and/or modify\n    it under the terms of the AFFERO GNU General Public License as published by\n    the Free Software Foundation.\n\n    GORpipe is distributed \"AS-IS\" AND WITHOUT ANY WARRANTY OF ANY KIND,\n    INCLUDING ANY IMPLIED WARRANTY OF MERCHANTABILITY,\n    NON-INFRINGEMENT, OR FITNESS FOR A PARTICULAR PURPOSE. See\n    the AFFERO GNU General Public License for the complete license terms.\n\n    You should have received a copy of the AFFERO GNU General Public License\n    along with GORpipe.  If not, see \u003chttp://www.gnu.org/licenses/agpl-3.0.html\u003e\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgorpipe%2Fgor","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fgorpipe%2Fgor","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgorpipe%2Fgor/lists"}