{"id":32631094,"url":"https://github.com/xu-hao/queryarrow","last_synced_at":"2025-10-30T22:56:31.872Z","repository":{"id":34208145,"uuid":"38064992","full_name":"xu-hao/QueryArrow","owner":"xu-hao","description":"A semantically unified SQL and NoSQL query and update system","archived":false,"fork":false,"pushed_at":"2019-01-20T07:50:24.000Z","size":1365,"stargazers_count":17,"open_issues_count":0,"forks_count":1,"subscribers_count":4,"default_branch":"master","last_synced_at":"2023-08-22T08:37:57.289Z","etag":null,"topics":["database","metadata","nosql-databases"],"latest_commit_sha":null,"homepage":"","language":"Haskell","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"bsd-3-clause","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/xu-hao.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE.txt","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2015-06-25T17:56:04.000Z","updated_at":"2023-08-22T08:37:57.289Z","dependencies_parsed_at":"2022-09-04T11:11:27.278Z","dependency_job_id":null,"html_url":"https://github.com/xu-hao/QueryArrow","commit_stats":null,"previous_names":[],"tags_count":0,"template":null,"template_full_name":null,"purl":"pkg:github/xu-hao/QueryArrow","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/xu-hao%2FQueryArrow","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/xu-hao%2FQueryArrow/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/xu-hao%2FQueryArrow/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/xu-hao%2FQueryArrow/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/xu-hao","download_url":"https://codeload.github.com/xu-hao/QueryArrow/tar.gz/refs/heads/master","sbom
_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/xu-hao%2FQueryArrow/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":281896623,"owners_count":26580138,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-10-30T02:00:06.501Z","response_time":61,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["database","metadata","nosql-databases"],"created_at":"2025-10-30T22:56:06.592Z","updated_at":"2025-10-30T22:56:31.866Z","avatar_url":"https://github.com/xu-hao.png","language":"Haskell","funding_links":[],"categories":[],"sub_categories":[],"readme":"# QueryArrow\n\n## Resources\nHao Xu, Ben Keller, Antoine de Torcy, Jason Coposky (2016) QueryArrow: Bidirectional Integration of Multiple Metadata Sources. 8th iRODS User Group Meeting, University of North Carolina at Chapel Hill. 
June 2016.\n\nSlides: https://irods.org/uploads/2016/06/queryarrow2_Hao-Xu_DICE_iRODS-UGM-2016.pdf\nhttps://irods.org/uploads/2017/Xu-RENCI-QueryArrow-slides.pdf\n\nTechnical Report: https://irods.org/uploads/2015/01/xu-queryarrow-2016.pdf\n\nPaper: https://irods.org/uploads/2017/Xu-RENCI-QueryArrow-paper.pdf\n\nSpecification: https://github.com/xu-hao/CertifiedQueryArrow\n\n## Introduction\n\nQueryArrow is motivated by the following applications: bidirectional metadata integration from different metadata sources, metadata policies, metadata migration between different databases, and metadata indexing.\n\nQueryArrow provides a systematic solution to shared-namespace and unshared-namespace federation of metadata. In particular, QueryArrow allows querying multiple data sources, including NoSQL databases, and updating those data sources. For data sources that support two-phase commit, QueryArrow also supports distributed transactions. QueryArrow can also polyfill features that the underlying database does not support.\n\nA QueryArrow instance includes a QueryArrow Service and QueryArrow plugins (QAPs). Each plugin provides an interface to one data store.\nQueries are issued by the client in the QueryArrow Language (QAL). QAL is a unified query and update language for SQL and NoSQL databases.\n\n## Build docker image\n\n### build bases\n```\ncd docker/build\ndocker build -t queryarrow-stack-build:0.1.0 .\n```\n\n```\ncd docker/run\ndocker build -t queryarrow-stack-run:0.1.0 .\n```\n\nIf you use customized bases from the Internet, then run\n```\nstack docker pull\n```\n\n### build\n```\nstack build\n```\n\n### create image\n```\nstack image container\n```\n\n## How to build (deprecated)\n\n### Ubuntu 16.04 and CentOS 7\n\n#### Install GHC 8.0.2\n\nFrom source:\n\nhttps://www.haskell.org/ghc/download_ghc_8_0_2#sources\n\nFind out where ghc is installed.\n\nIf built from source, it is `\u003cprefix\u003e/lib/ghc-8.0.2/`. 
The default `\u003cprefix\u003e` is `/usr/local`.\n\n#### Install Stack\n\nFollow the instructions on this page:\n\nhttps://docs.haskellstack.org\n\nIf you install Stack from the system repository, make sure you run\n\n    stack upgrade\n\n#### Install Packages\n\nOn `Ubuntu`\n\n    apt-get install postgresql-server-dev-all libsqlite3-dev -y\n\nOn `CentOS`\n\n    yum install postgresql-devel sqlite-devel -y\n\n#### Build QueryArrow\n\n    git clone http://github.com/xu-hao/QueryArrow\n\n    cd QueryArrow\n\n    stack build\n\n#### Create QueryArrow Package\n\n    cd ..\n\nMake a new directory\n\n    mkdir build\n\n    cd build\n\n    ../QueryArrow/find_dependencies.sh ../QueryArrow\n\n    cpack --config CPackConfig.cmake\n\n#### Install QueryArrow Package\n\nOn `Ubuntu`\n\n    dpkg -i queryarrow-0.2-Linux-amd64.deb\n\nOn `CentOS`\n\n    yum install queryarrow-0.2-Linux-amd64.rpm\n\n## QueryArrow Configuration\n\nBy default, the QueryArrow configuration is stored in the `/etc/QueryArrow/tdb-plugin-gen-abs.yaml` file.\n\nAn example is\n\n~~~yaml\ndb_plugin:\n  qap_name: cache\n  catalog_database_type: Cache\n  db_config:\n    max_cc: 1024\n    cache_db_plugin:\n      qap_name: trans\n      catalog_database_type: Translation\n      db_config:\n        rewriting_file_path: \"../QueryArrow-gen/rewriting-plugin-gen.rules\"\n        include_file_path:\n        - \"../QueryArrow-gen\"\n        trans_db_plugin:\n          qap_name: sum\n          catalog_database_type: Sum\n          db_config:\n            summands:\n            - qap_name: ICAT\n              catalog_database_type: SQL/HDBC/PostgreSQL\n              db_config:\n                db_port: 5432\n                db_name: ICAT\n                db_password: testpassword\n                db_host: localhost\n                db_username: irods\n                db_predicates: \"../QueryArrow-gen/gen/ICATGen\"\n                db_namespace: ICAT\n                db_sql_mapping: \"../QueryArrow-gen/gen/SQL/ICATGen\"\n            - 
qap_name: ''\n              catalog_database_type: FileSystem\n              db_config:\n                fs_port: 0\n                db_namespace: FileSystem\n                fs_host: ''\n                fs_root: \"/tmp\"\n                fs_hostmap:\n                - - ''\n                  - 0\n                  - \"/tmp\"\n            - qap_name: ''\n              catalog_database_type: InMemory/BuiltIn\n              db_config:\n                db_namespace: BuiltIn\nservers:\n- server_protocol: service/tcp\n  server_config:\n    tcp_server_addr: \"*\"\n    tcp_server_port: 12345\n~~~\n\n\n## QueryArrow Plugin\n\nCurrently, the implemented QAPs include\n\n|        Name       |           Description          | `db_config`       |\n|:-----------------:|:------------------------------:|:------------:|\n|      Sum      |           aggregation          |`summands`, `db_namespace`|\n|  Translation  |         policy support         |`rewriting_file_path`, `include_file_path`,`trans_db_plugin`|\n|     Cache     |             caching            |`max_cc`, `cache_db_plugin`|\n|      Remote/TCP   |            remoting            |`db_host`, `db_port`|\n| FileSystem | interfacing with file system | `fs_host`, `fs_port`, `fs_root`, `fs_hostmap`, `db_namespace`|\n| ElasticSearch/ElasticSearch | interfacing with ElasticSearch | `db_name`, `db_namespace`, `db_predicates`, `db_sql_mapping`, `db_host`, `db_port`, `db_username`, `db_password`|\n|     Cypher/Neo4j     |     interfacing with Neo4j     |`db_namespace`, `db_predicates`, `db_sql_mapping`, `db_host`, `db_port`, `db_username`, `db_password`|\n|   SQL/HDBC/PostgreSQL  |    interfacing with Postgres   |`db_name`, `db_namespace`, `db_predicates`, `db_sql_mapping`, `db_host`, `db_port`, `db_username`, `db_password`|\n|    SQL/HDBC/SQLite3    |    interfacing with SQLite3    |`db_file_path`, `db_namespace`, `db_predicates`, `db_sql_mapping`|\n|  SQL/HDBC/CockroachDB   |  interfacing with CockroachDB  |`db_name`, 
`db_namespace`, `db_predicates`, `db_sql_mapping`, `db_host`, `db_port`, `db_username`, `db_password`|\n|  Include   |  include other config files  |`include`|\n|  InMemory/StateMap  |      in-memory mutable map     |`db_namespace`,`predicate_name`,`db_map`|\n| InMemory/Map |     in-memory immutable map    |`db_namespace`,`predicate_name`,`db_map`|\n|       InMemory/BuiltIn      |   built-in predicates: `like_regex`, `not_like_regex`, `eq`, `ne`, `le`, `ge`, `lt`, `gt`, `concat`, `substr`, `strlen`, `add`, `sub`, `mul`, `div`, `mod`, `exp`, `like`, `not_like`, `in`, `replace`, `regex_replace`, `sleep`, `encode`     |`db_namespace`|\n\n\n## QueryArrow CLI\n\nQueryArrow provides a CLI command `QueryArrow`.\n\n## QueryArrow Server\n\nQueryArrow provides server protocols. The service protocol is used for clients to communicate with QueryArrow. The remote protocol is used by the Remote QAP. The file system protocol is used by the FileSystem QAP.\n\n|       |unix domain socket|tcp socket|http|\n|:-----:|:----------------:|:--------:|:--:|\n|service| implemented | implemented | implemented |\n|remote| implemented | implemented| |\n|file system|  | implemented | |\n\n|  Service     |`server_config`|\n|:-----:|:----------------:|\n|service/tcp, remote/tcp | `tcp_server_addr`, `tcp_server_port` |\n|service/http|`http_server_port`|\n|service/unix domain socket, remote/unix domain socket| `uds_server_addr` |\n|file system/tcp| `fs_server_port`, `fs_server_root`, `fs_server_addr`|\n\nThe command for starting the server is `QueryArrowServer`.\n\n## iRODS Database Plugin\n\nIn addition to running QueryArrow as a standalone service, QueryArrow provides an iRODS database plugin. 
To build it, follow the instructions for building iRODS database plugins.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fxu-hao%2Fqueryarrow","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fxu-hao%2Fqueryarrow","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fxu-hao%2Fqueryarrow/lists"}