{"id":18730026,"url":"https://github.com/primeqa/primeqa-orchestrator","last_synced_at":"2025-04-12T17:07:16.073Z","repository":{"id":62171569,"uuid":"551015919","full_name":"primeqa/primeqa-orchestrator","owner":"primeqa","description":"Orchestrator connecting different PrimeQA components","archived":false,"fork":false,"pushed_at":"2023-05-19T12:18:38.000Z","size":1032,"stargazers_count":3,"open_issues_count":2,"forks_count":2,"subscribers_count":2,"default_branch":"main","last_synced_at":"2025-04-12T17:07:03.361Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/primeqa.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2022-10-13T17:47:02.000Z","updated_at":"2023-02-28T21:49:07.000Z","dependencies_parsed_at":"2024-11-07T14:35:53.994Z","dependency_job_id":"2dc1aa68-7c3e-4d6a-941a-0ac026b2098a","html_url":"https://github.com/primeqa/primeqa-orchestrator","commit_stats":null,"previous_names":[],"tags_count":1,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/primeqa%2Fprimeqa-orchestrator","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/primeqa%2Fprimeqa-orchestrator/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/primeqa%2Fprimeqa-orchestrator/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/primeqa%2Fprimeqa-orchestrator/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/
GitHub/owners/primeqa","download_url":"https://codeload.github.com/primeqa/primeqa-orchestrator/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248602314,"owners_count":21131616,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-07T14:33:31.894Z","updated_at":"2025-04-12T17:07:16.054Z","avatar_url":"https://github.com/primeqa.png","language":"Python","readme":"\u003c!---\nCopyright 2022 PrimeQA Team\n\nLicensed under the Apache License, Version 2.0 (the \"License\");\nyou may not use this file except in compliance with the License.\nYou may obtain a copy of the License at\n\n    http://www.apache.org/licenses/LICENSE-2.0\n\nUnless required by applicable law or agreed to in writing, software\ndistributed under the License is distributed on an \"AS IS\" BASIS,\nWITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\nSee the License for the specific language governing permissions and\nlimitations under the License.\n--\u003e\n\n\u003c!-- START sphinx doc instructions - DO NOT MODIFY next code, please --\u003e\n\u003cdiv align=\"center\"\u003e\n    \u003cimg src=\"static/PrimeQA.png\" width=\"150\"/\u003e\n\u003c/div\u003e\n\u003c!-- END sphinx doc instructions - DO NOT MODIFY above code, please --\u003e\n\n# Orchestrator REST Microservice\n\nThis toolkit provides an orchestrator microservice that integrates PrimeQA's retriever \u0026 reader modules as a REST Server and also other \"search\" capabilities e.g. 
IBM Watson Discovery.\n\nWith this orchestrator, you can fetch documents using either a neural retriever such as ColBERT from PrimeQA or an external search service such as IBM Watson Discovery, and then use PrimeQA's reader to extract answer spans from those documents.\n\u003cbr\u003e\n\n![Build Status](https://github.com/primeqa/primeqa-orchestrator/actions/workflows/primeqa-orchestrator-ci.yml/badge.svg)\n[![LICENSE|Apache2.0](https://img.shields.io/github/license/saltstack/salt?color=blue)](https://www.apache.org/licenses/LICENSE-2.0.txt)\n\n\u003ch3\u003e✔️ Getting Started\u003c/h3\u003e\n\n- [Repository](https://github.com/primeqa/primeqa-orchestrator)\n\n\u003ch3\u003e✅ Prerequisites\u003c/h3\u003e\n\n- [Python 3.9](https://www.python.org/downloads/)\n\n\u003ch2\u003e⚙️ Setup \u003c/h2\u003e\n\n\u003ch3\u003e📓 Third-party dependencies\u003c/h3\u003e\n\n- [PrimeQA](https://github.com/primeqa/primeqa): If you don't have access to a running PrimeQA instance, refer to the PrimeQA repository for details on setting up and running a local one.\n- [Watson Discovery](https://cloud.ibm.com/) (Optional): Follow the instructions on IBM Cloud to configure a Watson Discovery V2 service.\n\n\u003ch3\u003e🧩 Setup Local Environment\u003c/h3\u003e\n\n- [Setup and activate a Virtual Environment](https://docs.python.org/3/tutorial/venv.html) (as shown below) or use [Miniconda](https://docs.conda.io/en/latest/miniconda.html)\n\n```shell\n# Install virtualenv\npip3 install virtualenv\n\n# Create a new virtual environment for this project. If using pyenv, path_to_python_3.9_executable will be ~/.pyenv/versions/3.9.x/bin/python\nvirtualenv --python=\u003cpath_to_python_3.9_executable\u003e venv\n\n# Activate virtual environment\nsource venv/bin/activate\n```\n\n- Install dependencies\n\n```shell\npip install -r requirements.txt\npip install -r requirements_test.txt\n```\n\n🐛 `grpcio` and `grpcio-tools` have limited support on Apple Silicon (M1, M2). 
Please refer to [grpc GitHub issue #25082](https://github.com/grpc/grpc/issues/25082) for details, or download appropriate wheels from [here](https://github.com/pietrodn/grpcio-mac-arm-build).\n\n\u003ch3\u003e📜 TLS and Certificate Management\u003c/h3\u003e\n\nThe orchestrator service's REST server supports mutual (two-way) TLS authentication, also known as mTLS. The application's [`config.ini`](orchestrator/service/config/config.ini) file contains the default certificate paths, but they can be overridden using environment variables.\n\nSelf-signed certificates are generated and packaged with the Docker build.\nSelf-signed certs _may be_ required for local development and testing. If you want to generate them, follow the steps below:\n\n```shell\n#!/usr/bin/env bash\n\n# Make necessary directories\nmkdir -p security/\nmkdir -p security/certs/\nmkdir -p security/certs/ca security/certs/server security/certs/client\n\n# Generate CA key and CA cert\nopenssl req -x509 -days 365 -nodes -newkey rsa:4096 -subj \"/C=US/ST=New York/L=Yorktown Heights/O=IBM/OU=Research/CN=example.com\" -keyout security/certs/ca/ca.key -out security/certs/ca/ca.crt\n\n# Generate Server key (without passphrase) and Server cert signing request\nopenssl req -nodes -new -newkey rsa:4096 -subj \"/C=US/ST=New York/L=Yorktown Heights/O=IBM/OU=Research/CN=example.com\" -keyout security/certs/server/server.key -out security/certs/server/server.csr\n\n# Sign Server cert\nopenssl x509 -req -days 365 -in security/certs/server/server.csr -CA security/certs/ca/ca.crt -CAkey security/certs/ca/ca.key -CAcreateserial -out security/certs/server/server.crt\n\n# Generate Client key (without passphrase) and Client cert signing request\nopenssl req -nodes -new -newkey rsa:4096 -subj \"/C=US/ST=New York/L=Yorktown Heights/O=IBM/OU=Research/CN=example.com\" -keyout security/certs/client/client.key -out security/certs/client/client.csr\n\n# Sign Client cert\nopenssl x509 -req -days 365 -in security/certs/client/client.csr -CA 
security/certs/ca/ca.crt -CAkey security/certs/ca/ca.key -CAserial security/certs/ca/ca.srl -out security/certs/client/client.crt\n\n# Delete signing requests\nrm -rf security/certs/server/server.csr\nrm -rf security/certs/client/client.csr\n```\n\n**IMPORTANT:**\n\n- By default, the application tries to load certs from `/opt/tls`. For local use, you will need to update the appropriate `tls_*` variables in [`config.ini`](orchestrator/service/config/config.ini).\n\n- We recommend generating certificates with an official signing authority and supplying them to the application container via volume mounts.\n\n\u003ch2\u003e🛠 Build \u0026 Deployment \u003c/h2\u003e\n\n\u003ch3\u003e💻 Local\u003c/h3\u003e\n\n- Open your Python IDE \u0026 select the created virtual environment\n- Open `orchestrator/service/config/config.ini`, set `require_ssl = True` (if you wish to use TLS authentication) \u0026 `rest_port`\n- Generate the gRPC code:\n  ```shell\n  #!/usr/bin/env bash\n  set -xeuo pipefail\n  python -m grpc_tools.protoc -I ./orchestrator/integrations/primeqa/protos --python_out=orchestrator/integrations/primeqa/grpc_generated --grpc_python_out=orchestrator/integrations/primeqa/grpc_generated orchestrator/integrations/primeqa/protos/indexer.proto\n  python -m grpc_tools.protoc -I ./orchestrator/integrations/primeqa/protos --python_out=orchestrator/integrations/primeqa/grpc_generated --grpc_python_out=orchestrator/integrations/primeqa/grpc_generated orchestrator/integrations/primeqa/protos/parameter.proto\n  python -m grpc_tools.protoc -I ./orchestrator/integrations/primeqa/protos --python_out=orchestrator/integrations/primeqa/grpc_generated --grpc_python_out=orchestrator/integrations/primeqa/grpc_generated orchestrator/integrations/primeqa/protos/reader.proto\n  python -m grpc_tools.protoc -I ./orchestrator/integrations/primeqa/protos --python_out=orchestrator/integrations/primeqa/grpc_generated --grpc_python_out=orchestrator/integrations/primeqa/grpc_generated 
orchestrator/integrations/primeqa/protos/retriever.proto\n  2to3 --fix=import --nobackups --write orchestrator/integrations/primeqa/grpc_generated\n  ```\n- Open `application.py` and run/debug\n- Go to \u003chttp://localhost:{rest_port}/docs\u003e\n- To use the `reader`, `indexer` and `retriever` services, make sure you have access to a running instance of the PrimeQA container\n\n\u003ch3\u003e💻 Docker\u003c/h3\u003e\n\n- Open `config.ini` and set `rest_port`\n- Open `Dockerfile` and set `port` to the same value\n- Run `docker build -f Dockerfile -t primeqa-orchestrator:$(cat VERSION) .` (creates docker image)\n- Run `docker run --rm --name primeqa-orchestrator -d -p \u003cport\u003e:\u003cport\u003e --mount type=bind,source=\"$(pwd)\"/store,target=/store -e STORE_DIR=/store primeqa-orchestrator:$(cat VERSION)` (run docker container)\n- Go to \u003chttp://{Container's public URL}:{rest_port}/docs\u003e\n- To use the `reader`, `indexer` and `retriever` services, make sure you have access to a running instance of the PrimeQA container\n\n\u003ch2\u003e🚨 Configure \u003c/h2\u003e\n\n- Before first use, you will need to specify a few necessary configurations to connect to third-party dependencies. These settings are intentionally left blank for security purposes.\n\n- Go to the `STORE_DIR` directory on your local machine and copy the [primeqa.json](./data/primeqa.json) file into that directory.\n\n- You will need to add/update the `settings` portion in the `primeqa.json` file. Primarily, add the `service_endpoint` information (including the port) for `PrimeQA` in the `retrievers` and `readers` sections of the settings.\n\n  a. 
To use an IBM® Watson Discovery based retriever, add or update the following `Watson Discovery` entry in the `retrievers` section.\n\n  ```json\n      \"Watson Discovery\": {\n          \"service_endpoint\": \"\u003cIBM® Watson Discovery Cloud/CP4D Instance Endpoint\u003e\",\n          \"service_api_key\": \"\u003cAPI key (If using IBM® Watson Discovery Cloud instance)\u003e\",\n          \"service_project_id\": \"\u003cIBM® Watson Discovery Project ID\u003e\"\n      }\n  ```\n\n  b. For PrimeQA based retrievers, add/update the `PrimeQA` section in `retrievers` as follows\n\n  ```json\n      \"PrimeQA\": {\n          \"service_endpoint\": \"\u003cPrimeqa Instance Endpoint\u003e:\u003cPort\u003e\"\n      }\n  ```\n\n  c. For PrimeQA based readers, add/update the `PrimeQA` section in `readers` as follows\n\n  ```json\n      \"PrimeQA\": {\n          \"service_endpoint\": \"\u003cPrimeqa Instance Endpoint\u003e:\u003cPort\u003e\",\n          \"beta\": 0.7\n      }\n  ```\n\n  For example, to enable an `IBM® Watson Discovery` based retriever, a `PrimeQA` based retriever, and a `PrimeQA` based reader, the settings will look as follows\n\n  ```json\n  {\n    \"retrievers\": {\n      \"Watson_Discovery\": {\n        \"service_endpoint\": \"\u003cIBM® Watson Discovery CP4D Instance Endpoint\u003e\",\n        \"service_api_key\": \"\u003cAPI key (If using IBM® Watson Discovery Cloud instance)\u003e\",\n        \"service_project_id\": \"\u003cIBM® Watson Discovery Project ID\u003e\"\n      },\n      \"PrimeQA\": {\n        \"service_endpoint\": \"\u003cPrimeqa Instance Endpoint\u003e:\u003cPort\u003e\"\n      }\n    },\n    \"readers\": {\n      \"PrimeQA\": {\n        \"service_endpoint\": \"\u003cPrimeqa Instance Endpoint\u003e:\u003cPort\u003e\",\n        \"beta\": 0.7\n      }\n    }\n  }\n  ```\n\n  NOTE: The final scoring and ranking are done with a weighted sum of the reader answer scores and the retriever search hit scores. 
The `beta` field is the weight assigned to the reader scores and `1-beta` is the weight assigned to the retriever scores.\n\n\u003ch3\u003e 🧪 Testing \u003c/h3\u003e\n\n1. To see all available retrievers, execute the [GET] `/retrievers` endpoint\n\n```sh\n\tcurl -X 'GET' 'http://{PUBLIC_IP}:50059/retrievers' -H 'accept: application/json'\n```\n\n2. To see all available readers, execute the [GET] `/readers` endpoint\n\n```sh\n\tcurl -X 'GET' 'http://{PUBLIC_IP}:50059/readers' -H 'accept: application/json'\n```\n\n\u003ch2\u003e Frequently Asked Questions (FAQs) \u003c/h2\u003e\n\n\u003ch4\u003e1. How do I get feedback to fine-tune my reader model? \u003c/h4\u003e\n  \n  ```sh\n    curl -X 'GET' \\\n  'http://localhost:50059/feedbacks?application=reading\u0026application=qa\u0026_format=primeqa' \\\n  -H 'accept: application/json' \u003e feedbacks.json\n  ```\n\n\u003ch4\u003e2. How do I get feedback to fine-tune my retriever model? \u003c/h4\u003e\n  \n  ```sh\n    curl -X 'GET' \\\n  'http://localhost:50059/feedbacks?application=retrieval\u0026_format=primeqa' \\\n  -H 'accept: application/json' \u003e feedbacks.json\n  ```\n\n\u003c!-- START sphinx doc instructions - DO NOT MODIFY next code, please --\u003e\n\u003c!-- PrimeQA doc sync --\u003e\n\u003ch2\u003e📄 Documentation Sync\u003c/h2\u003e\n\n**Keep PrimeQA documentation reference sync**  \nAnytime this README files is updated, it is necessary to open a PR on PrimeQA repository to update, with the same modifications, **[the associated file](https://github.com/primeqa/primeqa/blob/main/docs/orchestrator.md)** used on [documentation page](https://primeqa.github.io/primeqa/orchestrator.html).  
\n_Do not modify initial image path_\n\n\u003c!-- END sphinx doc instructions - DO NOT MODIFY above code, please --\u003e\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fprimeqa%2Fprimeqa-orchestrator","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fprimeqa%2Fprimeqa-orchestrator","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fprimeqa%2Fprimeqa-orchestrator/lists"}