https://github.com/thomassuedbroecker/simple-qa-pipeline
This project contains a simple example implementation for a simple question-answering pipeline using inside-search (IBM Cloud Watson Discovery) and prompt (IBM watsonx with prompt-lab).
https://github.com/thomassuedbroecker/simple-qa-pipeline
bash prompt python watson-discovery watsonx
Last synced: about 1 month ago
JSON representation
This project contains a simple example implementation for a simple question-answering pipeline using inside-search (IBM Cloud Watson Discovery) and prompt (IBM watsonx with prompt-lab).
- Host: GitHub
- URL: https://github.com/thomassuedbroecker/simple-qa-pipeline
- Owner: thomassuedbroecker
- License: apache-2.0
- Created: 2023-08-05T17:13:06.000Z (almost 3 years ago)
- Default Branch: main
- Last Pushed: 2024-02-21T15:49:40.000Z (over 2 years ago)
- Last Synced: 2025-07-21T06:50:18.687Z (11 months ago)
- Topics: bash, prompt, python, watson-discovery, watsonx
- Language: Python
- Homepage:
- Size: 8.22 MB
- Stars: 3
- Watchers: 1
- Forks: 1
- Open Issues: 1
-
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
README
# Simple `question-answering pipeline` using inside search and prompt with a Large Langauge Model.
This project contains a simple example implementation for a simple `question-answering pipeline` using an `inside-search` (`IBM Cloud Watson Discovery`) and a `prompt` (`IBM watsonx` with `prompt-lab`) to create an answer.
Related blog post: [IBM watsonx and a simple question-answering pipeline using Python and FastAPI](https://wp.me/paelj4-21Y)
## 1. Objective
The objective is to implement an elementary `question-answering pipeline` example by showing how to consume existing REST APIs and create a REST API with [`FastAPI`](https://github.com/tiangolo/FastAPI) and `Python` because `Python` is well-known in the AI world.
>Note: An excellent and detailed example implementation for a `question-answering pipeline` in the [`question-answering project`](https://github.com/nheidloff/question-answering). That project contains many more details and integrations; the `question-answering pipeline` is implemented in Java.
The project also provides an example implementation for an experiment execution for the question-answering pipeline the service for the execution is called `experiment-runner` and is implemented in Python. [Niklas Heidloff](https://heidloff.net) has written many awesome blog posts about AI and this [`question-answering project`](https://github.com/nheidloff/question-answering). I recommend briefly looking at the related blog posts to this project.
## 2. Simplified Architecture
The `simple-qa-pipeline` creates an answer to a question using a [Large Language Model]() inside [watsonx with the Prompt lab](https://dataplatform.cloud.ibm.com/docs/content/wsj/getting-started/welcome-main.html?context=wx&audience=wdp) and it searches for documents with Watson Discovery to provide the context.

## 3. Run the `simple-qa-pipeline` locally
### 3.1. Get the source code and create a virtual Python environment
* Clone project
```
git clone https://github.com/thomassuedbroecker/simple-pipeline.git
```
* Create a virtual Python environment
```sh
cd simple-pipeline/code
python3 -m venv simple-pipeline-env-3
source ./simple-pipeline-env-3/bin/activate
```
* Install needed Python libs and create a `requirements.txt` file
```sh
python3 -m pip install --upgrade pip
python3 -m pip install "fastapi[all]"
python3 -m pip install requests
python3 -m pip install pydantic
python3 -m pip install torch
python3 -m pip install accelerate
python3 -m pip install typing
python3 -m pip install transformers
#python3 -m pip install git+https://github.com/huggingface/transformers
```
* Save your configuration in requirements.txt
```sh
python3 -m pip install --upgrade pip
python3 -m pip freeze > requirements.txt
deactivate
```
* Install from configuration from requirements.txt
```sh
python3 -m pip install -r requirements.txt
```
### 3.2. Configure the `simple-qa-pipeline` to access the needed REST API by using environment variables
* Create `.env` file
```sh
cat ./.env-template > ./.env
```
* Outline of the environment file
```sh
# Discovery
export DISCOVERY_API_KEY=
export DISCOVERY_URL=https://api.us-east.discovery.watson.cloud.ibm.com/instances/
export DISCOVERY_COLLECTION_ID=
export DISCOVERY_PROJECT=
export DISCOVERY_INSTANCE=
# watsonx
export WATSONX_URL="https://us-south.ml.cloud.ibm.com/ml/v1-beta/generation/text"
export WATSONX_LLM_NAME=google/flan-ul2
export WATSONX_MIN_NEW_TOKENS=1
export WATSONX_MAX_NEW_TOKENS=300
export WATSONX_PROMPT="Document:\n\n<>\n\nQuestion:\n\n<>\n\nAnswer:\n\n"
export WATSONX_PROJECT_ID=
export WATSONX_VERSION="2023-05-29"
# Watsonx deployment (Watson Studio Deployment)
export WATSONX_DEPLOYMENT_ID=
export WATSONX_DEPLOYMENT_URL=https://[XXX]${WATSONX_DEPLOYMENT_ID}[YYY]
export WATSONX_DEPLOYMENT_VERSION=2021-05-01
# Custom Model
export CUSTOM_MODEL_PROMPT="Code:\n\n<>\n\nQuestion:\n\n<>\n\nAnswer:\n\n"
# IBM Cloud
export IBMCLOUD_APIKEY=
# APP
export APP_USER=admin
export APP_APIKEY=admin
export APPLOG=INFO
```
### 3.3 Run `simple-qa-pipeline` server
```sh
cd code
source ./simple-pipeline-env-3/bin/activate
source ./.env
python3 simple-qa-pipeline.py
```
* Open Swagger UI with the API documentation
```sh
open http://localhost:8081/docs
```

### 3.4. Run a small test
* Question: `"What is your name?"`
* Context: `"My name is Thomas."`
* Invoke the REST API endpoint `get_simple_answer` in the Swagger UI with the given values.
* Open the `watsonx Prompt lab` and insert the following prompt.
```
Document:
My name is Thomas.
Question:
What is your name?
Answer:
```
The following gif shows how the simple test works.

### 3.5. Setup the related IBM Cloud instance
This example uses [IBM Cloud](https://cloud.ibm.com/).
### 3.5.1 Watson Discovery service instance
There is no [`Lite plan`](https://www.ibm.com/cloud/free) available, but when you create a new IBM Cloud Account, you a [`free trial period`](https://www.ibm.com/products/watson-discovery), you can use.
* [Create a Watson Discovery instance](https://cloud.ibm.com/docs/discovery-data?topic=discovery-data-getting-started)
* Extract the project ID and collection ID
* If you need additional help, you can visit my blog post [Show the collection IDs of IBM Cloud Watson Discovery projects using cURL](https://suedbroecker.net/2023/05/12/show-the-collection-ids-of-ibm-cloud-watson-discovery-projects-using-curl/)
### 3.5.2 watsonx instance
1. Visit the [watsonx link](https://www.ibm.com/watsonx) and get a free trial.
2. A `Sandbox project` will be created for you called `Sandbox`
3. The watsonx documentation is available on [IBM Cloud](https://dataplatform.cloud.ibm.com/docs/content/wsj/getting-started/welcome-main.html?context=wx&audience=wdp)
4. [Open Prompt lab](https://dataplatform.cloud.ibm.com/docs/content/wsj/analyze-data/fm-prompt-lab.html?context=wx&audience=wdp)
5. Open view code

6. [Create an IBM Cloud API key](https://www.ibm.com/docs/en/app-connect/container?topic=servers-creating-cloud-api-key)
)
* With watsonx sandbox project creation following services will be instantiated in your IBM Cloud Account:
* [Watson Studio](https://www.ibm.com/products/watson-studio)
* [Watson Machine Learning](https://cloud.ibm.com/catalog/services/watson-machine-learning)
* [Cloud Object Storage](https://cloud.ibm.com/docs/cloud-object-storage?topic=cloud-object-storage-getting-started-cloud-object-storage)
The gif below shows how you can access watsonx from your Watson Studion in your IBM Cloud Account.

* Simplified dependencies of the created Watonx environment

## 4. Useful tools
* [`curl-converter`](https://curlconverter.com/)
## 5. Summary
With [`FastAPI`](https://github.com/tiangolo/fastapi) and `Python` was easy and fasted to implement the `simple-qa-pipeline`.
With the automated created Swagger documentation, manually testing the `REST API` for the `simple-qa-pipeline` was easy.
We can download the OpenAPI spec directly and use its `REST API` in other integration scenarios like BYOS with Watson Assistant.
The good REST API documentation from IBM Cloud and watsonx made it easy to use them even without the SDK.
## 6. Additional resources
* [VS Code Python debugging](https://code.visualstudio.com/docs/python/debugging)
* [OpenAPI Swagger `array of strings`](https://stackoverflow.com/questions/39281532/specify-an-array-of-strings-as-body-parameter-in-swagger-api)
* [OpenAPI Swagger `array of objects`](https://stackoverflow.com/questions/63738715/how-to-define-an-array-of-objects-in-openapi-3-0)
* [OpenAPI Swagger `authentication`](https://swagger.io/docs/specification/authentication/)
* [OpenAPI Swagger `apikey authentication`](https://github.com/watson-developer-cloud/assistant-toolkit/blob/master/integrations/extensions/starter-kits/language-model-watsonx/watsonx-openapi.json#L14)
* [Required properties](https://stackoverflow.com/questions/40113049/how-to-specify-if-a-field-is-optional-or-required-in-openapi-swagger)