{"id":15165806,"url":"https://github.com/mauroandretta/wikirag","last_synced_at":"2025-10-23T17:49:43.754Z","repository":{"id":254806947,"uuid":"846425089","full_name":"MauroAndretta/WikiRag","owner":"MauroAndretta","description":"WikiRag is a Retrieval-Augmented Generation (RAG) system designed for question answering, it reduces hallucination thanks to the RAG architecture. It leverages Wikipedia content as a knowledge base.","archived":false,"fork":false,"pushed_at":"2024-08-27T14:24:07.000Z","size":1180,"stargazers_count":5,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-04-07T06:13:27.719Z","etag":null,"topics":["ai","genai-chatbot","langchain","llama3","ml","qdrant-vector-database","rag"],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/MauroAndretta.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-08-23T07:17:43.000Z","updated_at":"2025-03-29T16:02:11.000Z","dependencies_parsed_at":"2024-08-27T15:06:41.509Z","dependency_job_id":null,"html_url":"https://github.com/MauroAndretta/WikiRag","commit_stats":{"total_commits":12,"total_committers":2,"mean_commits":6.0,"dds":0.08333333333333337,"last_synced_commit":"b5bcb48ffb45416b639e1a3bc98d49b93ead88a9"},"previous_names":["mauroandretta/wikirag"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/MauroAndretta/WikiRag","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/MauroAndretta%2FWikiRag","
tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/MauroAndretta%2FWikiRag/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/MauroAndretta%2FWikiRag/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/MauroAndretta%2FWikiRag/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/MauroAndretta","download_url":"https://codeload.github.com/MauroAndretta/WikiRag/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/MauroAndretta%2FWikiRag/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":269683186,"owners_count":24458628,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-08-10T02:00:08.965Z","response_time":71,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ai","genai-chatbot","langchain","llama3","ml","qdrant-vector-database","rag"],"created_at":"2024-09-27T04:03:05.013Z","updated_at":"2025-10-23T17:49:38.714Z","avatar_url":"https://github.com/MauroAndretta.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"# WikiRag\n\n## Overview\n\nWikiRag is a Retrieval-Augmented Generation (RAG) system designed for question answering, it reduces hallucination thanks to the RAG architecture. 
It leverages Wikipedia content as a knowledge base.\n\n![WikiRag](images/wikirag.gif)\n\n## Table of Contents\n- [Code Directory](#code-directory)\n- [WikiRag: Conversational RAG with Wikipedia Knowledge Base](#wikirag-conversational-rag-with-wikipedia-knowledge-base)\n  - [How It Works](#how-it-works)\n  - [Example Usage](#example-usage)\n  - [Ask a Question](#ask-a-question)\n  - [Evaluation of WikiRag](#evaluation-of-wikirag)\n    - [Key Elements of Evaluation](#key-elements-of-evaluation)\n    - [Focus of the Evaluation Notebook](#focus-of-the-evaluation-notebook)\n    - [Performance of WikiRag with and without Web Search](#performance-of-wikirag-with-and-without-web-search)\n    - [Web Search for Enhanced Context](#web-search-for-enhanced-context)\n  - [WikiRag Q\u0026A System: Streamlit Application](#wikirag-qa-system-streamlit-application)\n- [Vectorization Pipeline](#vectorization-pipeline)\n  - [Prerequisites](#prerequisites)\n  - [Pipeline Overview](#pipeline-overview)\n  - [Running the Pipeline](#running-the-pipeline)\n    - [Using a Python Automation Script with Invoke](#using-a-python-automation-script-with-invoke)\n- [Qdrant](#qdrant)\n  - [Potential Problem](#potential-problem)\n  - [Solution](#solution)\n- [Downloading a LLaMA Model Locally Using `ollama`](#downloading-a-llama-model-locally-using-ollama)\n  - [Prerequisites](#prerequisites-1)\n  - [Installation of `ollama`](#installation-of-ollama)\n  - [Downloading a LLaMA Model](#downloading-a-llama-model)\n\n\n### Code directory\n```\n└──WikiRag\n   ├─── `app`: the streamlit app\n   ├─── `conda`: all the conda environments\n   ├─── `data`:  all the data\n   ├─── `images`: all the images if Any\n   ├─── `notebooks`: all the notebooks if Any\n   ├─── `vectorization_pipeline`: all the vectorization pipeline scripts\n   ├─── `wiki_rag`: the WikiRag class\n   ├─── `.gitignore`\n   ├─── `README.md`\n   ├─── `wikipedia_urls.txt`: the txt file with all the urls \n   └─── `requirements.txt`\n   \n```\n## 
WikiRag: Conversational RAG with Wikipedia Knowledge Base\n\nThe `WikiRag` class provides a framework for building a conversational AI system that uses Wikipedia articles as its knowledge base. It integrates components such as `Ollama`, `HuggingFaceEmbeddings`, and `Qdrant` to answer user queries using context retrieved from Wikipedia.\n\n### How It Works\n\n- **Qdrant Integration**: The class connects to a [Qdrant](#qdrant) vector database, which contains vectorized Wikipedia articles.\n- **Embedding Model**: `HuggingFaceEmbeddings` is used to convert queries into embeddings, which are then matched against the vectors in the Qdrant collection.\n- **Retriever**: The vector store acts as a retriever, fetching the most relevant documents for the query.\n- **Chain Construction**: A processing chain is built that retrieves relevant documents and generates answers using the `Ollama` model.\n- **Web Search Integration**: If the context retrieved from the knowledge base is insufficient, the system expands it by performing a web search. The search uses the DuckDuckGo search engine to find additional relevant information on the web.\n\n### Example Usage\n\nTo try the library, use the notebook `notebooks/wiki_rag.ipynb`.\n\n```python\nfrom wiki_rag import WikiRag\n\n# Initialize the WikiRag class\nwiki_rag = WikiRag(\n    qdrant_url=\"http://localhost:6333\",\n    qdrant_collection_name=\"olympics\"\n)\n```\n\n### Ask a question\n```python\n# The question, in Italian, asks which city hosted the first modern Summer Olympic Games and in what year.\nresponse = wiki_rag.invoke(\"Quale città ospitò i primi Giochi Olimpici estivi dell’età moderna? In che anno?\")\nprint(response)\n# La città che ospitò i primi Giochi Olimpici estivi dell'età moderna fu Atene, in Grecia, nel 1896.\n# (The first modern Summer Olympic Games were hosted by Athens, Greece, in 1896.)\n```\n\n## Evaluation of WikiRag\n\nTo assess how well the WikiRag system performs, an evaluation process is provided in the notebook `evaluate_wiki_rag.ipynb`. 
This notebook guides you through the evaluation of the main components of the RAG (Retrieval-Augmented Generation) application, focusing on generation aspects.\n\n### Key Elements of Evaluation\n\nThe main elements to evaluate in a RAG application like WikiRag are:\n\n- **Retrieval**: This involves experimenting with different data processing strategies, embedding models, and other factors to see how they impact the retrieval performance. The goal is to identify the settings that retrieve the most relevant documents from the knowledge base.\n\n- **Generation**: Once the best retrieval settings are determined, the next step is to experiment with different large language models (LLMs) to find the best model for generating accurate and contextually relevant answers.\n\n### Focus of the Evaluation Notebook\n\nIn the `evaluate_wiki_rag.ipynb` notebook, the [evaluation](https://www.mongodb.com/developer/products/atlas/evaluate-llm-applications-rag/) is centered on the overall generative performance of the WikiRag class, with particular focus on the following aspects:\n\n- **Answer Semantic Similarity**: This metric measures how semantically similar the generated answer is to the ground truth. A higher score indicates that the model's answer closely aligns with the intended meaning of the correct answer.\n\n- **Answer Correctness**: This metric evaluates the factual accuracy of the generated answer in comparison to the ground truth. 
It assesses whether the information provided by the model is correct.\n\nTo facilitate this evaluation, we use the [Ragas](https://docs.ragas.io/en/stable/) library, which provides a framework for assessing the quality of RAG systems.\n\nBy following the evaluation steps outlined in the notebook, you can systematically assess the performance of the WikiRag system, identify areas for improvement, and tune the system to better meet your application's needs.\n\n### Performance of WikiRag with and without Web Search\n\nThe WikiRag system performs better when web search is used to expand the context for answering queries:\n\n- **Without Web Search**:\n  - **Answer Similarity**: 0.7464\n  - **Answer Correctness**: 0.2335\n\n- **With Web Search**:\n  - **Answer Similarity**: 0.8167\n  - **Answer Correctness**: 0.3875\n\nWith web search enabled, answer similarity rises from 0.7464 to 0.8167 and answer correctness from 0.2335 to 0.3875, so the expanded context improves both the semantic similarity and the factual accuracy of the generated answers.\n\n### Performance of WikiRag with and without Web Search (Improved Prompt)\n\n- **Without Web Search**:\n  - **Answer Similarity**: 0.7464\n  - **Answer Correctness**: 0.2335\n\n- **With Web Search**:\n  - **Answer Similarity**: 0.9203\n  - **Answer Correctness**: 0.6358\n\nWith the improved prompt and web search enabled, answer similarity reaches 0.9203 and answer correctness 0.6358. This shows that a tailored prompt is an essential step in GenAI tasks: prompt engineering techniques are fundamental to improving the performance and capabilities of AI systems. \n\n### Web Search for Enhanced Context\n\nTo enhance the context available for generating responses, the WikiRag class includes a method that searches the web when the knowledge base does not provide sufficient information. This method uses DuckDuckGo to perform the search and integrates the additional context into the response generation process. 
Web search is enabled by default. To disable it, initialize the WikiRag object with `expand_context=False`:  \n```python\nfrom wiki_rag import WikiRag\n\n# Initialize the WikiRag class with the web search disabled\nwiki_rag = WikiRag(\n    qdrant_url=\"http://localhost:6333\",\n    qdrant_collection_name=\"olympics\",\n    expand_context=False\n)\n```\n## WikiRag Q\u0026A System: Streamlit Application\n\nThe `WikiRag Q\u0026A System` is an interactive web application built with Streamlit that lets users ask questions about the underlying knowledge base and receive answers generated by the `WikiRag` class.\n\n### Features\n\n- **Interactive Q\u0026A Interface**: Users can input questions related to the Olympic Games, and the system provides answers by leveraging a knowledge base and optional web context.\n- **Real-Time Response**: The app processes user queries in real time and displays the response in a user-friendly interface.\n- **Question History**: The app keeps track of all questions asked during the session and displays a history for easy reference.\n\n### How to Run the App\n\nTo run the `WikiRag Q\u0026A System` Streamlit application, follow these steps:\n\n1. **Install Dependencies**:\n   Ensure that Streamlit and all other required packages are installed in your environment. You can use either of the following:\n   - The preconfigured `conda/wiki_rag.yaml` file, to create a conda environment with all the packages needed to run the app\n   - The `requirements.txt` file, to install all the necessary packages with `pip install -r requirements.txt`\n\n2. **Run the application**:\n\nMove into the `app` folder and run: \n\n```bash\ncd app\nstreamlit run app.py\n```\n\n3. 
**Access the Application**: \nOnce the app is running, open your web browser and go to http://localhost:8501 to access the Q\u0026A system.\n\n### Screenshot \n![streamlit](images/app.png)\n\n## Vectorization Pipeline\n\nThe Vectorization Pipeline is a series of automated steps that process Wikipedia pages, split the content into manageable chunks, generate embeddings for each chunk, and load them into a [Qdrant](#qdrant) vector database. This pipeline transforms raw Wikipedia data into a structured format that can be used for search and retrieval tasks.\n\n### Prerequisites\n\nBefore running the pipeline, ensure you have the following:\n\n1. **Python 3.7+** installed on your system.\n2. **Qdrant instance** up and running. Refer to the [Qdrant](#qdrant) section for details on how to set up Qdrant.\n\n### Pipeline Overview\n\nThe pipeline consists of three main steps:\n\n1. **Processing Wikipedia Pages**: Acquires and cleans the text from Wikipedia pages, removing stopwords and other unnecessary elements.\n2. **Chunking the Processed Content**: Splits the cleaned text into smaller chunks and generates vector embeddings using the `SentenceTransformer`.\n3. **Loading Chunks into Qdrant**: Inserts the generated chunks into a Qdrant collection as vector points.\n\n### Running the Pipeline\n\n1. Execute the three Python scripts sequentially, optionally modifying the input and output parameters:\n  - `document_acquisition.py`\n  - `wikipedia_chunker.py`\n  - `qdrant_loader.py`\n\nRemember to start [Qdrant](#qdrant) first, otherwise the points cannot be loaded into the vector database. \n\n2. Alternatively, you can run the entire pipeline in one go using the script `vectorization_pipeline/tasks.py`.\n\n#### Using a Python Automation Script with Invoke\n\nFor a more flexible and cross-platform solution, you can use a Python script with the `Invoke` library:\n\n1. 
Install `Invoke`, or use `wiki_rag.yaml` directly to create the `wiki_rag` environment, which already contains all the packages needed to run the document vectorization pipeline:\n\n    ```bash\n    pip install invoke\n    ```\n\n2. Run the pipeline with the following command (running it from the root directory of this project is recommended, to avoid problems with paths):\n\n    ```bash\n    python -m invoke --search-root vectorization_pipeline full-vectorization-pipeline\n    ```\n\nThe parameters of the vectorization pipeline can also be customized; see `vectorization_pipeline/tasks.py` for how to do that. \n\n####  Qdrant\n\nTo load the chunks into Qdrant, you need an instance of Qdrant up and running. Qdrant is a vector database optimized for handling embeddings and can be used for similarity search, nearest neighbor search, and other tasks.\n\nTo set up Qdrant locally, follow the instructions in the [Qdrant](#qdrant) section. \n\n\n#### Conclusions\n\nThis Vectorization Pipeline simplifies the process of extracting, processing, chunking, and storing Wikipedia data in a Qdrant vector database. By following the steps outlined in this section, you can quickly deploy the pipeline and begin using Qdrant for search and retrieval tasks.\n\n\n## Qdrant \n\nQdrant is an open-source, high-performance vector database designed for handling large-scale search and similarity queries. This section explains how to set up [Qdrant](https://qdrant.tech/documentation/quickstart/) locally using [Docker](https://docs.docker.com/manuals/).\n\n1. [Install Docker Engine](https://docs.docker.com/desktop/install/windows-install/)\n\n    1.1 Once installed, you can test it by running the command below:\n        `docker version`\n\n2. Pull the Qdrant image hosted on Docker Hub:\n   `docker pull qdrant/qdrant`\n\n3. Run the Qdrant instance locally by running the command below in your terminal. 
It is suggested to run the command below from the root directory of this repository. \n\n   ```bash\n   docker run -p 6333:6333 -v $(pwd):/qdrant/storage qdrant/qdrant\n   ```\n\n### Potential Problem\n\nYou might receive this error message: \n```bash\ndocker: invalid reference format: repository name\n```\nThis error is caused by the way the volume is specified when running Docker in a Windows environment. The `$(pwd)` substitution, which expands to the current directory in Unix-style shells, may not produce a path that Docker accepts on Windows.\n\n### Solution \n\n1. Find the absolute path of your project directory.\n```bash\npwd\n```\n   The output might look something like this:\n```bash\n/c/your_path/WikiRag\n```\n\n2. Convert the Unix-style path to a Windows-style path.\n\nReplace `/c/` with `C:/`, and make sure the remaining separators are forward slashes (`/`), not backslashes (`\\`):\n\n```bash\nC:/your_path/WikiRag\n```\n\n3. Use this absolute path instead of `$(pwd)` to launch the Qdrant vector store. \n\n```bash\ndocker run -p 6333:6333 -v C:/your_path/WikiRag:/qdrant/storage qdrant/qdrant\n```\n\n   4. 
A successful run will look like below:\n   ```bash\n    2024-08-23 10:02:09            _                 _    \n    2024-08-23 10:02:09   __ _  __| |_ __ __ _ _ __ | |_  \n    2024-08-23 10:02:09  / _` |/ _` | '__/ _` | '_ \\| __| \n    2024-08-23 10:02:09 | (_| | (_| | | | (_| | | | | |_  \n    2024-08-23 10:02:09  \\__, |\\__,_|_|  \\__,_|_| |_|\\__| \n    2024-08-23 10:02:09     |_|                           \n    2024-08-23 10:02:09 \n    2024-08-23 10:02:09 Version: 1.11.0, build: 63363956\n    2024-08-23 10:02:09 Access web UI at http://localhost:6333/dashboard\n    2024-08-23 10:02:09 \n    2024-08-23 10:02:09 2024-08-23T08:02:09.168280Z  INFO storage::content_manager::consensus::persistent: Initializing new raft state at ./storage/raft_state.json    \n    2024-08-23 10:02:09 2024-08-23T08:02:09.292041Z  INFO qdrant: Distributed mode disabled    \n    2024-08-23 10:02:09 2024-08-23T08:02:09.293079Z  INFO qdrant: Telemetry reporting enabled, id: 77988a47-9fcc-4dc7-8ea3-f63a6ee99d05    \n    2024-08-23 10:02:09 2024-08-23T08:02:09.304312Z  INFO qdrant::actix: TLS disabled for REST API    \n    2024-08-23 10:02:09 2024-08-23T08:02:09.305016Z  INFO qdrant::actix: Qdrant HTTP listening on 6333    \n    2024-08-23 10:02:09 2024-08-23T08:02:09.305072Z  INFO actix_server::builder: Starting 19 workers\n    2024-08-23 10:02:09 2024-08-23T08:02:09.305083Z  INFO actix_server::server: Actix runtime found; starting in Actix runtime\n    2024-08-23 10:02:09 2024-08-23T08:02:09.305291Z  INFO qdrant::tonic: Qdrant gRPC listening on 6334    \n    2024-08-23 10:02:09 2024-08-23T08:02:09.305344Z  INFO qdrant::tonic: TLS disabled for gRPC API\n   ```\n\n  Do note that the TLS is disabled and so you can access the dashboard on the http://localhost:6333/dashboard. If you were to visit this url then you can view the qdrant dashboard.\n\n5. 
Test the Qdrant vector store using the notebook `notebooks/test_qdrant_client.ipynb`\n\n\n## Downloading a LLaMA Model Locally Using `ollama`\n\nThis section guides you through downloading a LLaMA large language model locally using the `ollama` CLI. The `ollama` CLI provides an easy way to manage, download, and run large language models on your local machine.\n\n### Prerequisites\n\nBefore you begin, ensure that you have the following:\n\n1. **Ollama CLI Installed**:\n    - You need to have the `ollama` CLI installed on your system. You can download and install it from the official [Ollama website](https://ollama.com/).\n\n### Installation of `ollama`\n\nTo install `ollama`, follow these steps:\n\n1. **Download and Install `ollama`**:\n    - Visit the [Ollama installation page](https://ollama.com/download) and follow the installation instructions specific to your operating system (Windows, macOS, or Linux).\n\n2. **Verify Installation**:\n    - After installation, open your terminal (Command Prompt, PowerShell, or Bash) and verify that `ollama` is installed correctly by running:\n\n      ```bash\n      ollama --version\n      ```\n\n### Downloading a LLaMA Model\n\nOnce `ollama` is installed, you can use it to download a LLaMA model locally.\n\n1. **List Available Models**:\n    - To see the models already downloaded to your machine, use the following command:\n\n      ```bash\n      ollama list\n      ```\n    - To browse the models available for download, refer to the [GitHub page](https://github.com/ollama/ollama#model-library), which lists the model library. \n\n2. **Download a Specific LLaMA Model**:\n    - To download a specific LLaMA model, use the `ollama pull` command followed by the model's name. For example, to download the LLaMA3.1-8B model:\n\n      ```bash\n      ollama pull llama3.1\n      ```\n    - This command downloads the model to your local machine, making it available for use in your projects.\n\n3. 
**Verify the Download**:\n    - After the download is complete, you can verify that the model has been downloaded by listing the installed models:\n    \n      ```bash\n      ollama list\n      ```\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmauroandretta%2Fwikirag","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmauroandretta%2Fwikirag","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmauroandretta%2Fwikirag/lists"}