https://github.com/alfonsokan/eskwelabs_chatbot
A RAG chatbot that answers both Eskwelabs bootcamp-specific queries and general bootcamp-related questions.
- Host: GitHub
- URL: https://github.com/alfonsokan/eskwelabs_chatbot
- Owner: alfonsokan
- Created: 2024-09-07T17:23:11.000Z (about 1 year ago)
- Default Branch: main
- Last Pushed: 2024-09-08T16:50:33.000Z (about 1 year ago)
- Last Synced: 2025-08-14T05:03:09.964Z (3 months ago)
- Topics: embeddings, genai, genai-chatbot, ollama, prompt-engineering, retrieval-augmented-generation, streamlit, vector-database
- Language: Python
- Homepage: https://askwelabscapstoneproject.streamlit.app/
- Size: 3.34 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.MD
# Eskwelabs Chatbot
This project focuses on the end-to-end development of a Q&A Chatbot tailored to answer bootcamp-related queries, specifically for Eskwelabs. The methodology covers the key steps, including knowledge base embedding, Retrieval-Augmented Generation (RAG) chatbot development using LangChain, and deployment via Streamlit.

**Disclaimer**: This guide demonstrates how to build your own chatbot using `Llama3.1`. However, `Llama3.1` struggles to make multiple tool calls in a single turn, a task that `GPT-3.5 Turbo` handles more reliably, so `GPT-3.5 Turbo` is recommended for better results. You can try the chatbot powered by GPT-3.5 Turbo [here](https://askwelabscapstoneproject.streamlit.app/).
## Tech Stack
- ChromaDB: Vector Store
- LangChain: Chatbot Framework
- Llama3.1: Large Language Model
- text-embedding-ada-002: Embedding Model
- SemanticChunker: Chunking Strategy
## Installation
1. Clone the repository
```bash
git clone https://github.com/alfonsokan/eskwelabs_chatbot.git
```
2. Install libraries
```bash
pip install -r requirements.txt
```
3. Install an open-source LLM using Ollama. Refer to the [Ollama documentation](https://github.com/ollama/ollama) to select an LLM, then run the following command in a terminal:
```bash
ollama run llama3.1
```
4. From the root of the repository, open a terminal and run:
```bash
streamlit run app.py
```
## Methodology
The flow chart below displays the 4-step approach to developing the chatbot.

**1. Data Preparation**
For this step, the documents are embedded and stored in the vector database `embeddings_deployment_sentencetransformer`, located in this repository.
If interested, the code for embedding the documents can be viewed [here](https://colab.research.google.com/drive/1iyz_SkHv7TVDgKJBuRYb1iTtVfxDyGU0?usp=sharing).
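For a rough idea of what that pipeline involves, here is a minimal sketch of chunking and embedding with the stack listed above. The source file path is hypothetical, an `OPENAI_API_KEY` is assumed to be set in the environment, and the persist directory simply mirrors the repository's store name:
```python
from langchain_experimental.text_splitter import SemanticChunker
from langchain_openai import OpenAIEmbeddings
from langchain_community.vectorstores import Chroma

# text-embedding-ada-002 is the embedding model listed in the tech stack
embedding_function = OpenAIEmbeddings(model="text-embedding-ada-002")

with open("knowledge_base.txt") as f:  # hypothetical source document
    raw_text = f.read()

# SemanticChunker splits on semantic similarity between sentences
# rather than on fixed character counts
chunker = SemanticChunker(embedding_function)
docs = chunker.create_documents([raw_text])

# Persist the embedded chunks to a local Chroma store
vectordb = Chroma.from_documents(
    docs,
    embedding=embedding_function,
    persist_directory="embeddings_deployment_sentencetransformer",
)
```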
**2. Retriever Generation**
- Two retriever tools, an Eskwelabs Info Retriever and a General Bootcamp Info Retriever, are created from the embedded knowledge base.

- Another retriever is optionally added when a user submits their resume to the chatbot. A sketch of how these tools might be built is shown below.
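The repository's helpers `create_db_retriever_tools` and `resume_retriever_tool` are not reproduced in this README; the following is an assumed sketch of their shape using LangChain's `create_retriever_tool`. The tool names, descriptions, PDF-only loader, and the shared `embedding_function` are illustrative, not the author's exact code:
```python
from langchain.tools.retriever import create_retriever_tool
from langchain_community.document_loaders import PyPDFLoader
from langchain_community.vectorstores import Chroma

def create_db_retriever_tools(vectordb):
    # Both tools draw on the same store here; the real code may use
    # separate collections or metadata filters to split the two domains.
    retriever = vectordb.as_retriever(search_kwargs={"k": 4})
    eskwelabs_tool = create_retriever_tool(
        retriever,
        name="eskwelabs_bootcamp_info_search",
        description="Answers questions about the Eskwelabs bootcamp.",
    )
    general_tool = create_retriever_tool(
        retriever,
        name="bootcamp_vs_alternatives_search",
        description="Answers general questions comparing bootcamps to alternatives.",
    )
    return eskwelabs_tool, general_tool

def resume_retriever_tool(file_path):
    # Only the PDF path is sketched; txt/docx uploads need their own loaders
    docs = PyPDFLoader(file_path).load()
    resume_db = Chroma.from_documents(docs, embedding=embedding_function)
    return create_retriever_tool(
        resume_db.as_retriever(),
        name="resume_search",
        description="Searches the user's uploaded resume.",
    )
```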

**3. Tool-calling Agent Creation**
Three parameters are needed to instantiate a tool-calling agent:
- List of retriever tools
```python
resume = st.file_uploader("Upload File", type=['txt', 'docx', 'pdf'])

# If a resume is uploaded, include the resume retriever as a tool
if resume is not None:
    with open(resume.name, "wb") as f:
        f.write(resume.getbuffer())
    resume_tool = resume_retriever_tool(resume.name)
    eskwelabs_bootcamp_info_search_tool, bootcamp_vs_alternatives_search_tool = create_db_retriever_tools(vectordb)
    tools = [resume_tool, eskwelabs_bootcamp_info_search_tool, bootcamp_vs_alternatives_search_tool]
# If no resume is uploaded, omit the resume retriever
else:
    eskwelabs_bootcamp_info_search_tool, bootcamp_vs_alternatives_search_tool = create_db_retriever_tools(vectordb)
    tools = [eskwelabs_bootcamp_info_search_tool, bootcamp_vs_alternatives_search_tool]
```
- LLM (Llama 3.1)
```python
from langchain_ollama import ChatOllama

llm = ChatOllama(
    model="llama3.1",
    temperature=0.1,   # low temperature for more deterministic answers
    num_predict=350,   # cap on generated tokens
    verbose=True,
)
```
- Prompt passed to the chatbot
```python
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder

prompt = ChatPromptTemplate(
    messages=[
        MessagesPlaceholder(variable_name='chat_history'),
        ('system', "You're a helpful assistant who provides concise, complete answers without getting cut off mid-statement. Stick strictly to the user's questions, avoiding any unnecessary details."),
        ('human', '{input}'),
        MessagesPlaceholder(variable_name='agent_scratchpad'),
    ]
)
```
Afterwards, the tool-calling agent can be instantiated:
```python
from langchain.agents import create_tool_calling_agent, AgentExecutor

agent = create_tool_calling_agent(llm, tools, prompt)
agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True, handle_parsing_errors=True)
```
**4. Response Generation**
- Pass the tool-calling agent, the user input, and the chat history to generate a response.
```python
def process_chat(agent_executor, user_input, chat_history):
    response = agent_executor.invoke(
        {
            'input': user_input,
            'chat_history': chat_history,
        }
    )
    return response['output']
```
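For example, a hypothetical first-turn invocation (the question text is illustrative):
```python
chat_history = []  # no prior turns retrieved yet
answer = process_chat(agent_executor, "What topics does the Eskwelabs bootcamp cover?", chat_history)
print(answer)
```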
## Recommendations
- Explore output quality using a ReAct agent instead of a tool-calling agent. Develop a ReAct prompt that enables the LLM to generate reasoning traces before taking action on a task (see the sketch after this list).
- Explore different chunking strategies and embedding models.
- Connect the chatbot to a third-party database (e.g., Upstash Redis) to allow long-term storage of chat history.
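As a starting point, here is a minimal sketch of swapping in a ReAct agent, reusing the `llm` and `tools` defined above and assuming the public `hwchase17/react` prompt from the LangChain Hub:
```python
from langchain import hub
from langchain.agents import create_react_agent, AgentExecutor

# Pull a standard ReAct prompt that interleaves Thought/Action/Observation steps
react_prompt = hub.pull("hwchase17/react")

react_agent = create_react_agent(llm, tools, react_prompt)
react_executor = AgentExecutor(agent=react_agent, tools=tools, verbose=True, handle_parsing_errors=True)
```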
## Appendix
### Chatbot's Selective History Retrieval Mechanism

- Each past user query and chatbot response is stored in a temporary vector store.
- For each new user query, only the most relevant parts of that store are retrieved and passed as chat history to the chatbot.
- Importance: this reduces token consumption by passing only the relevant parts of the chat history to the LLM instead of the full conversation.
- Code snippet of app with chat history implemented:
```python
# Initialize session state on the first run
if "messages" not in st.session_state:
    st.session_state.messages = []
if "chat_history" not in st.session_state:
    st.session_state.chat_history = []
if "unique_id" not in st.session_state:
    st.session_state.unique_id = 0
if "chat_history_vector_store" not in st.session_state:
    st.session_state.chat_history_vector_store = None
if "fed_chat_history" not in st.session_state:
    st.session_state.fed_chat_history = []

# Display chat messages from history on app rerun
for message in st.session_state.messages:
    with st.chat_message(message["role"]):
        st.markdown(message["content"])

# React to user input
if user_input := st.chat_input("Say something"):
    # Display user message in chat message container
    with st.chat_message("human"):
        st.markdown(user_input)
    # Add user message to chat history
    st.session_state.messages.append({"role": "user", "content": user_input})

    if st.session_state.chat_history_vector_store:
        # Retrieve only the stored turns most relevant to the new query
        results = st.session_state.chat_history_vector_store.similarity_search(
            query=user_input,
            k=4,
            filter={'use_case': 'chat_history'}
        )
        sequenced_chat_history = [
            (parse_message(doc.metadata['msg_element']), doc.metadata['msg_placement'])
            for doc in results
        ]
        # Sort numerically by placement so retrieved messages stay in conversation order
        sequenced_chat_history.sort(key=lambda pair: int(pair[1]))
        st.session_state.fed_chat_history = [message[0] for message in sequenced_chat_history]

    # Chatbot response
    response = process_chat(agent_executor, user_input, st.session_state.fed_chat_history)
    st.session_state.chat_history.append(HumanMessage(content=user_input))
    st.session_state.chat_history.append(AIMessage(content=response))
    formatted_human_message = format_message(HumanMessage(content=user_input))
    formatted_ai_message = format_message(AIMessage(content=response))

    # Display assistant response in chat message container
    with st.chat_message("assistant"):
        st.markdown(response)
    # Add assistant response to chat history
    st.session_state.messages.append({"role": "assistant", "content": response})

    # Add the last two messages (HumanMessage and AIMessage) to the vector store
    if st.session_state.chat_history_vector_store:
        # The store reuses the embedding function it was initialized with
        st.session_state.chat_history_vector_store.add_texts(
            texts=[st.session_state.chat_history[-2].content, st.session_state.chat_history[-1].content],
            ids=[str(st.session_state.unique_id), str(st.session_state.unique_id + 1)],
            metadatas=[
                {'msg_element': formatted_human_message, 'msg_placement': str(st.session_state.unique_id), 'use_case': 'chat_history'},
                {'msg_element': formatted_ai_message, 'msg_placement': str(st.session_state.unique_id + 1), 'use_case': 'chat_history'}
            ]
        )
        st.session_state.unique_id += 2
    else:
        # Initialize the vector store with the last two messages
        st.session_state.chat_history_vector_store = Chroma.from_texts(
            texts=[st.session_state.chat_history[-2].content, st.session_state.chat_history[-1].content],
            ids=[str(st.session_state.unique_id), str(st.session_state.unique_id + 1)],
            metadatas=[
                {'msg_element': formatted_human_message, 'msg_placement': str(st.session_state.unique_id), 'use_case': 'chat_history'},
                {'msg_element': formatted_ai_message, 'msg_placement': str(st.session_state.unique_id + 1), 'use_case': 'chat_history'}
            ],
            embedding=embedding_function
        )
        st.session_state.unique_id += 2

    # After embedding the turn into the vector store, clear chat_history before the end of the loop
    st.session_state.chat_history = []
```
### Knowledge Base Embedding
The code for embedding the knowledge base can be found [here](https://colab.research.google.com/drive/1iyz_SkHv7TVDgKJBuRYb1iTtVfxDyGU0?usp=sharing).
### LangSmith Tracing
LangSmith can be a useful tool for debugging the chatbot application. To trace its runs, do the following:
- Create a LangSmith account
- Retrieve an API key
- Create a `.env` file with the following variables:
```bash
LANGCHAIN_TRACING_V2=true
LANGCHAIN_API_KEY='ENTER_API_KEY_HERE'
LANGCHAIN_PROJECT='PROJECT_NAME'
```
- In the `app.py` file, make sure environment variables are loaded properly.
```python
from dotenv import load_dotenv
load_dotenv()
```
- LangSmith can also be used to check whether the LLM calls the appropriate tool(s) for a given prompt.

