# Chat with your own data!

An app that lets you chat with your documents (PDFs). It uses an OpenAI LLM through the LangChain framework, with a Gradio interface.

```
.
├── README.md
├── chat.py --> Main app
├── config.conf --> Configuration file that allows one to change the models etc
├── input_docs --> Document directory to store your PDFs
│ └── Python-ML.pdf
├── logs --> Log directory; app logs as well as user logs
├── requirements.txt
├── restart_service_on_changes.py --> Restarts the `chat.py` if there are changes in the `input_docs` folder
└── translator_langchain.py --> Used for supporting the translation of the queries
```

# How it works

`chat.py` is the main app; it connects to the LLM (OpenAI) based on the configuration you provide in `config.conf`.
First, the documents are split according to the configuration, and a database index is created that is later used for querying. The database is a vector database called ChromaDB, which is created the first time the script runs. ChromaDB also requires SQLite3, so make sure it is installed.
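
A rough sketch of that indexing step, assuming LangChain's PDF directory loader, the recursive character splitter, and OpenAI embeddings (module paths vary across LangChain versions, and the chunk sizes here are placeholders -- the real values come from `config.conf`):

```python
# Sketch of the indexing step; not the exact code in chat.py.
from langchain_community.document_loaders import PyPDFDirectoryLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_openai import OpenAIEmbeddings
from langchain_community.vectorstores import Chroma

# Load every PDF found in input_docs
docs = PyPDFDirectoryLoader("input_docs").load()

# Split into chunks (sizes are placeholders; config.conf drives the real values)
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
chunks = splitter.split_documents(docs)

# Create (or reuse) the persisted ChromaDB index backed by OpenAI embeddings
vectordb = Chroma.from_documents(chunks, OpenAIEmbeddings(), persist_directory="chroma_db")
```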

Create a `.env` file and add your OpenAI API key as:

`OPENAI_API_KEY=sk-XYZ`
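
The key is presumably read from `.env` at startup; with the common `python-dotenv` approach that looks roughly like:

```python
# Assuming python-dotenv is used to pick up the key (check requirements.txt)
from dotenv import load_dotenv

load_dotenv()  # loads OPENAI_API_KEY from .env into the process environment
```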

The app uses the Gradio framework; run the following command to install the required libraries:

`pip install -r requirements.txt`

Once the required libraries are installed, start the application with:

`python3 chat.py`

This prints a link that you can paste into your browser to access the app.
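
Under the hood this is roughly a Gradio chat interface being launched (a sketch only; the actual UI wiring lives in `chat.py`, and `answer_question` here is a hypothetical stand-in for the RAG chain):

```python
import gradio as gr

def answer_question(message, history):
    # Placeholder: in chat.py this would query the ChromaDB-backed RAG chain
    return f"You asked: {message}"

demo = gr.ChatInterface(fn=answer_question)
demo.launch()  # prints a local URL, e.g. http://127.0.0.1:7860
```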

`chat.py` in turn uses `translator_langchain.py` to provide the translation feature for queries.
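
A minimal sketch of what such a translation helper might look like (assumed, not the actual contents of `translator_langchain.py`; the model name is a placeholder):

```python
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate

# Ask the LLM to translate a user query into English before retrieval
prompt = ChatPromptTemplate.from_template(
    "Translate the following question into English, preserving its meaning:\n{query}"
)
translate = prompt | ChatOpenAI(model="gpt-3.5-turbo")

english_query = translate.invoke({"query": "¿Qué es el aprendizaje automático?"}).content
```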

If you want the index to be refreshed whenever the contents of `input_docs` change (for example, when a PDF is added or removed), use `restart_service_on_changes.py`. It starts `chat.py` internally, so make sure to close any previously running instance of `chat.py` first.

Note: if you use `restart_service_on_changes.py`, expect a brief service disruption whenever the `input_docs` folder changes, as the script stops `chat.py` momentarily to index the new files.
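
The restart behaviour could be implemented along these lines (a sketch assuming the `watchdog` package; the actual logic is in `restart_service_on_changes.py`):

```python
import subprocess
import time
from watchdog.observers import Observer
from watchdog.events import FileSystemEventHandler

class RestartOnChange(FileSystemEventHandler):
    def __init__(self):
        # Launch the chat app once at startup
        self.proc = subprocess.Popen(["python3", "chat.py"])

    def on_any_event(self, event):
        # Any change in input_docs: stop chat.py and relaunch it so new PDFs get indexed
        self.proc.terminate()
        self.proc.wait()
        self.proc = subprocess.Popen(["python3", "chat.py"])

observer = Observer()
observer.schedule(RestartOnChange(), "input_docs", recursive=True)
observer.start()
try:
    while True:
        time.sleep(1)
finally:
    observer.stop()
    observer.join()
```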