Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Awesome Lists | Featured Topics | Projects

https://github.com/hiper2d/ai-llm-playground

Chatbot web-applications with LLM, OpenAI API Assistants, LangChain, vector databases, and other AI stuff
https://github.com/hiper2d/ai-llm-playground

chatbot gpt-4 langchain mongodb-atlas openai openai-assistants python vector-database

Last synced: 3 months ago
JSON representation

Chatbot web-applications with LLM, OpenAI API Assistants, LangChain, vector databases, and other AI stuff

Awesome Lists containing this project

README

        

# ai-llm-playground
Experiments with OpenAI Assistant API, Langchain, Embedding, and Agents

### Table of content
- [Restaurant Advisor (OpenAI Assistant version)](#restaurant_advisor)
- [Web Scraper based on Vision API](#web_scraper)
- [Chat with PDF documents using OpenAI Assistant API](#chat_with_pdf)
- [Restaurant Advisor (Outdated Langchain + Redis version)](#restaurant_advisor_langchain)
- [AI Girlfriend](#ai_girlfriend)
- [Chat with Multiple Documents (Outdated Langchain version)](#chat_with_pdf_langchain)
- [Setup](#setup)

# Restaurant Advisor (OpenAI Assistant version)

This is the continuation of the [Restaurant Advisor (outdated Langchain + Redis version)](#restaurant_advisor_langchain) project.
I decided to get rid of Langchain and switch to the native OpenAI API. There are few reasons for this:
- OpenAI API now supports agents out of the box (Assistants + Threads). This is basically all I need in my development
- I prefer controllable low level solutions over magical boxes. It is hard to override standard LangChain Agent's behaviour (prompts, output parsers) when you face some limitations. I found it easier and more flexible to write my own custom code rather than using predefined retrieval tools.
- I got tired of dealing with issues after updating GPT models and libraries

So, It's pure OpenAI API now.

- I use an Assistant API with custom tools: vector semantic search with location pre-filtering (MongoDb Atlas) and image generation (DALL-E 3)
- I don't need Redis for conversation history anymore because OpenAI Threads can do the same
- I use the latest `GPT-4 Turbo` model
- I use OpenAI voice generation

The core of this project is the Assistant with few tools. It is capable of doing the following:
- keep the conversation with the user, suggest restaurants and dishes from the database
- understand when to query the database and come up with queries to find the best restaurants nearby and use the result in the conversation
- understand when a user wants to see a particular dish and generate an image of it using DALL-E 3
- reliably reply in JSON format even though Assistant API doesn't support the JSON output format

![advisor-chat-example.png](images/advisor-chat-example-1.png)

Examples of the generated images during a conversation:

I'll add more details about how to create the database with indexes in MongoDb Atlas and how to deploy this to AWS later.
I plan to create some architectural diagrams as well. Even thought there is not so much to architect here, but still. There are tools and some tricks with location pre-filtering which require some explanation to those who want to do the same.

# Web Scraper based on Vision API

Credits go to [gpt4v-browsing](https://github.com/unconv/gpt4v-browsing) repo as a source of ~~copy-paste~~ inspiration. The idea is nice and simple: let's make a screenshot of a web page, then ask OpenAI Vision API to recognize stuff on it. And then answer to my questions regarding the page content. And maybe even navigate through pages and click buttons for collecting more data before answering. I always hated scraping autogenerated HTML pages, this is a super nice alternative. If it works, of course. To be honest, I'm not so sure about it. Let's find out.

Few problems I met:
- Vision API is not so good at answering questions about the text page content. For some reason, the result is better if you first ask the Vision API to extract some specific text from the screenshot, then ask the Text Completion API to answer to your question using the text from the previous step.
- Vision API refuses to recognize a lot of text on the page. So it is not possible to ask it to extract all the text. You have to be specific and ask to extract only some text. Ideally, the Vision API prompt should also be constructed out of the original question.
- Vision API cannot extract all the related text. For example, when i ask to give me all the horror movies it sees on the page, it never gives all of them.

What I have in mind for this project so far:
- Give a URL and a task to the agent. For example, some cinema site and a task to find a movie to watch tonight.
- The agent should be able to take a screenshot of the root page, come up with the right vision prompt, extract some text and make a decision if it want to navigate further or not.
- To make the navigation possible, the agent should be able to recognize links on the page.
- It would be nice to intake into account an IMDB film rating before suggestion a movie to watch. Thus, this getting a rating though API call should be one of the agent's tool.
- I'm going to use the Puppeteer js library to take screenshots and navigate through pages. This might be tricky to integrate a js library into Python code. I'll see how it goes.

More stuff coming later

Project Setup is in a separate [web-scraper/README.md](web-scraper/README.md) file.

# Chat with PDF documents using OpenAI Assistant API

This is a better version of the [Chat with Multiple Documents (Outdated Langchain version)](#chat_with_pdf_langchain) because it uses native OpenAI API and Assistant API with the latest model. No need to parse PDFs manually and upload their text content into vector stores. It is all done on the OpenAI side

![summarizer_1](images/summarize/summarizer_1.png)

This agent wants to be convinced:

![summarizer_2](images/summarize/summarizer_2.png)

# Restaurant Advisor (Outdated Langchain + Redis version)

This chatbot is aware of restaurant database in MongoDB and is capable of finding the best one nearby. It combines vector semantic search with geo-location MongoDb Atlas Index search. It keeps the chatbot conversation history in Redis. It is quite awesome, the most advanced AI project I did so far.

I chose MongoDb as a vector store because of multiple reasons:
- I can to keep documents in a cloud, not only vectors
- My documents are not just text chunks but complex JSON object with a schema
- Each document has `embedding` and `location` fields that are indexed and can be used for fast semantic and geo-location search
- I use geo-location search as a filter for the following vector search. I.e. I limit the search to the restaurants nearby and then I use vector search to find the best one.
- I can use MongoDB native queries if I feel limitations with Langchain API (or in case of bugs which I encountered a few times)

I plan to deploy this to AWS Lambda eventually (I hope soon), thus I need to keep conversation history somewhere. I chose Redis. It is supported by Langchain.
The application supports StreamLit and Flask servers.

To start it locally run a Redis container using the `docker-compose.yml`:
```bash
docker-compose up
```
Then start the Python application as usual (see below).

![restaurant-advisor.png](images/restaurant-advisor.png)

# AI Girlfriend

Okay, this is not actually a girlfriend but more like an interesting person with some technical background. At first, I took some custom prompts for chatbots with AI-girlfriend personality from [FlowGPT](https://flowgpt.com/). But they all were either anime or virtual sex oriented (usually both) which I found rather boring. I came up with my own prompt that focuses on making the chatbot more alive and natural. I prohibited her to mention that she is an AI and gave some background in engineering so she is quite nerdy. I also tried to make her more open-minded that a regular Chat GPT, therefore she has some temper and can even insult you (she called me stupid once). She can talk using AI-generated voice which is very impressive.

In the end, this is a simple chatbot created with Langchain, Streamlit, and OpenAI API. It can voice-talk almost like a real human using Elevenlabs API.
I use the [Elevenlabs](https://elevenlabs.io/speech-synthesis) API (which is free) to generate a voice in a browser (StreamLit allows to play it).

![ai-girlfriend.png](images/ai-girlfriend.png)

# Chat with Multiple Documents (Outdated Langchain version)

Here I use vector database to store txt documents' content. Langchain with `stuff` chain type allows to query this store and use it in chatting with llm

![multi-doc.png](images/multi-doc.png)

# Setup

Rename the [.env.template](.env.template) file into `.env` and fill in the values.

### Pipenv setup

I use `pipenv` to manage dependencies. Install it, create a virtual environment, activate it and install dependencies.

1. Install `pipenv` using official [docs](https://pipenv.pypa.io/en/latest/install/#installing-pipenv). For example, on Mac:
```bash
pip install pipenv --user
```

2. Add `pipenv` to PATH if it's not there. For example, I had to add to the `~/.zshrc` file the following line:
```bash
export PATH="/Users/hiper2d/Library/Python/3.11/bin:$PATH"
```

3. Install packages and create a virtual environment for the project:
```bash
cd # navigate to the project dir
pipenv install
```
This should create a virtual environment and install all dependencies from `Pipfile.lock` file.

If for any reason you need to create a virtual environment manually, use the following command:
```bash
pip install virtualenv # install virtualenv if you don't have it
virtualenv --version # check if it's installed
cd # for example, my virtual envs as here: /Users/hiper2d/.local/share/virtualenvs
virtualenv # I usually use a project name
```

4. To swtich to the virtual environment, use the following command:
```bash
cd
pipenv shell
```
If this fails, than do the following:
```bash
cd /bin
source activate
```

### Intellij Idea/PyCharm Run/Debug setup

1. Add a Python Interpreter. Idea will generate a virtual environment for you.
- Go to Project Settings > SDK > Add SDK > Python SDK > Pipenv Environment
- Add paths to python and pipenv like this:
![add-python-interpreter.png](images/add-python-interpreter.png)

2. Create a Python StreamLit Run/Debug configuration like this:
![streamlit-run-debug-config.png](images/streamlit-run-debug-config.png)

3. Create a Python Flask Run/Debug configuration (in dish-adviser only) like this:
![flask-run-debug-config.png](images/flask-run-debug-config.png)