https://github.com/bbc-esq/vectordb-plugin
Plugin that lets you ask questions about your documents including audio and video files.
https://github.com/bbc-esq/vectordb-plugin
bark database-management embedding-models embedding-vectors embeddings gtts koboldai koboldcpp python rag retrieval-augmented-generation retrieval-chatbot tiledb vector-data-management vector-database vector-search vision whisper whispers2t whisperspeech
Last synced: 8 days ago
JSON representation
Plugin that lets you ask questions about your documents including audio and video files.
- Host: GitHub
- URL: https://github.com/bbc-esq/vectordb-plugin
- Owner: BBC-Esq
- Created: 2023-08-13T03:21:20.000Z (almost 2 years ago)
- Default Branch: main
- Last Pushed: 2025-05-15T14:00:17.000Z (9 days ago)
- Last Synced: 2025-05-16T09:02:08.922Z (8 days ago)
- Topics: bark, database-management, embedding-models, embedding-vectors, embeddings, gtts, koboldai, koboldcpp, python, rag, retrieval-augmented-generation, retrieval-chatbot, tiledb, vector-data-management, vector-database, vector-search, vision, whisper, whispers2t, whisperspeech
- Language: Python
- Homepage: https://www.youtube.com/@AI_For_Lawyers
- Size: 32.8 MB
- Stars: 336
- Watchers: 7
- Forks: 42
- Open Issues: 10
-
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
README
🚀 Supercharged Vector Database!
Requirements
•
Installation
•
Using the Program
•
Request a Feature or Report a Bug
•
ContactCreate and search a vector database to get a response from the large language model that's more accurate. This is commonly referred to as "retrieval augmented generation" (RAG)! You can watch an introductory [Video](https://www.youtube.com/watch?v=8-ZAYI4MvtA) or read a [Medium article](https://medium.com/@vici0549/search-images-with-vector-database-retrieval-augmented-generation-rag-3d5a48881de5) about the program.
Graphic of How This Program Works

Requirements
| [🐍 Python 3.11](https://www.python.org/downloads/release/python-3119/) or [Python 3.12](https://www.python.org/downloads/release/python-3129/) • [📁 Git](https://git-scm.com/downloads) • [📁 Git LFS](https://git-lfs.com/) • [🌐 Pandoc](https://github.com/jgm/pandoc/releases) • [🛠️ Compiler](https://visualstudio.microsoft.com/) |
|---|The above link downloads Visual Studio as an example. Make sure to install the required SDKs, however.
>
> EXAMPLE error when no compiler installed:
>![]()
>
>
>
> EXAMPLE of installing the correct SDKs:
>![]()
>[Back to Top](#top)
Installation
### Step 1
Download the ZIP file for the latest "release." Extract its contents and navigate to the `src` folder.
> [!CAUTION]
> If you simply clone this repository you will get the development version, which might not be stable.
### Step 2
Within the `src` folder, create a [virtual environment](https://realpython.com/python-virtual-environments-a-primer/):
```
python -m venv .
```
### Step 3
Activate the virtual environment:
```
.\Scripts\activate
```
### Step 4
Run the setup script:
> Only ```Windows``` is supported for now.```
python setup_windows.py
```[Back to Top](#top)
🖥️Usage🖥️
> [!IMPORTANT]
> Instructions on how to use the program are being consolidated into the `Ask Jeeves` functionality, which can be accessed from the "Ask Jeeves" menu option. Please create an issue if Jeeves is not working.### Start the Program
```
.\Scripts\activate
```
```
python gui.py
```### 🏗️ Create a Vector Database Download an embedding model from the ```Models Tab```.
1. Set the `chunk size` and `chunk overlap` settings within the `Settings Tab`.
2. Within the `Create Database Tab`, select the files that you want in the vector database.
> 🖼️ images can be selected by clicking the `Choose Files` button.\
> 🎵 Audio files must be transcribed first within the `Tools Tab`.
3. Select the embedding model you want to use.
4. Click `Create Vector Database`.### 🔍 Query a Vector Database
* Select the database you want to search within the `Query Database Tab`.
* Select `Local Models`, `Kobold`, `LM Studio` or `ChatGPT` for the backend that you want to provide a response to your question.
* Click `Submit Question`.
> The `chunks only` checkbox will display the results from the vector database without getting a response.### ❓ Which Backend Should I Use?
If you use either the `Kobold` or `LM Studio` you must be familiar with those programs. For example, `LM Studio` must be running in "server mode" and handles the prompt formatting. However,`Kobold` automatically starts in server mode but requires you to specify the prompt formatting.
> [!TIP]
> Kobold [home page](https://github.com/LostRuins/koboldcpp), [instructions](https://github.com/LostRuins/koboldcpp/wiki), and [Discord server](https://koboldai.org/discord)\
> LM Studio [home page](https://lmstudio.ai/), [instructions](https://lmstudio.ai/docs), and [Discord server](https://discord.gg/aPQfnNkxGC).### 🗑️ Deleting a Database
* In the `Manage Databases Tab`, select a database and click `Delete Database`.[Back to Top](#top)
## Request a Feature or Report a BugFeel free to report bugs or request enhancements by creating an issue on github and I will respond promptly.
CONTACT
I welcome all suggestions - both positive and negative. You can e-mail me directly at "[email protected]" or I can frequently be seen on the ```KoboldAI``` Discord server (moniker is ```vic49```). I am always happy to answer any quesitons or discuss anything vector database related! (no formal affiliation with ```KoboldAI```).