https://github.com/itspreto/vectr8

Embed anything.
embeddings llms vector-database vector-database-embedding vector-similarity-search


VECT.R8 (Vector Embeddings Creation, Transformation & Retrieval) 🚀




A Web UI where you can upload CSV/JSON files, create vector embeddings, and query them. Soon, you'll be able to convert unstructured data to JSON/CSV using an integrated LLM.





This project is under heavy, active development and may be unstable. The Embeddings and Query pages are a work in progress. ⚠️

Table of Contents

  • Prerequisites
  • Installation
  • Running the Application
  • Uploading Files 📂
  • Previewing Data 🧐
  • Creating Vector Embeddings 🧩
  • Querying the Vector Database 🔍
  • Managing the Vector Database 🛠️
  • UI Walkthrough 🎨

-----

Prerequisites




  • Python 3.7+: The application requires Python 3.7 or higher for the libraries and syntax it uses.
  • Flask: Web framework that serves the backend API.
  • Flask-CORS: Enables cross-origin requests (CORS) between the frontend and the backend.
  • transformers: Creates vector embeddings with pre-trained models.
  • torch: Runs the embedding models; used by transformers.
  • numpy: Handles arrays and mathematical operations throughout the application.
  • pandas: Processes CSV and JSON files throughout the application.



Installation




Clone the repository:

git clone https://github.com/itsPreto/VECTR8.git
cd VECTR8

Install the required packages:

pip install -r requirements.txt



Running the Application




Start the Flask server:

python3 rag.py

The server launches the React frontend automatically in a separate subprocess.

Then open your web browser and navigate to http://127.0.0.1:4000


Uploading Files 📂




  • Drag and drop a file: Drag a CSV or JSON file into the upload area, or click the area to select a file from your computer.
  • View uploaded file information: After the upload, the file's name and size are displayed.


Command Line


To upload a file using curl:


curl -X POST -F 'file=@/path/to/your/file.csv' http://127.0.0.1:4000/upload_file
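The same upload can be scripted from Python. The sketch below builds the multipart/form-data request by hand with the standard library only. The form field name `file` and the `/upload_file` path come from the curl command above; the helper names `encode_multipart` and `build_upload_request` are hypothetical. The response format is not documented here, so the sketch only builds the request and leaves sending it to the caller.

```python
import uuid
import urllib.request

BASE_URL = "http://127.0.0.1:4000"  # address used throughout this README


def encode_multipart(field, filename, content, boundary):
    """Encode a single file as a multipart/form-data body (bytes)."""
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n"
        "\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + content + tail


def build_upload_request(filename, content, base_url=BASE_URL):
    """Build, but do not send, a POST to /upload_file for the given bytes."""
    boundary = uuid.uuid4().hex  # random marker separating the form parts
    body = encode_multipart("file", filename, content, boundary)
    return urllib.request.Request(
        f"{base_url}/upload_file",
        data=body,
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
        method="POST",
    )

# To send it (requires the server to be running):
#   with open("/path/to/your/file.csv", "rb") as fh:
#       req = build_upload_request("file.csv", fh.read())
#   urllib.request.urlopen(req)
```

Building the request separately from sending it makes it possible to inspect the body before anything reaches the server.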

Previewing Data 🧐




  • Select embedding keys: After uploading a file, select the keys (columns) to include in the embeddings.
  • Preview document: View a preview of the document built from the selected keys.
  • Preview embeddings: View the generated embeddings and the token count for the selected document.


Command Line


To preview a file's keys using curl:


curl -X POST -H "Content-Type: application/json" -d '{"file_path":"uploads/your-file.csv"}' http://127.0.0.1:4000/preview_file

To preview a document's embeddings using curl:


curl -X POST -H "Content-Type: application/json" -d '{"file_path":"uploads/your-file.csv", "selected_keys":["key1", "key2"]}' http://127.0.0.1:4000/preview_document
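For scripted use, the two preview calls above can also be built in Python. This is a minimal sketch using only the standard library. The endpoint paths and JSON fields (`file_path`, `selected_keys`) come from the curl commands above; the helper names `build_json_post`, `preview_file_request`, and `preview_document_request` are hypothetical. The response format is not documented here, so the sketch stops at building the request.

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:4000"  # address used throughout this README


def build_json_post(path, payload, base_url=BASE_URL):
    """Build, but do not send, a JSON POST request."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}{path}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def preview_file_request(file_path):
    """Request the keys (columns) of an uploaded file."""
    return build_json_post("/preview_file", {"file_path": file_path})


def preview_document_request(file_path, selected_keys):
    """Request a document and embedding preview for the chosen keys."""
    return build_json_post(
        "/preview_document",
        {"file_path": file_path, "selected_keys": list(selected_keys)},
    )

# To send (requires the server to be running):
#   urllib.request.urlopen(preview_file_request("uploads/your-file.csv"))
```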

Creating Vector Embeddings 🧩




  • Start embedding creation: Click the "Create Vector DB" button to start creating the embeddings.
  • View progress: Monitor the embedding creation with the circular progress indicator. 📈


Command Line


To create a vector database using curl:


curl -X POST -H "Content-Type: application/json" -d '{"file_path":"uploads/your-file.csv", "selected_keys":["key1", "key2"]}' http://127.0.0.1:4000/create_vector_database
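How the server reacts to a key that does not exist in the file is not documented here, so a script may want to check `selected_keys` against the CSV header before sending the request. The sketch below is one way to do that on the client side. The function names `read_csv_keys` and `build_create_payload` are hypothetical, and the check itself is an assumed precaution, not something the project requires.

```python
import csv
import io
import json


def read_csv_keys(csv_text):
    """Return the column names from the first row of CSV text."""
    reader = csv.reader(io.StringIO(csv_text))
    return next(reader, [])  # empty list when the text has no rows


def build_create_payload(file_path, csv_text, wanted_keys):
    """Return the JSON body for /create_vector_database.

    Raises ValueError if any wanted key is missing from the CSV header.
    """
    available = read_csv_keys(csv_text)
    missing = [key for key in wanted_keys if key not in available]
    if missing:
        raise ValueError(f"keys not found in CSV header: {missing}")
    return json.dumps({"file_path": file_path, "selected_keys": list(wanted_keys)})
```

The returned string can be passed to curl with `-d` or sent as the body of a JSON POST.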

Querying the Vector Database 🔍




  • Enter query: Type your query into the input field.
  • Select similarity metric: Choose between cosine similarity and Euclidean distance.
  • Submit query: Click the "Submit" button to query the vector database.
  • View results: Each result shows the document, its score, and a button to view detailed data.


Command Line


To query the vector database using curl:


curl -X POST -H "Content-Type: application/json" -d '{"query_text":"Your query text here", "similarity_metric":"cosine"}' http://127.0.0.1:4000/query
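To make the two similarity metrics concrete, here is a small, dependency-free sketch of how each one ranks documents. It illustrates the metrics themselves and is not the project's implementation. The function names are hypothetical, and the metric string `"euclidean"` is an assumption: the curl example above only shows `"cosine"`.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; higher means more similar."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here for zero-length vectors
    return dot / (norm_a * norm_b)


def euclidean_distance(a, b):
    """Straight-line distance between two vectors; lower means more similar."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def rank(query, docs, metric="cosine"):
    """Return (index, score) pairs for docs, best match first."""
    if metric == "cosine":
        scored = [(i, cosine_similarity(query, d)) for i, d in enumerate(docs)]
        return sorted(scored, key=lambda pair: pair[1], reverse=True)
    if metric == "euclidean":
        scored = [(i, euclidean_distance(query, d)) for i, d in enumerate(docs)]
        return sorted(scored, key=lambda pair: pair[1])
    raise ValueError(f"unknown metric: {metric!r}")
```

Cosine similarity ignores vector length, so a long vector pointing almost the same way as the query still ranks highly. Euclidean distance penalizes that same vector for being far away. The two metrics can therefore order the same documents differently.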

Managing the Vector Database 🛠️




  • Backup database: Click the "Backup Database" button to create a backup of the current vector database.
  • Delete database: Click the "Delete Database" button to delete the current vector database.
  • View database statistics: View statistics such as total documents and average vector length.


Command Line


To check if the vector database exists using curl:


curl -X GET http://127.0.0.1:4000/check_vector_db

To view database statistics using curl:


curl -X GET http://127.0.0.1:4000/db_stats

To back up the database using curl:


curl -X POST http://127.0.0.1:4000/backup_db

To delete the database using curl:


curl -X POST http://127.0.0.1:4000/delete_db
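The four management calls above differ only in path and HTTP method, so a script can drive them from a small lookup table. The sketch below uses only the standard library. The paths and methods come from the curl commands above; the names `ENDPOINT_METHODS` and `build_management_request` are hypothetical. Response formats are not documented here, so the sketch only builds the request.

```python
import urllib.request

BASE_URL = "http://127.0.0.1:4000"  # address used throughout this README

# HTTP method for each management endpoint, as in the curl commands above.
ENDPOINT_METHODS = {
    "check_vector_db": "GET",
    "db_stats": "GET",
    "backup_db": "POST",
    "delete_db": "POST",
}


def build_management_request(action, base_url=BASE_URL):
    """Build, but do not send, the request for a management action."""
    try:
        method = ENDPOINT_METHODS[action]
    except KeyError:
        raise ValueError(f"unknown action: {action!r}") from None
    return urllib.request.Request(f"{base_url}/{action}", method=method)

# To send (requires the server to be running):
#   urllib.request.urlopen(build_management_request("db_stats"))
```

Because `delete_db` is destructive, a script would typically send `backup_db` first and confirm it succeeded before deleting.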

UI Walkthrough 🎨




Uploading Files

  • Drag and drop a file into the upload area or click to select a file.
  • File information is displayed after a successful upload.

Previewing Data

  • Select the keys you want to include in the embeddings.
  • View a preview of the document and the generated embeddings.

Creating Vector Embeddings

  • Click the "Create Vector DB" button to start the embedding creation.
  • Monitor the progress with the circular progress indicator.

Querying the Vector Database

  • Enter your query text and select a similarity metric.
  • Click "Submit" to query the database and view the results.

Managing the Vector Database

  • Back up the database by clicking "Backup Database".
  • Delete the database by clicking "Delete Database".
  • View database statistics such as total documents and average vector length.