https://github.com/rupeshtr78/pg-vector-embed-rust
Pg Vector Embedding Rust
embeddings llm model pg-vector postgresql rust rust-lang vector-database
- Host: GitHub
- URL: https://github.com/rupeshtr78/pg-vector-embed-rust
- Owner: rupeshtr78
- Created: 2024-12-09T02:31:47.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2024-12-28T05:06:13.000Z (over 1 year ago)
- Last Synced: 2024-12-28T06:18:20.089Z (over 1 year ago)
- Topics: embeddings, llm, model, pg-vector, postgresql, rust, rust-lang, vector-database
- Language: Rust
- Homepage:
- Size: 93.8 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
# PG Vector Embedding and Query Rust
This repository contains a Rust-based system for managing vector embeddings and querying them using a PostgreSQL-backed vector database. The system is designed to handle embedding generation, storage, and querying.
## Table of Contents
- [Overview](#overview)
- [Features](#features)
- [Directory Structure](#directory-structure)
- [Getting Started](#getting-started)
  - [Prerequisites](#prerequisites)
  - [Installation](#installation)
  - [Running the Application](#running-the-application)
- [Usage](#usage)
  - [Commands](#commands)
  - [Configuration](#configuration)
  - [Embedding and Querying](#embedding-and-querying)
- [Testing](#testing)
- [Contributing](#contributing)
- [License](#license)
## Overview
The system is composed of several modules that handle different aspects of the embedding and querying process:
- **Commands**: Handles command-line arguments and subcommands.
- **Config**: Manages configuration settings for embedding requests and database connections.
- **Constants**: Provides constant values used throughout the application.
- **Embedding**: Contains logic for generating embeddings and persisting them to the database.
- **VectorDB**: Handles interactions with the PostgreSQL database for storing and querying vector embeddings.
## Features
- **Embedding Generation**: Generate vector embeddings from input data.
- **Database Persistence**: Store embeddings in a PostgreSQL database.
- **Querying**: Query the database to find nearest neighbors based on vector embeddings.
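To make "nearest neighbors" concrete, here is a minimal in-memory sketch of ranking stored vectors by cosine distance. It is only an illustration: the function names are hypothetical, and the actual application delegates this ranking to the PostgreSQL vector database rather than computing it in Rust.

```rust
/// Cosine distance between two equal-length vectors: 1 - cos(angle).
/// Returns None when the lengths differ, the input is empty,
/// or either vector has zero norm.
fn cosine_distance(a: &[f32], b: &[f32]) -> Option<f32> {
    if a.len() != b.len() || a.is_empty() {
        return None;
    }
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        return None;
    }
    Some(1.0 - dot / (norm_a * norm_b))
}

/// Return the indices of the `k` stored vectors closest to `query`,
/// nearest first. Vectors that cannot be compared are skipped.
fn nearest_neighbors(query: &[f32], stored: &[Vec<f32>], k: usize) -> Vec<usize> {
    let mut scored: Vec<(usize, f32)> = stored
        .iter()
        .enumerate()
        .filter_map(|(i, v)| cosine_distance(query, v).map(|d| (i, d)))
        .collect();
    // Sort ascending by distance; total_cmp gives a total order over f32.
    scored.sort_by(|a, b| a.1.total_cmp(&b.1));
    scored.into_iter().take(k).map(|(i, _)| i).collect()
}
```

A database-backed search returns the same kind of ranking, but can use an index instead of scanning every stored vector.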
## Directory Structure
```
src/
├── app/
│ ├── commands.rs
│ ├── config.rs
│ └── constants.rs
├── embedding/
│ ├── run_embedding.rs
│ └── vector_embedding.rs
├── main.rs
├── tests/
│ ├── setup_docker.rs
│ ├── test_pgclient.rs
│ ├── test_query_vector.rs
│ ├── test_run_embedding.rs
│ └── test_vector_embedding.rs
├── vectordb/
│ ├── pg_vector.rs
│ └── query_vector.rs
├── lib.rs
├── vectordb/mod.rs
├── tests/mod.rs
├── embedding/mod.rs
└── app/mod.rs
```
## Getting Started
### Prerequisites
- Rust (latest stable version)
- PostgreSQL set up as a vector database (for example, with the `pgvector` extension)
- Docker (for running tests)
- A running Ollama service with the `nomic-embed-text` model or a similar embedding model
### Installation
1. Clone the repository:
```sh
git clone https://github.com/rupeshtr78/pg-vector-embed-rust.git
cd pg-vector-embed-rust
```
2. Build the project (this also downloads and compiles the dependencies):
```sh
cargo build
```
### Running the Application
1. Start the PostgreSQL vector database (if not already running).
2. Make sure the Ollama service is running and serving the embedding model you plan to use.
3. Run the application:
```sh
cargo run
```
## Usage
### Commands
The application supports subcommands such as `write` and `query`. Use the `--help` flag to list the available options. The examples below write two inputs to a table and then query that table:
```sh
cargo run -- --help
cargo run -- write --input "dog sound is called bark" --input "cat sounds is called purr" --model "nomic-embed-text" --table "from_rust2" --dim 768 --log-level "debug"
cargo run -- query --input "who is barking" --model "nomic-embed-text" --table "from_rust2"
```
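The `--dim 768` argument in the `write` example matches the 768-dimensional output of `nomic-embed-text`. A column declared with a fixed dimension in pgvector rejects vectors of any other length, so it can help to check the length before writing. The helper below is a hypothetical sketch of such a check, not a function from this repository:

```rust
/// Hypothetical helper: verify that an embedding has the dimension the
/// target table was created with (e.g. the value passed via `--dim`).
fn check_dimension(embedding: &[f32], expected_dim: usize) -> Result<(), String> {
    if embedding.len() == expected_dim {
        Ok(())
    } else {
        Err(format!(
            "embedding has {} dimensions, but the table expects {}",
            embedding.len(),
            expected_dim
        ))
    }
}
```

If you switch to a different embedding model, its output size may differ, in which case the `--dim` value and the table would need to change with it.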
### Configuration
Configuration settings for embedding requests and database connections are managed in `src/app/config.rs`. You can modify these settings as needed.
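As a rough picture of what such settings might look like, here is a hypothetical sketch. The struct names, fields, and default values are assumptions for illustration and do not reflect the actual contents of `src/app/config.rs`. The only defaults taken from the tools themselves are Ollama's standard local address (`http://localhost:11434`) and PostgreSQL's standard port (`5432`).

```rust
/// Hypothetical embedding-service settings (not the repository's struct).
#[derive(Debug, Clone, PartialEq)]
struct EmbedConfig {
    ollama_url: String, // base URL of the Ollama service
    model: String,      // embedding model name
}

/// Hypothetical database settings (not the repository's struct).
#[derive(Debug, Clone, PartialEq)]
struct DbConfig {
    host: String,
    port: u16,
    user: String,
    dbname: String,
}

impl Default for EmbedConfig {
    fn default() -> Self {
        Self {
            ollama_url: "http://localhost:11434".to_string(),
            model: "nomic-embed-text".to_string(),
        }
    }
}

impl Default for DbConfig {
    fn default() -> Self {
        Self {
            host: "localhost".to_string(),
            port: 5432,
            // Assumed user and database names; adjust to your setup.
            user: "postgres".to_string(),
            dbname: "postgres".to_string(),
        }
    }
}

impl DbConfig {
    /// Build a keyword/value PostgreSQL connection string.
    fn connection_string(&self) -> String {
        format!(
            "host={} port={} user={} dbname={}",
            self.host, self.port, self.user, self.dbname
        )
    }
}
```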
### Embedding and Querying
- **Generate Embeddings**: Use the `run_embedding` function to generate embeddings and persist them to the database.
- **Query Embeddings**: Use the `run_query` function to query the database for nearest neighbors based on vector embeddings.
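The sketch below shows the kind of SQL these two steps could translate to, assuming the database uses the `pgvector` extension. The table layout (`id`, `content`, `embedding`) and the helper names are assumptions, not the repository's actual schema or the signatures of `run_embedding` and `run_query`. The `vector(n)` column type, the `[x,y,z]` text form, and the `<=>` cosine-distance operator are standard pgvector syntax.

```rust
/// Format a vector in pgvector's text form, e.g. `[0.5,1,-2.25]`.
fn to_pgvector_literal(v: &[f32]) -> String {
    let parts: Vec<String> = v.iter().map(|x| x.to_string()).collect();
    format!("[{}]", parts.join(","))
}

/// Table names are interpolated into SQL, so accept only simple
/// identifiers (ASCII letters, digits, underscore; no leading digit).
fn is_safe_identifier(name: &str) -> bool {
    let mut chars = name.chars();
    match chars.next() {
        Some(c) if c.is_ascii_alphabetic() || c == '_' => {}
        _ => return false,
    }
    chars.all(|c| c.is_ascii_alphanumeric() || c == '_')
}

/// Assumed schema: one text column plus one fixed-dimension vector column.
fn create_table_sql(table: &str, dim: usize) -> String {
    format!(
        "CREATE TABLE IF NOT EXISTS {table} \
         (id bigserial PRIMARY KEY, content text, embedding vector({dim}))"
    )
}

/// Insert statement; $1 is the text and $2 the embedding.
fn insert_sql(table: &str) -> String {
    format!("INSERT INTO {table} (content, embedding) VALUES ($1, $2)")
}

/// Nearest-neighbor query ordered by cosine distance to the vector in $1.
fn nearest_sql(table: &str, limit: usize) -> String {
    format!("SELECT content FROM {table} ORDER BY embedding <=> $1 LIMIT {limit}")
}
```

How the vector parameter is bound depends on the PostgreSQL client library in use, so the statements are left as plain strings here.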
## Testing
The test suite requires a running PostgreSQL vector database and a running Ollama service with an embedding model, both configured as the tests expect.
```sh
cargo test
```
## Contributing
Contributions are welcome! Please read the [CONTRIBUTING.md](CONTRIBUTING.md) file for details on how to contribute to this project.
## License
This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.