https://github.com/hunkim/solar-vector-store
https://github.com/hunkim/solar-vector-store
Last synced: 7 months ago
JSON representation
- Host: GitHub
- URL: https://github.com/hunkim/solar-vector-store
- Owner: hunkim
- Created: 2025-05-18T05:29:37.000Z (8 months ago)
- Default Branch: main
- Last Pushed: 2025-05-18T05:34:04.000Z (8 months ago)
- Last Synced: 2025-05-18T06:27:33.855Z (8 months ago)
- Language: Python
- Size: 0 Bytes
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
README
# Solar Vector Store
A FastAPI server that implements an OpenAI‐style Vector Stores API using:
1. **Upstage Document Digitization** to parse PDFs (or other files) into text
2. **Upstage Solar Embeddings** to turn text into vectors
3. **Qdrant** (cloud or local) to store and search the vectors
---
## Features
- **Vector Store Management**
Create, list, retrieve, update (distance metric), and delete named stores (Qdrant collections)
- **File Ingestion**
Upload a PDF (or other file), parse it into pages via Upstage, embed each page, and upsert into Qdrant
- **Search**
Embed a text query and perform a **k-NN** search over the stored vectors
- **File & Store Cleanup**
Delete individual ingested files or entire vector stores
---
## Prerequisites
- **Python 3.10+**
- **Upstage API Key** (for Document Digitization & Embeddings)
- **Qdrant** (cloud or self-hosted) URL & API Key (if you use Qdrant Cloud)
---
## Getting Started
1. **Clone the repo**
```bash
git clone https://github.com/your-org/solar-vector-store.git
cd solar-vector-store
```
2. **Create & activate virtual environment**
```bash
make venv
source .venv/bin/activate
```
3. **Install dependencies**
```bash
make install
```
4. **Configure environment**
Copy `.env.example` → `.env`, then set:
```
UPSTAGE_API_KEY=up_...
QDRANT_URL=https://
QDRANT_API_KEY= # optional for self-hosted
```
5. **Run the server**
```bash
make run
```
The API will be available at `http://localhost:8000` and interactive docs at `http://localhost:8000/docs`.
---
## API Endpoints
| Method | Path | Description |
| ------ | ------------------------------------------- | ------------------------------------------- |
| POST | `/vector_stores` | Create a new vector store |
| GET | `/vector_stores` | List all vector stores |
| GET | `/vector_stores/{store_id}` | Retrieve one vector store’s metadata |
| PATCH | `/vector_stores/{store_id}` | Update name or distance metric |
| DELETE | `/vector_stores/{store_id}` | Delete a vector store |
| POST | `/vector_stores/{store_id}/files` | Upload & ingest a file |
| GET | `/vector_stores/{store_id}/files` | List ingested files |
| GET | `/vector_stores/{store_id}/files/{file_id}` | Retrieve file metadata |
| DELETE | `/vector_stores/{store_id}/files/{file_id}` | Delete an ingested file’s vectors & metadata|
| POST | `/vector_stores/{store_id}/query` | Query the store with a text embedding |
---
## Testing
1. Place a sample `test.pdf` in the `tests/` directory.
2. Run the test suite:
```bash
make test
```
---
## License
This project is licensed under the MIT License.