https://github.com/hallelx2/vectorless-sdk

Official SDKs for Vectorless — structure-preserving document retrieval without embeddings (TypeScript + Python)

# Vectorless SDK


Official client SDKs for Vectorless — structure-preserving document retrieval without embeddings.



---

## What is Vectorless?

Vectorless is a document retrieval engine that preserves document structure. Instead of chunking documents into fragments and embedding them in a vector database, Vectorless:

1. **Parses** documents into hierarchical trees (headings → sections)
2. **Summarizes** each section with an LLM
3. **Retrieves** by having the LLM reason over the tree outline to select the most relevant sections

No embeddings. No vector databases. Full sections returned with complete context.
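
The three steps above can be illustrated with a small, self-contained sketch. It is not the engine's implementation: the `Section` type and the function names are hypothetical, only Markdown headings are handled, and the LLM selection step is replaced by a plain keyword-overlap score so the example runs without any model.

```python
import re
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Section:
    """One node in the document tree: a heading plus the text beneath it."""
    title: str
    level: int
    body: list[str] = field(default_factory=list)
    children: list["Section"] = field(default_factory=list)


def parse_markdown(text: str) -> Section:
    """Build a heading tree from Markdown ATX headings (#, ##, ...)."""
    root = Section(title="(root)", level=0)
    stack = [root]  # path from the root to the section currently being filled
    for line in text.splitlines():
        match = re.match(r"^(#{1,6})\s+(.*)$", line)
        if match:
            node = Section(title=match.group(2).strip(), level=len(match.group(1)))
            # Pop until the top of the stack is a shallower heading (the parent).
            while stack[-1].level >= node.level:
                stack.pop()
            stack[-1].children.append(node)
            stack.append(node)
        elif line.strip():
            stack[-1].body.append(line.strip())
    return root


def outline(node: Section, depth: int = 0) -> list[str]:
    """Flatten the tree into the indented outline a model would reason over."""
    lines = [] if node.level == 0 else ["  " * (depth - 1) + node.title]
    for child in node.children:
        lines.extend(outline(child, depth + 1))
    return lines


def select_section(root: Section, question: str) -> Optional[Section]:
    """Keyword-overlap stand-in for the LLM step (assumption, not the real retriever)."""
    wanted = set(re.findall(r"[a-z0-9]+", question.lower()))
    best, best_score = None, 0
    pending = list(root.children)
    while pending:
        node = pending.pop(0)
        text = " ".join([node.title, *node.body]).lower()
        score = len(wanted & set(re.findall(r"[a-z0-9]+", text)))
        if score > best_score:
            best, best_score = node, score
        pending.extend(node.children)
    return best
```

The point of the sketch is the return value: `select_section` hands back a whole `Section`, body included, rather than a fixed-size fragment cut from the middle of one.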

## Architecture

```
┌─────────────────────────────────────────────────────────┐
│                    Your Application                     │
├───────────┬───────────────┬─────────────────────────────┤
│ TypeScript│ Python        │ Go                          │
│ SDK       │ SDK           │ SDK                         │
├───────────┴───────────────┴─────────────────────────────┤
│               Transport Layer (pick one)                │
│ ┌────────────────────────┐  ┌─────────────────────────┐ │
│ │ HTTP/REST (default)    │  │ ConnectRPC (protobuf)   │ │
│ │ GET/POST /v1/*         │  │ POST /{svc}/{method}    │ │
│ │ SSE streaming          │  │ Native streaming        │ │
│ └────────────────────────┘  └─────────────────────────┘ │
├─────────────────────────────────────────────────────────┤
│                    Vectorless Server                    │
│ ┌───────────┐ ┌───────────┐ ┌───────────┐ ┌───────────┐ │
│ │ Auth      │ │ CORS      │ │ Metrics   │ │ Tracing   │ │
│ └───────────┘ └───────────┘ └───────────┘ └───────────┘ │
├─────────────────────────────────────────────────────────┤
│                    Vectorless Engine                    │
│ ┌───────────┐ ┌───────────┐ ┌───────────┐ ┌───────────┐ │
│ │ Parsers   │ │ Tree      │ │ Retrieval │ │ LLM Gate  │ │
│ │ MD HTML   │ │ Builder   │ │ single    │ │ Anthropic │ │
│ │ PDF DOCX  │ │ Summaries │ │ chunked   │ │ OpenAI    │ │
│ └───────────┘ └───────────┘ └───────────┘ │ Gemini    │ │
│                                           └───────────┘ │
│ ┌───────────────┐ ┌───────────────┐ ┌─────────────────┐ │
│ │ PostgreSQL    │ │ S3/MinIO      │ │ River/QStash    │ │
│ │ docs+sections │ │ content       │ │ job queue       │ │
│ └───────────────┘ └───────────────┘ └─────────────────┘ │
└─────────────────────────────────────────────────────────┘
```

## SDKs

### TypeScript

```bash
npm install vectorless
```

```typescript
import { VectorlessClient } from "vectorless";

const client = new VectorlessClient({
  baseUrl: "https://api.vectorless.dev",
  apiKey: "vl_live_...",
});

const result = await client.ingestDocument("./report.pdf");
const doc = await client.waitForReady(result.document_id);
const response = await client.query(doc.id, "How does auth work?");
```

[Full TypeScript docs →](./typescript/)

### Python

```bash
pip install vectorless-sdk
```

```python
from vectorless import VectorlessClient

client = VectorlessClient(
    base_url="https://api.vectorless.dev",
    api_key="vl_live_...",
)

result = client.ingest_document("./report.pdf")
doc = client.wait_for_ready(result.document_id)
response = client.query(doc.id, "How does auth work?")
```

[Full Python docs →](./python/)
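
The quickstart calls `wait_for_ready` between ingesting and querying, which suggests that documents are processed in the background. How the SDK waits is not documented here. The sketch below shows one common way to implement such a wait, a polling loop with a deadline. The `get_status` callable and the `"ready"` and `"failed"` status strings are assumptions made for illustration, not the SDK's API.

```python
import time
from typing import Callable


def wait_until_ready(
    get_status: Callable[[], str],
    timeout: float = 60.0,
    interval: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Poll get_status() until it returns "ready"; raise on failure or timeout."""
    deadline = time.monotonic() + timeout
    while True:
        status = get_status()  # hypothetical: returns the document's current state
        if status == "ready":
            return
        if status == "failed":
            raise RuntimeError("document processing failed")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"document not ready after {timeout} seconds")
        sleep(interval)
```

The `sleep` parameter exists only so the loop can be exercised in tests without real delays.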

### Go

```bash
go get github.com/hallelx2/vectorless-sdk/go
```

```go
import (
	vectorless "github.com/hallelx2/vectorless-sdk/go"
	_ "github.com/hallelx2/vectorless-sdk/go/transport"
)

// Assumes ctx (a context.Context) and file (the document to upload) are
// already defined. Errors are discarded for brevity; check them in real code.
client, _ := vectorless.NewClient(
	vectorless.WithBaseURL("https://api.vectorless.dev"),
	vectorless.WithAPIKey("vl_live_..."),
)
defer client.Close()

result, _ := client.IngestDocument(ctx, file, vectorless.IngestDocumentOptions{
	Filename: "report.pdf",
})
doc, _ := client.WaitForReady(ctx, result.DocumentID, nil)
response, _ := client.Query(ctx, doc.ID, "How does auth work?", nil)
```

[Full Go docs →](./go/)

## Transport Protocols

All three SDKs support two wire protocols. Pick one at init time:

| Protocol | Default | Streaming | Dependencies |
|----------|---------|-----------|--------------|
| **HTTP/REST** | ✅ | SSE (`text/event-stream`) | None (built-in `fetch`/`httpx`/`net/http`) |
| **ConnectRPC** | — | Native Connect streaming | None (JSON encoding, no protobuf tooling) |

```typescript
// TypeScript
new VectorlessClient({ transport: "connect" });
```
```python
# Python
VectorlessClient(transport="connect")
```
```go
// Go
vectorless.NewClient(vectorless.WithTransport(vectorless.TransportConnect))
```
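
With the HTTP/REST transport, streamed responses arrive as `text/event-stream`. The SDKs handle this internally. For readers who want to see what that format involves, here is a minimal sketch of an SSE parser that follows the standard framing rules (fields separated by `:`, events terminated by a blank line). It ignores the `id` and `retry` fields, and the event names used with it are illustrative, since the server's actual event names are not listed here.

```python
from typing import Iterable, Iterator


def parse_sse(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one dict per event from an iterable of text/event-stream lines."""
    event, data = "message", []  # "message" is the default event type
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event; empty events are dropped.
            if data:
                yield {"event": event, "data": "\n".join(data)}
            event, data = "message", []
        elif line.startswith(":"):
            continue  # comment line, often sent as a keep-alive
        else:
            name, _, value = line.partition(":")
            if value.startswith(" "):
                value = value[1:]  # the format strips a single leading space
            if name == "event":
                event = value
            elif name == "data":
                data.append(value)  # multiple data lines are joined with "\n"
```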

## Deploying the Server

The SDKs work with both **deployed** and **self-hosted** Vectorless instances. A self-hosted instance needs only a base URL, while the deployed service also requires an API key:

```python
# Self-hosted: no API key needed
client = VectorlessClient(base_url="http://localhost:8080")

# Deployed: API key required
client = VectorlessClient(
    base_url="https://api.vectorless.dev",
    api_key="vl_live_...",
)
```
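
To let the same application code target either kind of instance, the constructor arguments can be read from the environment. The helper below is a sketch under stated assumptions: the variable names `VECTORLESS_BASE_URL` and `VECTORLESS_API_KEY` are invented for this example and are not read by the SDK itself.

```python
import os
from typing import Mapping


def client_kwargs(env: Mapping[str, str] = os.environ) -> dict:
    """Build VectorlessClient keyword arguments from environment variables."""
    # Hypothetical variable names; fall back to the local self-hosted address.
    kwargs = {"base_url": env.get("VECTORLESS_BASE_URL", "http://localhost:8080")}
    api_key = env.get("VECTORLESS_API_KEY")
    if api_key:
        kwargs["api_key"] = api_key  # left out entirely for self-hosted instances
    return kwargs
```

It would then be used as `client = VectorlessClient(**client_kwargs())`.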

## License

MIT