{"id":24617761,"url":"https://github.com/jacoblincool/kvec","last_synced_at":"2026-04-10T16:56:22.035Z","repository":{"id":161166190,"uuid":"627870014","full_name":"JacobLinCool/kvec","owner":"JacobLinCool","description":"A modular semantic search stack for text, web page, image, and custom data types. API, GUI, Cache, Embedding, Storage, and more ...","archived":false,"fork":false,"pushed_at":"2024-10-20T02:25:51.000Z","size":2322,"stargazers_count":1,"open_issues_count":12,"forks_count":0,"subscribers_count":3,"default_branch":"main","last_synced_at":"2025-01-24T23:41:15.951Z","etag":null,"topics":["semantic-search"],"latest_commit_sha":null,"homepage":"https://kvec.pages.dev","language":"TypeScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/JacobLinCool.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-04-14T11:36:04.000Z","updated_at":"2024-05-27T14:30:59.000Z","dependencies_parsed_at":null,"dependency_job_id":"f491945a-68f6-4f73-bfe7-c94baf4f26ea","html_url":"https://github.com/JacobLinCool/kvec","commit_stats":null,"previous_names":[],"tags_count":0,"template":true,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/JacobLinCool%2Fkvec","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/JacobLinCool%2Fkvec/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/JacobLinCool%2Fkvec/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/JacobLinCool%2Fkvec/manifests","owner_url":"https://repos.
ecosyste.ms/api/v1/hosts/GitHub/owners/JacobLinCool","download_url":"https://codeload.github.com/JacobLinCool/kvec/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":244308719,"owners_count":20432255,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["semantic-search"],"created_at":"2025-01-24T23:40:41.436Z","updated_at":"2026-04-10T16:56:16.999Z","avatar_url":"https://github.com/JacobLinCool.png","language":"TypeScript","funding_links":[],"categories":[],"sub_categories":[],"readme":"# KVec\n\nA modular semantic search stack.\n\nIt can be fully cloud-based with **Cloudflare KV** or **Upstash Redis**, **OpenAI** or **Cohere** Text Embedding, and **Pinecone** or **Qdrant Cloud** Vector Database.\n\nIt can also be self-hosted with **CouchDB**, **Qdrant** on Docker.\n\n![KVec icon, created with leonardo.ai](static/icon.png)\n\n## Features\n\n- Support for different item types: KVec supports **text**, **web page**, and **image** items out of the box, and you can easily extend it to support other types.\n- Modular architecture: KVec allows you to easily change components like the encoder, object store, and vector store to fit your specific use case.\n- Authentication: KVec supports authentication using JWT tokens.\n- Dashboard GUI: KVec comes with a simple dashboard that allows you to manage your items and issue authentication tokens.\n- RESTful API: KVec provides a simple API for creating, reading, deleting, and searching items.\n\n## Setup\n\n### Cloudflare Pages\n\nTo use [Cloudflare 
Pages](https://pages.cloudflare.com/) to host your KVec, set up the following environment variables:\n\n- `APP_SECRET`: Secret for signing JWT tokens\n- `PINECONE_API_KEY`: To enable `PineconeVecStore` (see [Structure](#structure))\n- `PINECONE_ENDPOINT`: The endpoint of your Pinecone index, necessary for `PineconeVecStore`\n- `OPENAI_API_KEY`: To enable `OpenAIEncoder` (see [Structure](#structure))\n- `HF_API_TOKEN`: To enable `BaseTextAdapter`'s image feature\n\nAnd bind a KV namespace:\n\n- `KV`: KV namespace for storing items, necessary for `CloudflareKVObjStore` (see [Structure](#structure))\n\n\u003e You can simply:\n\u003e\n\u003e 1. Fork this repo.\n\u003e 2. Create a new Cloudflare Pages project, connect to your forked repo, and set up the environment variables.\n\u003e 3. Create a KV namespace and bind it to the project.\n\n### Docker\n\nTo use KVec with Docker, you'll need to set up environment variables in the `.env` file.\n\nYou may want something like this:\n\n```bash\nPORT=\"8080\"\nORIGIN=\"http://localhost:8080\"\nAPP_SECRET=\"my-kvec-secret\"\n\n# Adapter\nHF_API_TOKEN=\"hf_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx\"\n\n# Encoder\nCOHERE_API_KEY=\"xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx\"\n\n# VecStore\nQDRANT_SERVER=\"http://qdrant:6333\"\nQDRANT_COLLECTION=\"kvec\"\n\n# ObjStore\nCOUCHDB_URL=\"http://couchdb:5984\"\nCOUCHDB_DB=\"kvec\"\nCOUCHDB_USER=\"admin\"\nCOUCHDB_PASSWORD=\"password\"\n```\n\nThen, you can run the following command to start:\n\n```bash\ndocker compose up -d\n```\n\nTo set up the Qdrant collection for the first time, you can run the following command:\n\n```bash\ncurl -X 'PUT' 'http://localhost:6333/collections/kvec' -H 'accept: application/json' -H 'content-type: application/json' -d '\n{\n  \"vectors\": {\n        \"size\": 4096,\n        \"distance\": \"Cosine\"\n    }\n}'\n# The vector size for Cohere's embedding is 4096; if you are using OpenAI's embedding, you should change it to 1536.\n```\n\nTo set up the CouchDB 
database for the first time, you can run the following command:\n\n```bash\ncurl -X PUT http://admin:password@localhost:5984/kvec\n```\n\nNow, you should be able to access the dashboard from `http://localhost:8080/`.\n\n## GUI\n\nYou can use the dashboard GUI to manage items and issue tokens.\n\nYou can access the dashboard from `https://kvec.yourdomain.com/`.\n\n## API\n\n### Authentication\n\nYou will need to get a JWT token to make requests to the item API.\n\nOne way to do this is to use the dashboard GUI.\n\nHowever, you can also issue tokens using the API itself:\n\n```bash\ncurl -X POST \\\n    -H \"Content-Type: application/json\" \\\n    -d '{ \"secret\": \"your-app-secret\", \"exp\": 3600, \"perm\": { \"read\": true, \"write\": false } }' \\\n    https://kvec.yourdomain.com/api/auth\n# Creates a token that expires in 1 hour and only allows read access\n```\n\n```json\n{\n    \"token\": \"ISSUED_JWT\"\n}\n```\n\n\u003e The token can be passed in the `Authorization` header of the request or the `kvec_token` cookie.\n\n### Item API\n\nThe item API allows you to create, read, delete, and search items.\n\n#### Create an item\n\n`write` permission is required.\n\n```bash\ncurl -X POST \\\n    -H \"Content-Type: application/json\" \\\n    -H \"Authorization: YOUR_TOKEN\" \\\n    -d '{ \"data\": { \"text\": \"the content of the item\" } }' \\\n    https://kvec.yourdomain.com/api/item\n```\n\n```json\n{\n    \"id\": \"ITEM_ID\"\n}\n```\n\n\u003e This will create a new item with the text `the content of the item`.\n\nExamples for other types of items:\n\n\u003e Web page. The `BaseTextAdapter` will automatically fetch and use the page title and description as the page feature.\n\n```json\n{\n    \"data\": {\n        \"page\": \"https://github.com/JacobLinCool/kvec\"\n    }\n}\n```\n\n\u003e Image. 
`HF_API_TOKEN` environment variable is required.\n\u003e It uses Hugging Face's inference API to transform the image into text as the image feature.\n\u003e The default model is `nlpconnect/vit-gpt2-image-captioning`, but you can specify a different model by setting the `HF_IMGCAP_MODEL` environment variable.\n\u003e `http://`, `https://`, and `data:` are supported.\n\n```json\n{\n    \"data\": {\n        \"img\": \"https://kvec.pages.dev/icon.png\"\n    }\n}\n```\n\n#### Read an item\n\n`read` permission is required.\n\n```bash\ncurl -X GET \\\n    -H \"Content-Type: application/json\" \\\n    -H \"Authorization: YOUR_TOKEN\" \\\n    https://kvec.yourdomain.com/api/item/\u003cITEM_ID\u003e\n```\n\n```json\n{\n    \"item\": {\n        \"id\": \"ITEM_ID\",\n        \"data\": {\n            \"text\": \"the content of the item\"\n        },\n        \"meta\": {\n            \"type\": \"text\"\n        }\n    }\n}\n```\n\n#### Delete an item\n\n`write` permission is required.\n\n```bash\ncurl -X DELETE \\\n    -H \"Content-Type: application/json\" \\\n    -H \"Authorization: YOUR_TOKEN\" \\\n    https://kvec.yourdomain.com/api/item/\u003cITEM_ID\u003e\n```\n\n```json\n{\n    \"deleted\": true,\n    \"item\": {\n        \"id\": \"ITEM_ID\",\n        \"data\": {\n            \"text\": \"the content of the item\"\n        },\n        \"meta\": {\n            \"type\": \"text\"\n        }\n    }\n}\n```\n\n#### Search items\n\nIt performs a semantic search to find items that are similar to the query.\n\n`read` permission is required.\n\n```bash\ncurl -X GET \\\n    -H \"Content-Type: application/json\" \\\n    -H \"Authorization: YOUR_TOKEN\" \\\n    https://kvec.yourdomain.com/api/item?q=\u003cQUERY\u003e\n```\n\n```json\n{\n    \"items\": [\n        {\n            \"id\": \"ITEM_ID_1\",\n            \"data\": {\n                \"text\": \"the content of item 1\"\n            },\n            \"meta\": {\n                \"type\": \"text\"\n            }\n        },\n     
   {\n            \"id\": \"ITEM_ID_2\",\n            \"data\": {\n                \"text\": \"the content of item 2\",\n                \"page\": \"https://example.com/item2\"\n            },\n            \"meta\": {\n                \"type\": \"page\"\n            }\n        },\n        {\n            \"id\": \"ITEM_ID_3\",\n            \"data\": {\n                \"text\": \"the content of item 3\",\n                \"img\": \"https://example.com/item3.png\"\n            },\n            \"meta\": {\n                \"type\": \"img\"\n            }\n        }\n    ]\n}\n```\n\n## Structure\n\nThe KVec structure is mainly based on 6 components:\n\n- The **API** and GUI layer, which allows other services to interact with KVec easily and manages authorization.\n- The **Adapter** layer, which is responsible for adapting the data from the API layer to the encoder layer.\n- The **Encoder** layer, which is responsible for encoding the data into vectors (embeddings).\n- The **ObjStore** layer, which is responsible for storing the items themselves.\n- The **VecStore** layer, which is responsible for storing the vectors and performing the search.\n- The **Cache** layer, which is responsible for caching the search results.\n\nThe Adapter, Encoder, ObjStore, VecStore, and Cache layers are all pluggable, so you can easily customize them to fit your needs.\n\nCurrently, the following implementations are available:\n\n- **Adapter**\n  - [x] `BaseTextAdapter`: Support text, web page, and image _(env `HF_API_TOKEN` required)_ items.\n- **Encoder**\n  - [x] `OpenAIEncoder`: Use [OpenAI's `text-embedding-ada-002`](https://platform.openai.com/docs/guides/embeddings) to create embeddings\n  - [x] `CohereEncoder`: Use [Cohere](https://docs.cohere.ai/docs/embeddings) as the embedding service\n  - [x] `JustEncoder`: Only for local development\n- **ObjStore**\n  - [x] `CloudflareKVObjStore`: Use [Cloudflare KV](https://www.cloudflare.com/products/workers-kv/) as the object store 
backend\n  - [x] `UpstashRedisObjStore`: Use [Upstash Redis](https://upstash.com/) as the object store backend\n  - [x] `CouchDBObjStore`: Use [CouchDB](https://couchdb.apache.org/) as the object store backend\n  - [x] `MemoryObjStore`: Only for local development\n- **VecStore**\n  - [x] `PineconeVecStore`: Use [Pinecone](https://www.pinecone.io/) as the vector store backend\n  - [x] `QdrantVecStore`: Use [Qdrant](https://qdrant.tech/) as the vector store backend\n  - [x] `MemoryVecStore`: Only for local development\n- **Cache**\n  - [x] `CloudflareCache`: Use Cloudflare's Cache API\n  - [x] `MemoryCache`: Only for local development; it does not cache anything\n\n\u003e The auto module will automatically load the correct implementation based on the environment variables.\n\u003e See [src/lib/server/auto/index.ts](./src/lib/server/auto/index.ts)\n\u003e You can also take a look at [the README of each module](./src/lib/server) to see how to configure them.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjacoblincool%2Fkvec","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fjacoblincool%2Fkvec","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjacoblincool%2Fkvec/lists"}