https://github.com/tm9657/serverless-cloudflare-search
Using Cloudflare Workers + Queues + R2 Storage + Cache to implement a small, scale-to-zero search system that is reasonably fast and cheap.
- Host: GitHub
- URL: https://github.com/tm9657/serverless-cloudflare-search
- Owner: TM9657
- License: other
- Created: 2023-04-04T13:47:13.000Z (almost 3 years ago)
- Default Branch: main
- Last Pushed: 2023-05-23T11:24:12.000Z (over 2 years ago)
- Last Synced: 2025-03-26T11:44:52.914Z (11 months ago)
- Topics: cloudflare, search, serverless
- Language: TypeScript
- Homepage: https://tm9657.de
- Size: 178 KB
- Stars: 106
- Watchers: 4
- Forks: 2
- Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE.txt
README
# 🔍🌩️ Serverless Search on Cloudflare
Using Cloudflare Workers + Queues + R2 Storage + Cache to implement a small, scale-to-zero search system that is reasonably fast and cheap.
Benchmarks to measure performance are welcome :)
Endpoints:
- **search** - public
- **index** - access restricted (see config)
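The index is stored in R2 and cached. A search request reads the cached index; the first request has to download it from R2.
Index writes go through a queue: Queue -> Writing Index (Batch size and concurrency 0).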
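
The flow above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: `Doc`, `FakeBucket`, `indexCache`, `handleIndexBatch`, and `search` are hypothetical names, the bucket is a synchronous in-memory stand-in for R2 (real R2 calls are asynchronous), and plain substring matching stands in for the real search library.

```typescript
// Hypothetical document shape; the real project supports generic indices.
type Doc = { id: string; text: string };

// Synchronous in-memory stand-in for the R2 bucket (real R2 calls are async).
// It counts reads so the effect of the cache is visible.
class FakeBucket {
  reads = 0;
  private objects = new Map<string, Doc[]>();

  get(name: string): Doc[] | undefined {
    this.reads += 1;
    return this.objects.get(name);
  }

  put(name: string, docs: Doc[]): void {
    this.objects.set(name, docs);
  }
}

// Stand-in for the cache layer, keyed by index name.
const indexCache = new Map<string, Doc[]>();

// Write path: a queue consumer receives a batch of documents, merges them
// into the stored index by id, and drops the stale cache entry.
function handleIndexBatch(bucket: FakeBucket, name: string, batch: Doc[]): void {
  const merged = new Map<string, Doc>();
  for (const doc of bucket.get(name) ?? []) merged.set(doc.id, doc);
  for (const doc of batch) merged.set(doc.id, doc);
  bucket.put(name, Array.from(merged.values()));
  indexCache.delete(name);
}

// Read path: use the cached index if present, otherwise load it from the
// bucket once and keep it for later requests.
function search(bucket: FakeBucket, name: string, query: string): Doc[] {
  let docs = indexCache.get(name);
  if (docs === undefined) {
    docs = bucket.get(name) ?? [];
    indexCache.set(name, docs);
  }
  const needle = query.toLowerCase();
  return docs.filter((doc) => doc.text.toLowerCase().includes(needle));
}
```

In this sketch only the first `search` call for an index name touches the bucket; later calls are served from `indexCache` until a new batch is written.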
## Features
- Generic Index support
- Multiple Parallel indices per endpoint (infinite)
- Good performance for smaller datasets (up to roughly 50k documents, as a guess; feel free to create a better benchmark!)
## Setup
Create a `.env` file in the project root with the following parameters:
```
CLOUDFLARE_AUTH_KEY=
CLOUDFLARE_AUTH_EMAIL=
```
> - `pnpm install` ➡️ populates your config with a strong secret
> - `pnpm run initialize` ➡️ creates the bucket and queue
> - `npx turbo build` ➡️ publishes your workers to Cloudflare
## "Benchmark"
This project is meant for smaller datasets (cheap serverless search).
For a movie dataset with **17920 documents**, the first search takes *800 ms* (downloading the index from R2); after that, worker performance is *50-60 ms per search*.
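
To get a feel for how the one-time index download amortizes, here is a small arithmetic sketch. It assumes exactly one cold request followed by warm ones on the same worker instance, and it uses 55 ms as an assumed midpoint of the 50-60 ms range; `averageLatencyMs` is a hypothetical helper, not part of the project.

```typescript
// Average latency per search when one cold request (index download from R2)
// is followed by warm requests. Defaults come from the figures above;
// warmMs = 55 is an assumed midpoint of the quoted 50-60 ms range.
function averageLatencyMs(searches: number, coldMs = 800, warmMs = 55): number {
  if (!Number.isInteger(searches) || searches < 1) {
    throw new RangeError("searches must be a positive integer");
  }
  return (coldMs + (searches - 1) * warmMs) / searches;
}
```

Under these assumptions, `averageLatencyMs(10)` evaluates to 129.5, so after ten searches the average is already much closer to the warm figure than to the cold one.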
## Todo
- Alternative FlexSearch implementation (problems with export/import and types)
- Investigate Durable Objects for a faster initial response
- Add serverless setup for AWS deployment
- Add CLI tool
**Provided by TM9657 GmbH with ❤️**
### Check out some of our products:
- [Kwirk.io](https://kwirk.io?ref=github) (Text Editor with AI integration, privacy focus and offline support)
