https://github.com/tm9657/serverless-cloudflare-search

Uses Cloudflare Workers + Queues + R2 Storage + Cache to implement a small, scale-to-zero search system that is reasonably fast and cheap.

Host: GitHub
URL: https://github.com/tm9657/serverless-cloudflare-search
Owner: TM9657
License: other
Created: 2023-04-04T13:47:13.000Z (almost 3 years ago)
Default Branch: main
Last Pushed: 2023-05-23T11:24:12.000Z (over 2 years ago)
Last Synced: 2025-03-26T11:44:52.914Z (11 months ago)
Topics: cloudflare, search, serverless
Language: TypeScript
Homepage: https://tm9657.de
Size: 178 KB
Stars: 106
Watchers: 4
Forks: 2
Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE.txt

# 🔍🌩️ Serverless Search on Cloudflare
Uses Cloudflare Workers + Queues + R2 Storage + Cache to implement a small, scale-to-zero search system that is reasonably fast and cheap.
Benchmarks that measure its performance are welcome :)

Endpoints:
- **search** - public
- **index** - access restricted (see config)
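
As a rough illustration of the split between the two endpoints, the sketch below answers `search` for anyone and `index` only for callers that present a shared secret. The `route` function, the `/search` and `/index` paths, the `Authorization: Bearer` scheme, and the status codes are all assumptions made for this example; they are not taken from the project's handler.

```typescript
// Hypothetical request shape; a real Worker would read these from a Request object.
interface IncomingRequest {
  path: string;
  authorization?: string; // value of the Authorization header, if present
}

// Returns the HTTP status this sketch would answer with.
function route(req: IncomingRequest, secret: string): number {
  if (req.path === "/search") {
    return 200; // public: no credentials required
  }
  if (req.path === "/index") {
    // restricted: only callers that present the configured secret may write
    return req.authorization === `Bearer ${secret}` ? 202 : 401;
  }
  return 404; // anything else is unknown to this sketch
}
```

A production check would normally compare the secret in constant time; the plain string comparison here only keeps the example short.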

The index is saved in R2 and cached; each search request reads it from the cache.
Index writes go through a queue (batch size and concurrency 0).
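
The write path described above can be sketched as a queue consumer that folds a batch of messages into per-name indices. This is a minimal sketch under stated assumptions: the `IndexMessage` shape, `applyBatch`, and `search` are hypothetical names, a naive inverted index stands in for whatever search library the project uses, and persisting the result to R2 is left out.

```typescript
// Hypothetical message shape for the indexing queue.
interface IndexMessage {
  index: string; // name of the target index
  id: string;    // document id
  text: string;  // text to make searchable
}

// token -> ids of the documents that contain it
type InvertedIndex = Map<string, Set<string>>;

function tokenize(text: string): string[] {
  return text.toLowerCase().split(/[^a-z0-9]+/).filter((t) => t.length > 0);
}

// Apply one queue batch to the in-memory indices (one inverted index per name).
function applyBatch(indices: Map<string, InvertedIndex>, batch: IndexMessage[]): void {
  for (const msg of batch) {
    let idx = indices.get(msg.index);
    if (idx === undefined) {
      idx = new Map<string, Set<string>>();
      indices.set(msg.index, idx);
    }
    for (const token of tokenize(msg.text)) {
      let ids = idx.get(token);
      if (ids === undefined) {
        ids = new Set<string>();
        idx.set(token, ids);
      }
      ids.add(msg.id);
    }
  }
}

// Return the sorted ids of documents that contain every query token.
function search(indices: Map<string, InvertedIndex>, index: string, query: string): string[] {
  const idx = indices.get(index);
  const tokens = tokenize(query);
  if (idx === undefined || tokens.length === 0) return [];
  let result: string[] | undefined;
  for (const token of tokens) {
    const ids = idx.get(token) ?? new Set<string>();
    result = result === undefined ? Array.from(ids) : result.filter((id) => ids.has(id));
  }
  return (result ?? []).sort();
}
```

Keying the outer map by index name is one simple way to serve several independent indices from the same endpoint.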

## Features
- Generic Index support
- Multiple parallel indices per endpoint (no fixed limit)
- Good performance for smaller datasets (a rough, unbenchmarked guess is up to 50k documents; feel free to create a better benchmark!)

## Setup
Create a `.env` file in the project root with the following parameters:
```
CLOUDFLARE_AUTH_KEY=
CLOUDFLARE_AUTH_EMAIL=
```

> - `pnpm install` ➡️ populates your config with a strong secret
> - `pnpm run initialize` ➡️ creates the bucket and queue
> - `npx turbo build` ➡️ publishes your workers to Cloudflare

## "Benchmark"
This project is meant for smaller datasets (cheap serverless search).
For a movie dataset with **17920 documents**, the *first search takes 800 ms* (the index is downloaded from R2); after that, worker performance is *50-60 ms per search*.
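
The gap between the first and later searches is what one would expect when the downloaded index is reused instead of being fetched again. The sketch below shows that idea with an in-memory memo; `IndexCache` and the `load` callback are hypothetical, and the project itself may rely on Cloudflare's cache rather than anything like this class.

```typescript
// Memoizes index loads: the first call pays for the download,
// later calls (including concurrent ones) reuse the same promise.
class IndexCache<T> {
  private readonly entries = new Map<string, Promise<T>>();

  get(name: string, load: (name: string) => Promise<T>): Promise<T> {
    let entry = this.entries.get(name);
    if (entry === undefined) {
      entry = load(name); // slow path, e.g. a download from object storage
      // Drop failed loads so a later request can retry instead of reusing the error.
      entry.catch(() => this.entries.delete(name));
      this.entries.set(name, entry);
    }
    return entry; // fast path once the entry exists
  }
}
```

Storing the promise rather than the resolved value means two searches that arrive during the initial download share one fetch.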

## Todo
- Alternative FlexSearch implementation (currently blocked by problems with export / import and types)
- Investigate Durable Object for faster initial response
- Add serverless setup for AWS deployment
- Add CLI tool

**Provided by TM9657 GmbH with ❤️**
### Check out some of our products:
- [Kwirk.io](https://kwirk.io?ref=github) (Text Editor with AI integration, privacy focus and offline support)
