https://github.com/sdairs/bluebird
Pushes the Bluesky Firehose to various data services
bluesky kafka redpanda tinybird
- Host: GitHub
- URL: https://github.com/sdairs/bluebird
- Owner: sdairs
- License: mit
- Created: 2024-11-16T15:54:24.000Z (about 1 year ago)
- Default Branch: main
- Last Pushed: 2025-02-03T11:47:26.000Z (about 1 year ago)
- Last Synced: 2025-03-24T13:02:22.723Z (11 months ago)
- Topics: bluesky, kafka, redpanda, tinybird
- Language: TypeScript
- Homepage:
- Size: 395 KB
- Stars: 8
- Watchers: 1
- Forks: 3
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# Bluebird
Bluebird is a CLI that consumes the Bluesky firehose and sends it to a downstream destination.
## Destinations
- Tinybird
- Kafka
- ClickHouse
- Timeplus
## Usage
You can use `npx` (or `pnpm dlx`) to run the CLI without installing it.
Or use `npm install -g @sdairs/bluebird` to install it globally.
Alternatively, you can run it with the provided Docker image `ghcr.io/sdairs/bluebird:latest`. You can either pass arguments on the command line (shown below) or use environment variables to control the behavior. See the [documentation here](USING_DOCKER.md) for more details.
### Tinybird
```
npx @sdairs/bluebird start tinybird --token e.XXX --endpoint https://api.tinybird.co --datasource bluebird_feed
```
```
docker run --rm ghcr.io/sdairs/bluebird:latest start tinybird --token e.XXX --endpoint https://api.tinybird.co --datasource bluebird_feed
```
### Kafka
```
npx @sdairs/bluebird start kafka --brokers broker:9092 --topic bluebird --username user --password pass --sasl-mechanism scram-sha-512 --batch-size 819200
```
```
docker run --rm ghcr.io/sdairs/bluebird:latest start kafka --brokers broker:9092 --topic bluebird --username user --password pass --sasl-mechanism scram-sha-512 --batch-size 819200
```
### ClickHouse
```
npx @sdairs/bluebird start clickhouse --url http://localhost:8123 --database default --table bluebird
```
```
docker run --rm ghcr.io/sdairs/bluebird:latest start clickhouse --url http://localhost:8123 --database default --table bluebird
```
### Timeplus
You can create a free account at https://us-west-2.timeplus.cloud, then follow the guide at https://docs.timeplus.com/apikey to create an API token. The `stream` is created automatically if it does not exist. This also works with self-hosted Timeplus Enterprise; in that case, set the token to `username:password`.
```
npx @sdairs/bluebird start timeplus --token XXX --endpoint https://us-west-2.timeplus.cloud/ws_id --stream bluebird
```
```
docker run --rm ghcr.io/sdairs/bluebird:latest start timeplus --token XXX --endpoint https://us-west-2.timeplus.cloud/ws_id --stream bluebird
```
## CLI development
The CLI is built with [oclif](https://oclif.io).
### Writing a new destination
Add a new directory under `src/destinations`, e.g., `src/destinations/my_destination`.
Create your destination class, e.g., `src/destinations/my_destination/my_destination.ts`. This must export a class that extends `Destination` from `src/destinations/base.ts`.
There are two methods you can override, `init` and `send`.
- `init`: **optional** - called once when the destination is first created
- `send`: **required** - called every time a batch is ready to be processed. This is where you should handle sending events to the downstream destination.
Here's the template to start from:
```
import { Destination } from '../base.js';
import { Event } from '../../lib/types.js';

interface MyDestinationConfig {
    someConfig: string
}

export class MyDestination extends Destination {
    private config: MyDestinationConfig;

    constructor(config: MyDestinationConfig) {
        super();
        this.config = config;
    }

    // Optional: one-time setup, called when the destination is first created
    async init(): Promise<void> {
    }

    // Required: called with each batch of events that is ready to be sent
    async send(events: Event[]): Promise<void> {
    }
}
```
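As a rough illustration of how the template might be filled in, the sketch below serialises each batch as newline-delimited JSON and hands it to a writer function. It is not taken from the Bluebird source: `FirehoseEvent` and the local `Destination` class are stand-ins for the real `Event` type and base class (their shapes are assumed so the example compiles on its own), and `NdjsonDestination` and its `write` option are hypothetical names.

```typescript
// Stand-ins for the real imports from '../base.js' and '../../lib/types.js'.
// Their shapes are assumptions made so this sketch is self-contained.
type FirehoseEvent = Record<string, unknown>;

abstract class Destination {
    async init(): Promise<void> {}
    abstract send(events: FirehoseEvent[]): Promise<void>;
}

interface NdjsonDestinationConfig {
    // Hypothetical sink; a real destination would call an HTTP client or SDK here.
    write: (payload: string) => Promise<void>;
}

class NdjsonDestination extends Destination {
    private config: NdjsonDestinationConfig;
    private ready = false;

    constructor(config: NdjsonDestinationConfig) {
        super();
        this.config = config;
    }

    // One-time setup: a natural place to open connections or create tables.
    async init(): Promise<void> {
        this.ready = true;
    }

    // Per-batch work: serialise the events and pass them to the sink.
    async send(events: FirehoseEvent[]): Promise<void> {
        if (!this.ready) {
            throw new Error('init() must be called before send()');
        }
        if (events.length === 0) {
            return; // nothing to write for an empty batch
        }
        const payload = events.map((e) => JSON.stringify(e)).join('\n');
        await this.config.write(payload);
    }
}
```

Passing the sink in through the config keeps `send` easy to exercise without a live downstream service; in a real destination the `write` call would be replaced by the service's own client.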
### Adding the destination to the CLI start command
Add the new destination as a subcommand to the `start` topic. Create a new file in `src/commands/start/` named after your destination, e.g., `src/commands/start/my_destination.ts`.
Here's the template to start from:
```
import { Flags } from '@oclif/core';
import { BaseStartCommand } from './base.js';
import { MyDestination } from '../../destinations/my_destination/my_destination.js';
import { Destination } from '../../destinations/base.js';

export default class StartMyDestination extends BaseStartCommand {
    static description = 'Send the Bluebird feed to My Destination';

    static examples = [
        '<%= config.bin %> <%= command.id %> --some-config XXX',
    ];

    static flags = {
        // oclif uses the object key as the flag name, so this is passed as --some-config
        'some-config': Flags.string({
            description: 'Some flag',
            char: 'c',
            required: true,
        }),
    };

    protected createDestination(flags: Record<string, any>): Destination {
        return new MyDestination({
            someConfig: flags['some-config'],
        });
    }
}
```
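How `BaseStartCommand` consumes the destination returned by `createDestination` is not shown in this README. The sketch below is only a hypothetical driver that illustrates the contract described above, with `init` called once and `send` called per batch. The names `DestinationLike`, `chunk`, and `drive` are invented for this example, and the fixed-size batching is an assumption rather than Bluebird's actual batching logic.

```typescript
// Assumed event shape; the real type lives in 'src/lib/types.ts'.
type FirehoseEvent = Record<string, unknown>;

// Minimal view of a destination: just the two lifecycle methods.
interface DestinationLike {
    init(): Promise<void>;
    send(events: FirehoseEvent[]): Promise<void>;
}

// Hypothetical helper: split a list into consecutive batches of at most `size` items.
function chunk<T>(items: T[], size: number): T[][] {
    if (size <= 0) {
        throw new Error('size must be positive');
    }
    const out: T[][] = [];
    for (let i = 0; i < items.length; i += size) {
        out.push(items.slice(i, i + size));
    }
    return out;
}

// Hypothetical driver: initialise once, then send each batch in order.
// Returns the number of batches that were sent.
async function drive(
    destination: DestinationLike,
    events: FirehoseEvent[],
    batchSize: number,
): Promise<number> {
    await destination.init();
    let batches = 0;
    for (const batch of chunk(events, batchSize)) {
        await destination.send(batch);
        batches += 1;
    }
    return batches;
}
```

Awaiting each `send` before starting the next keeps batches in order and lets a failing destination stop the loop by throwing.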
### Building the CLI for dev
Install deps and build the CLI:
```
pnpm install
pnpm build
```
### Running the CLI for dev
From within the `cli` directory:
```
./bin/dev.js start [OPTS]
```
For example:
```
./bin/dev.js start tinybird --token e.XXX --endpoint https://api.tinybird.co --datasource bluebird_feed
```
Note: you need to do this inside the `cli` dir if you don't have `ts-node` installed globally. If you see a module not found error for `ts-node`, this is why.
## Contributing
Just submit a PR - all contributions are welcome!