https://github.com/query-farm/tributary
A DuckDB Extension for Kafka
apache-kafka apache-kafka-consumer apache-kafka-producer duckdb duckdb-extension kafka streaming
- Host: GitHub
- URL: https://github.com/query-farm/tributary
- Owner: Query-farm
- License: mit
- Created: 2025-06-08T02:28:11.000Z (11 months ago)
- Default Branch: main
- Last Pushed: 2025-06-12T03:35:06.000Z (11 months ago)
- Last Synced: 2025-06-12T04:31:07.025Z (11 months ago)
- Topics: apache-kafka, apache-kafka-consumer, apache-kafka-producer, duckdb, duckdb-extension, kafka, streaming
- Language: C++
- Homepage: https://query.farm/duckdb_extension_tributary.html
- Size: 23.4 KB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: docs/README.md
- License: LICENSE
# DuckDB Tributary Extension
The **Tributary** extension provides seamless integration between DuckDB and [Apache Kafka](https://kafka.apache.org/), enabling real-time querying and analysis of streaming data. With this extension, users can consume messages directly from Kafka topics into DuckDB for immediate processing, as well as write processed data back to Kafka streams.
## Key Features
- **Direct Kafka Ingestion:** Stream records from Kafka topics directly into DuckDB tables using SQL.
- **Flexible Topic Consumption:** Supports consuming from specific partitions, offsets, or continuously from the latest messages.
- **Real-Time Analytics:** Perform analytical queries on streaming data as it arrives.
- **Kafka Output:** Optionally write results or processed data back to Kafka topics.
- **SQL-Native Interface:** Kafka integration is fully accessible via SQL, enabling easy adoption for data engineers and analysts.
## Example Usage
```sql
-- Scan an entire Kafka topic.
SELECT *
FROM tributary_scan_topic('test-topic',
"bootstrap.servers" := 'localhost:9092'
);
```
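To query the same messages repeatedly without re-reading the topic, the scan result can be materialized into a regular DuckDB table. This is a minimal sketch, assuming a broker is reachable at `localhost:9092`, that `test-topic` exists, and that the scan returns once it reaches the end of the topic. The table name `test_topic_snapshot` is hypothetical.

```sql
-- Assumption: a Kafka broker is listening on localhost:9092 and 'test-topic' exists.
-- 'test_topic_snapshot' is a hypothetical table name chosen for illustration.
CREATE TABLE test_topic_snapshot AS
SELECT *
FROM tributary_scan_topic('test-topic',
    "bootstrap.servers" := 'localhost:9092'
);

-- Inspect the columns the scan produced rather than assuming their names.
DESCRIBE test_topic_snapshot;

-- Count the messages captured in the snapshot.
SELECT count(*) AS message_count
FROM test_topic_snapshot;
```

Running `DESCRIBE` first avoids guessing at the result schema. The extension documentation remains the authoritative reference for the returned columns.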
## Documentation
See the [extension documentation](https://query.farm/duckdb_extension_tributary.html).
## Building
### Managing dependencies
DuckDB extensions use VCPKG for dependency management. To enable VCPKG, follow the [installation instructions](https://vcpkg.io/en/getting-started) or run the following:
```shell
cd
git clone https://github.com/Microsoft/vcpkg.git
sh ./vcpkg/scripts/bootstrap.sh -disableMetrics
export VCPKG_TOOLCHAIN_PATH=`pwd`/vcpkg/scripts/buildsystems/vcpkg.cmake
```
Note: VCPKG is only required for extensions that rely on it for dependency management. If you want to develop an extension without dependencies, or manage dependencies yourself, skip this step. The example extension uses VCPKG to build with a dependency for instructive purposes, so if you skip this step the build may not work until that dependency is removed.
### Build steps
Now to build the extension, run:
```sh
make
```
The main binaries that will be built are:
```sh
./build/release/duckdb
./build/release/test/unittest
./build/release/extension//.duckdb_extension
```
- `duckdb` is the binary for the DuckDB shell with the extension code automatically loaded.
- `unittest` is DuckDB's test runner. Again, the extension is already linked into the binary.
- `.duckdb_extension` is the loadable binary as it would be distributed.
### Tips for speedy builds
DuckDB extensions currently rely on DuckDB's build system for easy testing and distribution. The downside is that DuckDB and its unittest binary are built every time you build your extension. To mitigate this, we highly recommend installing [ccache](https://ccache.dev/) and [ninja](https://ninja-build.org/). With both in place, core DuckDB only needs to be built once and subsequent rebuilds are fast.
To build using ninja and ccache, ensure both are installed and run:
```sh
GEN=ninja make
```
## Running the extension
To run the extension code, simply start the shell with `./build/release/duckdb`. This shell will have the extension pre-loaded.
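Once inside the shell, one way to confirm the extension is available is to query DuckDB's `duckdb_extensions()` table function. This sketch assumes the extension is registered under the name `tributary`, which the text above does not state explicitly.

```sql
-- Assumption: the extension is registered under the name 'tributary'.
-- duckdb_extensions() lists the extensions known to this DuckDB build.
SELECT extension_name, loaded, installed
FROM duckdb_extensions()
WHERE extension_name = 'tributary';
```

A row with `loaded` set to `true` would indicate that the extension was linked into the shell. An empty result would suggest the name assumption is wrong or the build did not include the extension.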
## Running the tests
Different tests can be created for DuckDB extensions. The primary way of testing DuckDB extensions should be the SQL tests in `./test/sql`. These SQL tests can be run using:
```sh
make test
```