Databricks's Zerobus Ingest SDKs
- Host: GitHub
- URL: https://github.com/databricks/zerobus-sdk
- Owner: databricks
- License: apache-2.0
- Created: 2025-09-03T13:11:21.000Z (8 months ago)
- Default Branch: main
- Last Pushed: 2026-04-21T05:06:46.000Z (6 days ago)
- Last Synced: 2026-04-21T06:42:01.081Z (6 days ago)
- Topics: databricks, databricks-sdk, ffi, go, java, jni, napi-rs, pyo3, python, rust, typescript, zerobus, zerobus-sdk
- Language: Rust
- Homepage:
- Size: 93.5 MB
- Stars: 73
- Watchers: 4
- Forks: 12
- Open Issues: 72
Metadata Files:
- Readme: README.md
- Contributing: CONTRIBUTING.md
- License: LICENSE
- Security: SECURITY.md
- Dco: DCO
README
# Zerobus SDKs
Monorepo for Databricks Zerobus Ingest SDKs.
## Disclaimer
[GA](https://docs.databricks.com/release-notes/release-types.html): This SDK is generally available and supported for production use cases. Minor and patch version updates will not contain breaking changes. Major version updates may include breaking changes.
We are keen to hear feedback from you. Please [file issues](https://github.com/databricks/zerobus-sdk/issues), and we will address them.
## What is Zerobus?
Zerobus is a high-throughput streaming service for direct data ingestion into Databricks Delta tables, optimized for real-time data pipelines and high-volume workloads.
## SDKs
| Language | Directory | Package |
|----------|-----------|---------|
| Rust | [`rust/`](rust/) | [`databricks-zerobus-ingest-sdk`](https://crates.io/crates/databricks-zerobus-ingest-sdk) |
| Python | [`python/`](python/) | [`databricks-zerobus-ingest-sdk`](https://pypi.org/project/databricks-zerobus-ingest-sdk/) |
| Go | [`go/`](go/) | [`github.com/databricks/zerobus-sdk/go`](https://pkg.go.dev/github.com/databricks/zerobus-sdk/go) |
| TypeScript | [`typescript/`](typescript/) | [`@databricks/zerobus-ingest-sdk`](https://www.npmjs.com/package/@databricks/zerobus-ingest-sdk) |
| Java | [`java/`](java/) | [`com.databricks:zerobus-ingest-sdk`](https://central.sonatype.com/artifact/com.databricks/zerobus-ingest-sdk) |
## Platform Support
We try to provide prebuilt native binaries for the following platforms:
| Platform | Architecture |
|----------|-------------|
| Linux | x86_64 |
| Linux | aarch64 |
| Windows | x86_64 |
| macOS | x86_64 |
| macOS | aarch64 (Apple Silicon) |
> **Note:** We do not currently have macOS CI runners, so macOS binaries are built locally and may not be available for every SDK or release. If your platform is not supported or you encounter compatibility issues, you can [build from source](CONTRIBUTING.md) or [file an issue](https://github.com/databricks/zerobus-sdk/issues).
## Prerequisites
Before using any SDK, you need the following:
### 1. Workspace URL and Workspace ID
After logging into your Databricks workspace, look at the browser URL:
```
https://<workspace-name>.cloud.databricks.com/o=<workspace-id>
```
- **Workspace URL**: The part before `/o=` (e.g., `https://dbc-a1b2c3d4-e5f6.cloud.databricks.com`)
- **Workspace ID**: The part after `/o=` (e.g., `1234567890123456`)
> **Note:** The examples above show AWS endpoints (`.cloud.databricks.com`). For Azure deployments, the workspace URL will be `https://<workspace-name>.azuredatabricks.net`.
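Splitting the browser URL into these two values is mechanical. A minimal sketch in Python (the helper name and example URL are illustrative, not part of the SDK):

```python
def split_workspace_url(browser_url: str) -> tuple[str, str]:
    """Split a Databricks browser URL into (workspace URL, workspace ID)."""
    # Everything before "/o=" is the workspace URL; the value after it
    # (up to any further path, query, or fragment) is the workspace ID.
    workspace_url, _, rest = browser_url.partition("/o=")
    workspace_id = rest.split("/")[0].split("?")[0].split("#")[0]
    return workspace_url, workspace_id

url = "https://dbc-a1b2c3d4-e5f6.cloud.databricks.com/o=1234567890123456"
workspace_url, workspace_id = split_workspace_url(url)
```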
### 2. Create a Delta Table
Create a table using Databricks SQL:
```sql
CREATE TABLE <catalog>.default.<table> (
device_name STRING,
temp INT,
humidity BIGINT
)
USING DELTA;
```
Replace `<catalog>` with your catalog name (e.g., `main`) and `<table>` with your table name.
### 3. Create a Service Principal
1. Navigate to **Settings > Identity and Access** in your Databricks workspace
2. Click **Service principals** and create a new service principal
3. Generate a new secret for the service principal and save it securely
4. Grant the following permissions:
- `USE_CATALOG` on the catalog (e.g., `main`)
- `USE_SCHEMA` on the schema (e.g., `default`)
- `MODIFY` and `SELECT` on the table
Grant permissions using SQL:
```sql
-- Grant catalog permission
GRANT USE CATALOG ON CATALOG <catalog> TO `<application-id>`;
-- Grant schema permission
GRANT USE SCHEMA ON SCHEMA <catalog>.default TO `<application-id>`;
-- Grant table permissions
GRANT SELECT, MODIFY ON TABLE <catalog>.default.<table> TO `<application-id>`;
```
The service principal's **Application ID** is your OAuth **Client ID**, and the generated secret is your **Client Secret**.
## Serialization Formats
All SDKs support two serialization formats:
- **JSON** - Simple, schema-free ingestion. Pass a JSON string or native object (dict, map, etc.) and the SDK serializes it. No compilation step required. Good for getting started or dynamic schemas.
- **Protocol Buffers** - Strongly-typed, schema-validated ingestion. More efficient over the wire. Recommended for production workloads.
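For the JSON path, a record is just a native object (or its JSON string form) whose keys match the target table's column names. A minimal sketch using the example table from the prerequisites (the record values are made up; the exact SDK call is language-specific and not shown here):

```python
import json

# A record matching the example table:
# device_name STRING, temp INT, humidity BIGINT
record = {"device_name": "sensor-01", "temp": 22, "humidity": 4500}

# SDKs that accept a JSON string would receive something like this;
# others take the dict/map directly and serialize it internally.
payload = json.dumps(record)
```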
### Protocol Buffers
Use `proto2` syntax with `optional` fields to correctly represent nullable Delta table columns.
#### Delta → Protobuf Type Mappings
| Delta Type | Proto2 Type |
|-----------|-------------|
| TINYINT, BYTE, INT, SMALLINT, SHORT | int32 |
| BIGINT, LONG | int64 |
| FLOAT | float |
| DOUBLE | double |
| STRING, VARCHAR | string |
| BOOLEAN | bool |
| BINARY | bytes |
| DATE | int32 |
| TIMESTAMP, TIMESTAMP_NTZ | int64 |
| ARRAY\<type\> | repeated type |
| MAP\<key, value\> | map\<key, value\> |
| STRUCT\<fields\> | nested message |
| VARIANT | string (JSON string) |
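Applying these mappings to the example table from the prerequisites (`device_name STRING, temp INT, humidity BIGINT`) gives a schema along these lines; the message and field names are illustrative:

```protobuf
syntax = "proto2";

message SensorReading {
  // STRING -> string, INT -> int32, BIGINT -> int64;
  // `optional` fields represent nullable Delta columns.
  optional string device_name = 1;
  optional int32 temp = 2;
  optional int64 humidity = 3;
}
```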
### Schema Generation
Instead of writing `.proto` files by hand, each SDK ships a tool to generate protobuf schemas directly from an existing Unity Catalog table. See the individual SDK READMEs for language-specific usage.
## HTTP Proxy Support
All SDKs support HTTP CONNECT proxies via environment variables, following gRPC core conventions. The first variable found (in order) is used:
| Proxy | No-proxy |
|-------|----------|
| `grpc_proxy` / `GRPC_PROXY` | `no_grpc_proxy` / `NO_GRPC_PROXY` |
| `https_proxy` / `HTTPS_PROXY` | `no_proxy` / `NO_PROXY` |
| `http_proxy` / `HTTP_PROXY` | |
The `no_proxy` value is a comma-separated list of hostnames (suffix-matched) or `*` to bypass the proxy entirely.
```bash
export https_proxy=http://my-proxy:8080
export no_proxy=localhost,127.0.0.1
```
The SDK establishes a plaintext HTTP CONNECT tunnel through the proxy, then performs a TLS handshake end-to-end with the Databricks server. The proxy never sees decrypted traffic.
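The lookup order and `no_proxy` matching described above can be sketched as follows. This is a rough illustration of the selection logic, not the SDKs' actual implementation; in particular, the pairing of specific proxy and no-proxy variables may differ in detail:

```python
import os
from typing import Mapping, Optional

# Environment variables in the order they are checked (first match wins).
PROXY_VARS = ("grpc_proxy", "GRPC_PROXY", "https_proxy", "HTTPS_PROXY",
              "http_proxy", "HTTP_PROXY")
NO_PROXY_VARS = ("no_grpc_proxy", "NO_GRPC_PROXY", "no_proxy", "NO_PROXY")

def resolve_proxy(host: str, env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return the proxy URL to use for `host`, or None to connect directly."""
    proxy = next((env[v] for v in PROXY_VARS if env.get(v)), None)
    if proxy is None:
        return None  # no proxy configured
    no_proxy = next((env[v] for v in NO_PROXY_VARS if env.get(v)), "")
    if no_proxy.strip() == "*":
        return None  # "*" bypasses the proxy for every host
    for entry in no_proxy.split(","):
        entry = entry.strip()
        if entry and host.endswith(entry):
            return None  # hostname suffix match bypasses the proxy
    return proxy
```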
## Contributing
See [CONTRIBUTING.md](CONTRIBUTING.md). Each SDK also has its own contributing guide with language-specific setup instructions.
## License
This project is licensed under the Apache License 2.0. See [LICENSE](LICENSE) for the full text.