Databricks's Zerobus Ingest SDKs
- Host: GitHub
- URL: https://github.com/databricks/zerobus-sdk
- Owner: databricks
- License: other
- Created: 2025-09-03T13:11:21.000Z (7 months ago)
- Default Branch: main
- Last Pushed: 2026-03-02T14:51:53.000Z (about 1 month ago)
- Last Synced: 2026-03-02T14:53:12.888Z (about 1 month ago)
- Topics: databricks, databricks-sdk, rust, zerobus, zerobus-sdk
- Language: Rust
- Homepage:
- Size: 40.5 MB
- Stars: 24
- Watchers: 4
- Forks: 6
- Open Issues: 17
Metadata Files:
- Readme: README.md
- Contributing: CONTRIBUTING.md
- License: LICENSE
- Security: SECURITY.md
- Dco: DCO
# Zerobus SDKs
Monorepo for Databricks Zerobus Ingest SDKs.
## Disclaimer
[Public Preview](https://docs.databricks.com/release-notes/release-types.html): These SDKs are supported for production use cases and are available to all customers. Databricks is actively working on stabilizing the Zerobus Ingest SDKs. Minor version updates may include backwards-incompatible changes.
We are keen to hear feedback from you. Please [file issues](https://github.com/databricks/zerobus-sdk/issues), and we will address them.
## What is Zerobus?
Zerobus is a high-throughput streaming service for direct data ingestion into Databricks Delta tables, optimized for real-time data pipelines and high-volume workloads.
## SDKs
| Language | Directory | Package |
|----------|-----------|---------|
| Rust | [`rust/`](rust/) | [`databricks-zerobus-ingest-sdk`](https://crates.io/crates/databricks-zerobus-ingest-sdk) |
| Python | [`python/`](python/) | [`databricks-zerobus-ingest-sdk`](https://pypi.org/project/databricks-zerobus-ingest-sdk/) |
| Go | `go/` | *coming soon* |
| TypeScript | [`typescript/`](typescript/) | [`@databricks/zerobus-ingest-sdk`](https://www.npmjs.com/package/@databricks/zerobus-ingest-sdk) |
| Java | [`java/`](java/) | [`com.databricks:zerobus-ingest-sdk`](https://central.sonatype.com/artifact/com.databricks/zerobus-ingest-sdk) |
## Prerequisites
Before using any SDK, you need the following:
### 1. Workspace URL and Workspace ID
After logging into your Databricks workspace, look at the browser URL:
```
https://<workspace-instance>.cloud.databricks.com/o=<workspace-id>
```
- **Workspace URL**: The part before `/o=` (e.g., `https://dbc-a1b2c3d4-e5f6.cloud.databricks.com`)
- **Workspace ID**: The part after `/o=` (e.g., `1234567890123456`)
> **Note:** The examples above show AWS endpoints (`.cloud.databricks.com`). For Azure deployments, the workspace URL will be `https://<workspace-instance>.azuredatabricks.net`.
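As a sketch, the two values can be split out of the browser URL programmatically (`parse_workspace_url` is a hypothetical helper for illustration, not part of the SDK):

```python
def parse_workspace_url(browser_url: str) -> tuple[str, str]:
    """Split a Databricks browser URL into (workspace_url, workspace_id).

    The workspace URL is everything before "/o=", and the workspace ID
    is the run of digits after it (ignoring any trailing path or query).
    """
    base, _, rest = browser_url.partition("/o=")
    workspace_id = ""
    for ch in rest:
        if not ch.isdigit():
            break
        workspace_id += ch
    return base, workspace_id

url, wid = parse_workspace_url(
    "https://dbc-a1b2c3d4-e5f6.cloud.databricks.com/o=1234567890123456"
)
```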
### 2. Create a Delta Table
Create a table using Databricks SQL:
```sql
CREATE TABLE <catalog>.default.<table> (
device_name STRING,
temp INT,
humidity BIGINT
)
USING DELTA;
```
Replace `<catalog>` with your catalog name (e.g., `main`) and `<table>` with your table name.
### 3. Create a Service Principal
1. Navigate to **Settings > Identity and Access** in your Databricks workspace
2. Click **Service principals** and create a new service principal
3. Generate a new secret for the service principal and save it securely
4. Grant the following permissions:
- `USE_CATALOG` on the catalog (e.g., `main`)
- `USE_SCHEMA` on the schema (e.g., `default`)
- `MODIFY` and `SELECT` on the table
Grant permissions using SQL:
```sql
-- Grant catalog permission
GRANT USE CATALOG ON CATALOG <catalog> TO `<application-id>`;
-- Grant schema permission
GRANT USE SCHEMA ON SCHEMA <catalog>.default TO `<application-id>`;
-- Grant table permissions
GRANT SELECT, MODIFY ON TABLE <catalog>.default.<table> TO `<application-id>`;
```
The service principal's **Application ID** is your OAuth **Client ID**, and the generated secret is your **Client Secret**.
## Serialization Formats
All SDKs support two serialization formats:
- **JSON** - Simple, schema-free ingestion. Pass a JSON string or native object (dict, map, etc.) and the SDK serializes it. No compilation step required. Good for getting started or dynamic schemas.
- **Protocol Buffers** - Strongly-typed, schema-validated ingestion. More efficient over the wire. Recommended for production workloads.
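For the JSON path, a row is just a plain object. As a sketch, a record matching the example table from the prerequisites (values are illustrative) can be carried either as a native dict or as its serialized JSON string:

```python
import json

# A record matching the example table's columns (hypothetical values)
record = {"device_name": "sensor-01", "temp": 21, "humidity": 63}

# Either the native dict or its JSON string form can represent the row
payload = json.dumps(record)
```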
### Protocol Buffers
Use `proto2` syntax with `optional` fields to correctly represent nullable Delta table columns.
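For example, the three-column table from the prerequisites could be described with a `proto2` schema along these lines (the message and field names here are illustrative; use the SDK's schema generation tool, described below, for authoritative output):

```proto
syntax = "proto2";

message SensorReading {
  optional string device_name = 1;  // STRING  -> string
  optional int32  temp        = 2;  // INT     -> int32
  optional int64  humidity    = 3;  // BIGINT  -> int64
}
```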
#### Delta → Protobuf Type Mappings
| Delta Type | Proto2 Type |
|-----------|-------------|
| TINYINT, BYTE, INT, SMALLINT, SHORT | int32 |
| BIGINT, LONG | int64 |
| FLOAT | float |
| DOUBLE | double |
| STRING, VARCHAR | string |
| BOOLEAN | bool |
| BINARY | bytes |
| DATE | int32 |
| TIMESTAMP, TIMESTAMP_NTZ | int64 |
| ARRAY\<type\> | repeated type |
| MAP\<key, value\> | map\<key, value\> |
| STRUCT\<fields\> | nested message |
| VARIANT | string (JSON string) |
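The scalar rows of the mapping above can be captured as a simple lookup table (an illustrative sketch, not an SDK API):

```python
# Delta scalar type -> proto2 scalar type, per the mapping table above
DELTA_TO_PROTO2 = {
    "TINYINT": "int32", "BYTE": "int32", "INT": "int32",
    "SMALLINT": "int32", "SHORT": "int32",
    "BIGINT": "int64", "LONG": "int64",
    "FLOAT": "float", "DOUBLE": "double",
    "STRING": "string", "VARCHAR": "string",
    "BOOLEAN": "bool", "BINARY": "bytes",
    "DATE": "int32",
    "TIMESTAMP": "int64", "TIMESTAMP_NTZ": "int64",
    "VARIANT": "string",  # carried as a JSON string
}

def proto_type_for(delta_type: str) -> str:
    """Look up the proto2 scalar type for a Delta column type."""
    return DELTA_TO_PROTO2[delta_type.upper()]
```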
### Schema Generation
Instead of writing `.proto` files by hand, each SDK ships a tool to generate protobuf schemas directly from an existing Unity Catalog table. See the individual SDK READMEs for language-specific usage.
## HTTP Proxy Support
All SDKs support HTTP CONNECT proxies via environment variables, following gRPC core conventions. The first variable found (in order) is used:
| Proxy | No-proxy |
|-------|----------|
| `grpc_proxy` / `GRPC_PROXY` | `no_grpc_proxy` / `NO_GRPC_PROXY` |
| `https_proxy` / `HTTPS_PROXY` | `no_proxy` / `NO_PROXY` |
| `http_proxy` / `HTTP_PROXY` | |
The `no_proxy` value is a comma-separated list of hostnames (suffix-matched) or `*` to bypass the proxy entirely.
```bash
export https_proxy=http://my-proxy:8080
export no_proxy=localhost,127.0.0.1
```
The SDK establishes a plaintext HTTP CONNECT tunnel through the proxy, then performs a TLS handshake end-to-end with the Databricks server. The proxy never sees decrypted traffic.
## Contributing
See [CONTRIBUTING.md](CONTRIBUTING.md). Each SDK also has its own contributing guide with language-specific setup instructions.
## License
This project is licensed under the Databricks License. See [LICENSE](LICENSE) for the full text.