{"id":35777200,"url":"https://github.com/databendlabs/databend-loki-adapter","last_synced_at":"2026-01-07T05:17:19.890Z","repository":{"id":329262635,"uuid":"1117321115","full_name":"databendlabs/databend-loki-adapter","owner":"databendlabs","description":"Loki-compatible HTTP API which convert LogQL queries to Databend SQL, and return Loki-formatted JSON responses.","archived":false,"fork":false,"pushed_at":"2025-12-29T05:19:42.000Z","size":191,"stargazers_count":2,"open_issues_count":0,"forks_count":1,"subscribers_count":1,"default_branch":"main","last_synced_at":"2026-01-01T00:57:28.809Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Rust","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/databendlabs.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2025-12-16T06:35:19.000Z","updated_at":"2025-12-29T05:19:46.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/databendlabs/databend-loki-adapter","commit_stats":null,"previous_names":["databendlabs/databend-loki-adapter"],"tags_count":1,"template":false,"template_full_name":null,"purl":"pkg:github/databendlabs/databend-loki-adapter","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/databendlabs%2Fdatabend-loki-adapter","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/databendlabs%2Fdatabend-loki-adapter/tags","releases_url":"https://repos.ecosyste.ms/api/v1/host
s/GitHub/repositories/databendlabs%2Fdatabend-loki-adapter/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/databendlabs%2Fdatabend-loki-adapter/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/databendlabs","download_url":"https://codeload.github.com/databendlabs/databend-loki-adapter/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/databendlabs%2Fdatabend-loki-adapter/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":28232804,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2026-01-07T02:00:05.975Z","response_time":58,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2026-01-07T05:17:12.787Z","updated_at":"2026-01-07T05:17:19.877Z","avatar_url":"https://github.com/databendlabs.png","language":"Rust","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Databend Loki Adapter\n\nDatabend Loki Adapter exposes a minimal Loki-compatible HTTP API. 
It parses LogQL queries from Grafana, converts them to Databend SQL, runs the statements, and returns Loki-formatted JSON responses.\n\n## Getting Started\n\n```bash\nexport DATABEND_DSN=\"databend://user:pass@host:port/default\"\ndatabend-loki-adapter --table nginx_logs --schema-type flat\n```\n\nThe adapter listens on `--bind` (default `0.0.0.0:3100`) and exposes a minimal subset of the Loki HTTP surface area.\n\n## Configuration\n\n| Flag                   | Env                  | Default                 | Description                                                                |\n| ---------------------- | -------------------- | ----------------------- | -------------------------------------------------------------------------- |\n| `--mode`               | `ADAPTER_MODE`       | `standalone`            | `standalone` uses a fixed DSN/table, `proxy` pulls both from HTTP headers. |\n| `--dsn`                | `DATABEND_DSN`       | _required_              | Databend DSN with credentials (`proxy` mode expects it via header).        |\n| `--table`              | `LOGS_TABLE`         | `logs`                  | Target table. Use `db.table` or rely on the DSN default database.          |\n| `--bind`               | `BIND_ADDR`          | `0.0.0.0:3100`          | HTTP bind address.                                                         |\n| `--schema-type`        | `SCHEMA_TYPE`        | `loki`                  | `loki` (labels as VARIANT) or `flat` (wide table).                         |\n| `--timestamp-column`   | `TIMESTAMP_COLUMN`   | auto-detect             | Override the timestamp column name.                                        |\n| `--line-column`        | `LINE_COLUMN`        | auto-detect             | Override the log line column name.                                         |\n| `--labels-column`      | `LABELS_COLUMN`      | auto-detect (loki only) | Override the labels column name.                                           
|\n| `--max-metric-buckets` | `MAX_METRIC_BUCKETS` | `240`                   | Maximum bucket count per metric range query before clamping `step`.        |\n\n## Run Modes\n\n`databend-loki-adapter` runs in two modes:\n\n- **standalone** (default): supply `--dsn`, `--table`, and schema overrides on the CLI/env. The adapter resolves the table once at startup and caches the schema for the entire process lifetime.\n- **proxy**: launch the server with `--mode proxy` and omit `--dsn`. Each HTTP request must pass the Databend DSN and schema definition in headers so multiple tenants/tables can share one adapter instance.\n\n### Proxy headers\n\nProxy mode expects two headers on every request:\n\n| Header              | Purpose                                                                                          |\n| ------------------- | ------------------------------------------------------------------------------------------------ |\n| `X-Databend-Dsn`    | Databend DSN, e.g. `databend://user:pass@host:8000/default`.                                     |\n| `X-Databend-Schema` | JSON document that tells the adapter which table/columns to treat as Loki timestamp/labels/line. |\n\n`X-Databend-Schema` accepts the fields listed below. `schema_type` must be `loki` or `flat`. All column names are case-insensitive and must match the physical Databend table. 
Set `table` to either `db.table` or just the table name (the latter uses the DSN's default database).\n\n###### Loki schema example\n\n```json\n{\n  \"table\": \"default.logs\",\n  \"schema_type\": \"loki\",\n  \"timestamp_column\": \"timestamp\",\n  \"line_column\": \"line\",\n  \"labels_column\": \"labels\"\n}\n```\n\n###### Flat schema example\n\n```json\n{\n  \"table\": \"analytics.nginx_logs\",\n  \"schema_type\": \"flat\",\n  \"timestamp_column\": \"timestamp\",\n  \"line_column\": \"request\",\n  \"label_columns\": [\n    { \"name\": \"host\" },\n    { \"name\": \"status\", \"numeric\": true },\n    { \"name\": \"client\" }\n  ]\n}\n```\n\n`label_columns` is required for `flat` schemas and each entry can mark `numeric: true` to treat the label as a numeric column for selectors.\n\n## Schema Support\n\nThe adapter inspects the table via `system.columns` during startup and then maps the physical layout into Loki's timestamp/line/label model. Two schema styles are supported. The SQL snippets below are reference starting points rather than strict requirements -- feel free to rename columns, tweak indexes, or add computed fields as long as the final table exposes the required timestamp/line/label information. Use the CLI overrides (`--timestamp-column`, `--line-column`, `--labels-column`) if your column names differ.\n\n### Loki schema\n\nUse this schema when you already store labels as a serialized structure (VARIANT/MAP) alongside the log body. The adapter expects a timestamp column, a VARIANT/MAP column containing a JSON object of labels, and a string column for the log line or payload. Additional helper columns (hashes, shards, etc.) 
are ignored.\n\nRecommended layout (adjust column names, clustering keys, and indexes to match your workload):\n\n```sql\nCREATE TABLE logs (\n  `timestamp` TIMESTAMP NOT NULL,\n  `labels` VARIANT NOT NULL,\n  `line` STRING NOT NULL,\n  `stream_hash` UInt64 NOT NULL AS (city64withseed(labels, 0)) STORED\n) CLUSTER BY (to_start_of_hour(timestamp), stream_hash);\n\nCREATE INVERTED INDEX logs_line_idx ON logs(line);\n```\n\n- `timestamp`: log event timestamp.\n- `labels`: VARIANT/MAP storing serialized Loki labels.\n- `line`: raw log line.\n- `stream_hash`: computed hash of the label set; useful for clustering or fast equality filters on a stream.\n- `CREATE INVERTED INDEX`: defined separately as required by Databend's inverted-index syntax.\n\nExtra optimizations (optional but recommended, mix and match as needed):\n\n### Flat schema\n\nUse this schema when logs arrive in a wide table where each attribute is already a separate column. The adapter chooses the timestamp column, picks one string column for the log line (either auto-detected or provided via `--line-column`), and treats every other column as a label. 
The examples below illustrate common shapes; substitute your own column names and indexes.\n\n```sql\nCREATE TABLE nginx_logs (\n  `agent` STRING,\n  `client` STRING,\n  `host` STRING,\n  `path` STRING,\n  `protocol` STRING,\n  `refer` STRING,\n  `request` STRING,\n  `size` INT,\n  `status` INT,\n  `timestamp` TIMESTAMP NOT NULL\n) CLUSTER BY (to_start_of_hour(timestamp), host, status);\n```\n\n```sql\nCREATE TABLE kubernetes_logs (\n  `message` STRING,\n  `log_time` TIMESTAMP NOT NULL,\n  `pod_name` STRING,\n  `pod_namespace` STRING,\n  `cluster_name` STRING\n) CLUSTER BY (to_start_of_hour(log_time), cluster_name, pod_namespace, pod_name);\n\nCREATE INVERTED INDEX k8s_message_idx ON kubernetes_logs(message);\n```\n\nGuidelines:\n\n- If the table does not have an obvious log-line column, pass `--line-column` (e.g., `--line-column request` for `nginx_logs`, or `--line-column message` for `kubernetes_logs`). The column may be nullable; the adapter will emit empty strings when needed.\n- Every other column automatically becomes a LogQL label. These columns hold the actual metadata you want to query (`client`, `host`, `status`, `pod_name`, `pod_namespace`, `cluster_name`, etc.). Use Databend's SQL to rename or cast fields if you need canonical label names.\n\n  ```sql\n  CREATE INVERTED INDEX nginx_request_idx ON nginx_logs(request);\n  CREATE INVERTED INDEX k8s_message_idx ON kubernetes_logs(message);\n  ```\n\n## Index guidance\n\nDatabend only requires manual management for inverted indexes. See the official docs for [inverted indexes](https://docs.databend.com/sql/sql-commands/ddl/inverted-index/) and the dedicated [`CREATE INVERTED INDEX`](https://docs.databend.com/sql/sql-commands/ddl/inverted-index/create-inverted-index) and [`REFRESH INVERTED INDEX`](https://docs.databend.com/sql/sql-commands/ddl/inverted-index/refresh-inverted-index) statements. 
Bloom filter style pruning for MAP/VARIANT columns is built in, so you do not need to create standalone bloom filter or minmax indexes. Remember to refresh a newly created inverted index so historical data becomes searchable, e.g.:\n\n```sql\nREFRESH INVERTED INDEX logs_line_idx ON logs;\n```\n\n## Metadata lookup\n\nThe adapter validates table shape with:\n\n```sql\nSELECT name, data_type\nFROM system.columns\nWHERE database = '\u003cdatabase\u003e'\n  AND table = '\u003ctable\u003e'\nORDER BY name;\n```\n\nEnsure the table matches one of the schemas above (including indexes) so Grafana can issue LogQL queries directly against Databend through this adapter.\n\n## HTTP API\n\nAll endpoints return Loki-compatible JSON responses and reuse the same error shape that Loki expects (`status:error`, `errorType`, `error`). Grafana can therefore talk to the adapter using the stock Loki data source without any proxy layers or plugins. Refer to the upstream [Loki HTTP API reference](https://grafana.com/docs/loki/latest/reference/loki-http-api/) for the detailed contract of each endpoint.\n\n| Endpoint                                | Description                                                                                                                                                                                                                                                                                                                                                                                 |\n| --------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |\n| `GET /loki/api/v1/query`                | 
Instant query. Supports the same LogQL used by Grafana's Explore panel. An optional `time` parameter (nanoseconds) defaults to \"now\", and the adapter automatically looks back 5 minutes when computing SQL bounds.                                                                                                                                                                         |\n| `GET /loki/api/v1/query_range`          | Range query. Accepts `start`/`end` (default past hour), `since` (relative duration), `limit`, `interval`, `step`, and `direction`. Log queries stream raw lines (`interval` down-samples entries, `direction` controls scan order); metric queries return Loki matrix results and require a `step` value (the adapter may clamp it to keep bucket counts bounded, default cap 240 buckets). |\n| `GET /loki/api/v1/labels`               | Lists known label keys for the selected schema. Optional `start`/`end` parameters (nanoseconds) fence the search window; unspecified values default to the last 5 minutes, matching Grafana's Explore defaults.                                                                                                                                                                             |\n| `GET /loki/api/v1/label/{label}/values` | Lists distinct values for a specific label key using the same optional `start`/`end` bounds as `/labels`. Works for both `loki` and `flat` schemas and automatically filters out empty strings.                                                                                                                                                                                             |\n| `GET /loki/api/v1/index/stats`          | Returns approximate `streams`, `chunks`, `entries`, and `bytes` counters for a selector over a `[start, end]` window. `chunks` are estimated via unique stream keys because Databend does not store Loki chunks.                                                                                  
                                                                                          |\n| `GET /loki/api/v1/tail`                 | WebSocket tail endpoint that streams live logs for a LogQL query; compatible with Grafana Explore and `logcli --tail`.                                                                                                                                                                                                                                                                      |\n\n`/query` and `/query_range` share the same LogQL parser and SQL builder. Instant queries fall back to `DEFAULT_LOOKBACK_NS` (5 minutes) when no explicit window is supplied, while range queries default to `[now - 1h, now]` and also honor Loki's `since` helper to derive `start`. `/loki/api/v1/query_range` log queries fully implement Loki's `direction` (`forward`/`backward`) and `interval` parameters: the adapter scans in the requested direction, emits entries in that order, and down-samples each stream so successive log lines are at least `interval` apart starting from `start`. `/labels` and `/label/{label}/values` delegate to schema-aware metadata lookups: the loki schema uses `map_keys`/`labels['key']` expressions, whereas the flat schema issues `SELECT DISTINCT` on the physical column and returns values in sorted order.\n\n### Tail streaming\n\n`/loki/api/v1/tail` upgrades to a WebSocket connection and sends frames that match Loki's native shape (`{\"streams\":[...],\"dropped_entries\":[]}`). 
Supported query parameters:\n\n- `query`: required LogQL selector.\n- `limit`: max number of entries per batch (default 100, still subject to the global `MAX_LIMIT`).\n- `start`: initial cursor in nanoseconds, defaults to \"one hour ago\".\n- `delay_for`: optional delay (seconds) that lets slow writers catch up; defaults to `0` and cannot exceed `5`.\n\nThe adapter keeps a cursor and duplicate fingerprints so new rows are streamed in chronological order without repeats. Grafana Explore, `logcli --tail`, or any WebSocket client can connect directly.\n\n### Metric queries\n\nThe adapter currently supports a narrow LogQL metric surface area:\n\n- Range functions: `count_over_time` and `rate`. The latter reports per-second values (`COUNT / window_seconds`).\n- Optional outer aggregations: `sum`, `avg`, `min`, `max`, `count`, each with `by (...)`. `without` or other modifiers return `errorType:bad_data`.\n- Pipelines: only `drop` stages are honored (labels are removed after aggregation to match Loki semantics). Any other stage still results in `errorType:bad_data`.\n- `/loki/api/v1/query_range` metric calls must provide `step`. When the requested `(end - start) / step` would exceed the configured bucket cap (default 240, tweak via `--max-metric-buckets`), the adapter automatically increases the effective step to keep the SQL result size manageable; the adapter never fans out multiple queries or aggregates in memory.\n- `/loki/api/v1/query` metric calls reuse the same expressions but evaluate them over `[time - range, time]`.\n\nBoth schema adapters (loki VARIANT labels and flat wide tables) translate the metric expression into one SQL statement that joins generated buckets with the raw rows via `generate_series`, so all aggregation happens inside Databend. 
Non-metric queries continue to stream raw logs.\n\n## LogQL template functions\n\n`line_format` and `label_format` now ship with a lightweight template engine that supports field interpolation (`{{ .message }}`) plus the full set of [Grafana Loki template string functions](https://grafana.com/docs/loki/latest/query/template_functions/). Supported functions are listed below:\n\n| Function                                                          | Status | Notes                                                                                   |\n| ----------------------------------------------------------------- | ------ | --------------------------------------------------------------------------------------- |\n| `__line__`, `__timestamp__`, `now`                                | ✅     | Expose the raw line, the row timestamp, and the adapter host's current time.            |\n| `date`, `toDate`, `toDateInZone`                                  | ✅     | Go-style datetime formatting and parsing (supports IANA zones).                         |\n| `duration`, `duration_seconds`                                    | ✅     | Parse Go duration strings into seconds (positive/negative).                             |\n| `unixEpoch`, `unixEpochMillis`, `unixEpochNanos`, `unixToTime`    | ✅     | Unix timestamp helpers.                                                                 |\n| `alignLeft`, `alignRight`                                         | ✅     | Align field contents to a fixed width.                                                  |\n| `b64enc`, `b64dec`                                                | ✅     | Base64 encode/decode a field or literal.                                                |\n| `bytes`                                                           | ✅     | Parses human-readable byte strings (e.g. `2 KB` → `2000`).                              
|\n| `default`                                                         | ✅     | Provides a fallback when a field is empty or missing.                                   |\n| `fromJson`                                                        | ⚠️     | Validates and normalizes JSON strings (advanced loops like `range` remain unsupported). |\n| `indent`, `nindent`                                               | ✅     | Indent multi-line strings.                                                              |\n| `lower`, `upper`, `title`                                         | ✅     | Case conversion helpers.                                                                |\n| `repeat`                                                          | ✅     | String repetition helper.                                                               |\n| `printf`                                                          | ✅     | Supports `%s`, `%d`, `%f`, width/precision flags.                                       |\n| `replace`, `substr`, `trunc`                                      | ✅     | String replacement, slicing, and truncation.                                            |\n| `trim`, `trimAll`, `trimPrefix`, `trimSuffix`                     | ✅     | String trimming helpers.                                                                |\n| `urlencode`, `urldecode`                                          | ✅     | URL encoding/decoding.                                                                  |\n| `contains`, `eq`, `hasPrefix`, `hasSuffix`                        | ✅     | Logical helpers for comparisons.                                                        |\n| `int`, `float64`                                                  | ✅     | Cast values to integers/floats.                                                         |\n| `add`, `addf`, `sub`, `subf`, `mul`, `mulf`, `div`, `divf`, `mod` | ✅     | Integer and floating-point arithmetic.                     
                             |\n| `ceil`, `floor`, `round`                                          | ✅     | Floating-point rounding helpers.                                                        |\n| `max`, `min`, `maxf`, `minf`                                      | ✅     | Extremum helpers for integers/floats.                                                   |\n| `count`                                                           | ✅     | Count regex matches (`{{ __line__ \\| count \"foo\" }}`).                                  |\n| `regexReplaceAll`, `regexReplaceAllLiteral`                       | ✅     | Regex replacement helpers (literal and capture-aware).                                  |\n\n`fromJson` currently only validates and re-serializes JSON strings because the template engine has no looping constructs yet. For advanced constructs (e.g., `range`), preprocess data upstream or continue to rely on Grafana/Loki-native features until control flow support arrives.\n\n## Logging\n\nBy default the adapter configures `env_logger` with `databend_loki_adapter` at `info` level and every other module at `warn`. This keeps the startup flow visible without flooding the console with dependency logs. To override the levels, set `RUST_LOG` just like any other `env_logger` application, e.g.:\n\n```bash\nexport RUST_LOG=databend_loki_adapter=debug,databend_driver=info\n```\n\n## Testing\n\nRun the Rust test suite with `cargo nextest run`.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdatabendlabs%2Fdatabend-loki-adapter","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdatabendlabs%2Fdatabend-loki-adapter","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdatabendlabs%2Fdatabend-loki-adapter/lists"}