https://github.com/lynxbase/lynxdb
Lightweight schema-on-read analytics in a single binary
- Host: GitHub
- URL: https://github.com/lynxbase/lynxdb
- Owner: lynxbase
- License: apache-2.0
- Created: 2026-03-03T12:48:10.000Z (about 2 months ago)
- Default Branch: main
- Last Pushed: 2026-04-19T15:23:39.000Z (8 days ago)
- Last Synced: 2026-04-19T17:15:15.646Z (8 days ago)
- Topics: analytics, database, devops, golang, logging, splunk
- Language: Go
- Homepage: https://docs.lynxdb.org/
- Size: 16.2 MB
- Stars: 13
- Watchers: 1
- Forks: 1
- Open Issues: 3
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- Contributing: CONTRIBUTING.md
- License: LICENSE
- Codeowners: .github/CODEOWNERS
# LynxDB
Log analytics in a single binary. No dependencies. Lynx Flow query language.
> LynxDB is in active development and **not yet production-ready**. APIs, storage format, and query behavior may change without notice between releases. Feedback and contributions are welcome.
## Lynx Flow
Lynx Flow is LynxDB's query language - a pipeline language where data flows left-to-right through commands separated by `|`. Commands are named for what they do: `parse`, `let`, `where`, `group`, `order by`, `take`.
```
from nginx
| parse combined(_raw)
| status >= 500
| group by uri compute count() as hits, avg(duration_ms) as latency
| order by hits desc
| take 10
```
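To make the data flow concrete, here is a minimal Go sketch of what the pipeline above computes once the log lines have been parsed: filter, group, order, take. It is only an in-memory illustration, not how LynxDB executes queries. The `Event` and `Row` types and the `TopErrorURIs` function are hypothetical names introduced for this example.

```go
package main

import (
	"fmt"
	"sort"
)

// Event is a hypothetical, already-parsed access-log record
// (the stage after `parse combined(_raw)`).
type Event struct {
	URI        string
	Status     int
	DurationMs float64
}

// Row is one output row of the aggregation.
type Row struct {
	URI     string
	Hits    int
	Latency float64
}

// TopErrorURIs mirrors the remaining pipeline stages in order.
func TopErrorURIs(events []Event, n int) []Row {
	type acc struct {
		hits int
		sum  float64
	}
	groups := map[string]*acc{}
	for _, e := range events {
		if e.Status < 500 { // filter: keep only status >= 500
			continue
		}
		a := groups[e.URI] // group by uri
		if a == nil {
			a = &acc{}
			groups[e.URI] = a
		}
		a.hits++
		a.sum += e.DurationMs
	}

	rows := make([]Row, 0, len(groups))
	for uri, a := range groups {
		// compute count() as hits, avg(duration_ms) as latency
		rows = append(rows, Row{URI: uri, Hits: a.hits, Latency: a.sum / float64(a.hits)})
	}

	// order by hits desc; ties are broken by URI so the output is deterministic.
	sort.Slice(rows, func(i, j int) bool {
		if rows[i].Hits != rows[j].Hits {
			return rows[i].Hits > rows[j].Hits
		}
		return rows[i].URI < rows[j].URI
	})

	if len(rows) > n { // take n
		rows = rows[:n]
	}
	return rows
}

func main() {
	events := []Event{
		{"/api/a", 500, 120},
		{"/api/a", 502, 80},
		{"/api/b", 200, 10},
		{"/api/b", 503, 40},
	}
	for _, r := range TopErrorURIs(events, 10) {
		fmt.Printf("%s hits=%d latency=%.1f\n", r.URI, r.Hits, r.Latency)
	}
}
```

Each stage consumes the output of the previous one, which is why the order of commands in a Lynx Flow query matters: filtering before grouping keeps the aggregation small.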
## Quick start
```bash
curl -fsSL https://lynxdb.org/install.sh | sh
```
Pipe logs through lynxdb - no server, no config:
```bash
# From raw logs to p99 latency in one line
kubectl logs deploy/api | lynxdb query '
| group by endpoint compute avg(duration_ms), perc99(duration_ms)'
# Three nested formats, one pipeline, zero config
docker logs api-server 2>&1 | lynxdb query '
| parse docker(_raw)
| parse json(message)
| explode errors
| group by errors.code, errors.service compute count() as cnt
| order by cnt desc | take 10'
# Wildcard array extraction - like jq, but with aggregation:
cat orders.json | lynxdb query '
| json items[*].price AS price, items[*].product AS product
| explode product, price
| let revenue = price * qty
| group by product compute sum(revenue) as total_revenue
| order by total_revenue desc'
```
Or run as a persistent server:
```bash
lynxdb server
lynxdb ingest nginx_access.log --source nginx_access --index balancer --batch-size 100000
lynxdb query '
| parse combined(_raw)
| method="POST" AND status < 300
| parse json(request_body)
| json items[*].sku AS skus
| explode skus
| group by skus compute count() as purchases, dc(client_ip) as unique_buyers
| order by purchases desc
| take 20'
```
Generate sample data and explore:
```bash
# Start the demo (streams realistic logs from 4 sources at 200 events/sec)
lynxdb demo
# Try in another terminal:
lynxdb query 'from nginx | group by status compute count()'
lynxdb query '| level="ERROR" | group by host compute count()' --since 5m
lynxdb tail 'level=ERROR'
```
## Features
- **Pipe mode** - reads from stdin or files, works like `grep`. No server, no config.
- **Lynx Flow** - `group`, `let`, `parse`, `order by`, `join`, CTEs, domain sugar, and [more](https://docs.lynxdb.org/docs/lynx-flow/overview). Partial SPL2 compatibility.
- **Full-text search** - FST inverted index + roaring bitmaps, bloom filters for segment skipping
- **Columnar storage** - custom `.lsg` format, delta-varint timestamps, dictionary encoding, Gorilla XOR, LZ4
- **Materialized views** - precomputed aggregations with automatic query rewrite, up to ~400x speedup
- **Cluster mode** - add `--cluster.seeds` to go distributed; S3-backed shared storage
- **Drop-in ingestion** - Elasticsearch `_bulk`, OpenTelemetry OTLP, Splunk HEC
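
As a rough illustration of the "delta-varint timestamps" idea from the list above, the Go sketch below stores each timestamp as the difference from its predecessor, encoded as a variable-length integer. This shows the general technique only; the actual `.lsg` on-disk layout is not described here, and `EncodeTimestamps` and `DecodeTimestamps` are hypothetical names.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// EncodeTimestamps writes the first timestamp, then each successive
// difference, as signed varints. Closely spaced timestamps produce
// small deltas, which fit in one or two bytes each.
func EncodeTimestamps(ts []int64) []byte {
	out := make([]byte, 0, len(ts)*2)
	tmp := make([]byte, binary.MaxVarintLen64)
	var prev int64
	for _, t := range ts {
		n := binary.PutVarint(tmp, t-prev)
		out = append(out, tmp[:n]...)
		prev = t
	}
	return out
}

// DecodeTimestamps reverses EncodeTimestamps by accumulating the deltas.
func DecodeTimestamps(buf []byte) ([]int64, error) {
	var out []int64
	var prev int64
	for len(buf) > 0 {
		d, n := binary.Varint(buf)
		if n <= 0 {
			return nil, errors.New("malformed varint")
		}
		prev += d
		out = append(out, prev)
		buf = buf[n:]
	}
	return out, nil
}

func main() {
	// Millisecond timestamps a few milliseconds apart (illustrative values).
	ts := []int64{1700000000000, 1700000000005, 1700000000009, 1700000000020}
	enc := EncodeTimestamps(ts)
	dec, err := DecodeTimestamps(enc)
	if err != nil {
		panic(err)
	}
	fmt.Printf("raw: %d bytes, encoded: %d bytes, decoded: %v\n", 8*len(ts), len(enc), dec)
}
```

Signed varints are used so that out-of-order timestamps (negative deltas) still round-trip correctly.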
## Comparison
| | lynxdb | Splunk | Elasticsearch | Loki |
|-------------------|------------------|---------------|---------------|--------------------|
| Deployment | Single binary | Standalone | Cluster | Single binary |
| Dependencies | None | -- | JVM | Object storage |
| Query language | Lynx Flow / SPL2 | SPL | Lucene/ES\|QL | LogQL |
| Pipe mode | Yes | -- | -- | -- |
| Full-text index | FST + bitmaps | tsidx | Lucene | Label index only |
| Memory (idle) | ~50 MB | ~12 GB | ~1 GB+ | ~256 MB |
| License | Apache 2.0 | Commercial | ELv2 / AGPL | AGPL |
## Configuration
Zero config needed - sensible defaults for everything. Customize in `~/.config/lynxdb/config.yaml`:
```yaml
listen: "0.0.0.0:3100"
data_dir: "/data/lynxdb"
retention: 30d
storage:
  compression: lz4
  cache_max_bytes: 4gb
```
Settings are resolved in this order of precedence, highest first: CLI flags -> `LYNXDB_*` env vars -> config file -> defaults.
Full configuration reference:
```yaml
listen: "0.0.0.0:3100"
data_dir: "/data/lynxdb"
retention: 30d
storage:
  compression: lz4 # lz4 | zstd
  flush_threshold: 512mb
  cache_max_bytes: 4gb
  s3_bucket: my-logs-bucket
  s3_region: us-east-1
query:
  max_concurrent: 20
  max_query_runtime: 10m
```
## CLI reference
```
lynxdb server          start server
lynxdb query           run a query (Lynx Flow or SPL2)
lynxdb tail            live tail
lynxdb ingest          ingest a file
lynxdb shell           interactive REPL with completion
lynxdb count           quick event count
lynxdb sample N        peek at data shape
lynxdb watch -i 5s     periodic refresh with deltas
lynxdb diff -p 1h      this period vs previous period
lynxdb explain         query plan without executing
lynxdb mv              create/list materialized views
lynxdb status          server metrics
lynxdb bench           benchmark
lynxdb demo            generate sample data
```
## Build from source
```bash
git clone https://github.com/lynxbase/lynxdb && cd lynxdb
go build -o lynxdb ./cmd/lynxdb/
go test ./...
```
## Contributing
See [CONTRIBUTING.md](CONTRIBUTING.md).
## Feedback
- [Issues](https://github.com/lynxbase/lynxdb/issues)
---
LynxDB wouldn't exist without the projects that inspired it:
- **[Splunk](https://www.splunk.com/)** - for creating SPL, the most expressive log query language. LynxDB's SPL2 compatibility and Lynx Flow design owe everything to Splunk's query model.
- **[ClickHouse](https://clickhouse.com/)** - for proving that a single-binary analytical database with incredible performance is possible. The MergeTree architecture deeply influenced LynxDB's storage engine design.
- **[VictoriaLogs](https://docs.victoriametrics.com/victorialogs/)** - for showing that log analytics can be resource-efficient and operationally simple.
- **`grep`, `awk`, `sed`** - for the Unix philosophy of composable tools and piping. LynxDB's pipe mode is a direct homage to this tradition.
This project started in early 2025 out of a deep appreciation for these tools and a desire to bring Splunk-level analytics to everyone in a single, lightweight binary.
## License
[Apache 2.0](LICENSE)