{"id":49112992,"url":"https://github.com/false-systems/jalki","last_synced_at":"2026-04-21T05:37:49.656Z","repository":{"id":350921461,"uuid":"1208177838","full_name":"false-systems/jalki","owner":"false-systems","description":"Programmable eBPF fentry/fexit tracing framework for Linux. Hook any kernel function with one Rust trait — structured JSON events out. TCP connects, retransmits, closes, and any   function you define.","archived":false,"fork":false,"pushed_at":"2026-04-12T19:31:38.000Z","size":43114,"stargazers_count":1,"open_issues_count":1,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-04-12T20:31:02.153Z","etag":null,"topics":["aya","ebpf","fentry","fexit","kubernetes","linux-kernel","rust"],"latest_commit_sha":null,"homepage":"","language":"Rust","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/false-systems.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-04-11T23:30:00.000Z","updated_at":"2026-04-12T17:17:09.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/false-systems/jalki","commit_stats":null,"previous_names":["false-systems/jalki"],"tags_count":null,"template":false,"template_full_name":null,"purl":"pkg:github/false-systems/jalki","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/false-systems%2Fjalki","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/false-systems%2Fjalki/tags","releases_url
":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/false-systems%2Fjalki/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/false-systems%2Fjalki/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/false-systems","download_url":"https://codeload.github.com/false-systems/jalki/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/false-systems%2Fjalki/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":32079470,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-21T02:38:07.213Z","status":"ssl_error","status_checked_at":"2026-04-21T02:38:06.559Z","response_time":128,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["aya","ebpf","fentry","fexit","kubernetes","linux-kernel","rust"],"created_at":"2026-04-21T05:37:49.085Z","updated_at":"2026-04-21T05:37:49.647Z","avatar_url":"https://github.com/false-systems.png","language":"Rust","funding_links":[],"categories":[],"sub_categories":[],"readme":"# jälki\n\n**The kernel knows what's wrong. jälki lets you ask it.**\n\n---\n\nToday, asking the Linux kernel \"why is this connection slow?\" requires eBPF expertise that maybe a few hundred people in the world have. You need to know BTF, aya, ring buffers, CO-RE, the BPF verifier, kernel struct offsets, and how to interpret raw tracing data. 
It's a week of work before you see a single structured event.\n\njälki removes that barrier. You ask a question. jälki hooks the right kernel function, collects the events, and interprets them:\n\n```\n❯ jalki ask \"why is postgres slow\"\n\nProbes selected:\n  tcp_connect (fexit/kernel.tcp.connect)\n  tcp_retransmit_skb (fentry/kernel.tcp.retransmit)\n  attached tcp_connect → probe_001\n  attached tcp_retransmit_skb → probe_002\nCollecting events for 5s...\nCollected 47 events. Interpreting...\n\n# Question: why is postgres slow\n\n## Events observed (47 total in 5s)\n  jalki/tcp_connect: 12 events\n  jalki/tcp_retransmit: 35 events\n\n## Interpretation\n\n**tcp_retransmit_skb** (warning)\n\n  packets are being lost on an active connection — network congestion,\n  switch issue, or physical layer problem between nodes\n\n  Action: check network path between 10.42.1.15 and 10.42.2.8.\n  this is a network problem, not application.\n```\n\nThe kernel knew the answer all along. 35 retransmits in ESTABLISHED state on the path to Postgres. Network problem, not application. jälki just made that knowledge accessible.\n\n---\n\n## Why this matters\n\n**For humans**: Network debugging is dark magic. When connections are slow, you guess. You restart things. You blame the application. The kernel has the answer — retransmit counts, TCP states, connection errnos — but that data is locked behind eBPF expertise. jälki unlocks it with a single command.\n\n**For AI agents**: An agent debugging a production issue can now ask the kernel directly. No human eBPF expertise in the loop. The agent identifies the right kernel function, deploys a probe, reads structured events, and reasons about root cause. This is the foundation for autonomous infrastructure debugging.\n\n**For the eBPF ecosystem**: Writing a new fentry/fexit probe is one Rust trait. jälki handles BTF loading, program attachment, ring buffer management, self-filtering, sampling, serialization, and emission. 
The framework does the hard parts so you can focus on what to observe and how to interpret it.\n\n---\n\n## How it works\n\n```\n                    kernel space\n   ┌────────────────────────────────────────────────┐\n   │  tcp_connect()      → fexit  → eBPF program ──┐│\n   │  tcp_close()        → fexit  → eBPF program   ││\n   │  tcp_retransmit_skb → fentry → eBPF program   ││\n   │                                                ││\n   │  PID_FILTER: skip jälki's own syscalls         ││\n   │  per-probe ring buffers (4MB each) ◄───────────┘│\n   └────────────────────┬───────────────────────────┘\n                        │\n                    userspace\n   ┌────────────────────▼───────────────────────────┐\n   │  jälki daemon                                   │\n   │                                                 │\n   │  loader     → attach probes via BTF metadata    │\n   │  reader     → drain ring buffers → EventStore   │\n   │  probes     → raw bytes → FALSE Protocol JSON   │\n   │  emitters   → stdout / file / gRPC              │\n   │  IPC server → /run/jalki/jalki.sock             │\n   │  metrics    → Prometheus :9090                  │\n   └────────────────────┬───────────────────────────┘\n                        │\n   ┌────────────────────▼───────────────────────────┐\n   │  CLI / MCP / agents                             │\n   │                                                 │\n   │  jalki ask   → question → probes → interpret    │\n   │  jalki watch → collect events from one probe    │\n   │  jalki-mcp   → AI agent tool interface          │\n   └────────────────────────────────────────────────┘\n```\n\n**fentry/fexit** — BPF trampolines, not kprobes. Near-zero overhead. Safe for production 24/7.\n\n**CO-RE** — Compile Once, Run Everywhere. One binary, any kernel 5.5+ with BTF.\n\n**Self-filter** — jälki's own PID is excluded in kernel space. 
No feedback loops.\n\n---\n\n## Quick start\n\n```bash\n# Build\ncargo run -p xtask -- build-ebpf --release\ncargo build --release -p jalki\n\n# Terminal 1: start the daemon (needs root for eBPF)\nsudo ./target/release/jalki --emit stdout --cluster dev\n\n# Terminal 2: ask a question\n./target/release/jalki ask \"why are connections failing\"\n\n# Or explore\n./target/release/jalki list --layer tcp\n./target/release/jalki status\n./target/release/jalki watch tcp_connect --seconds 10\n./target/release/jalki stream tcp_retransmit_skb\n```\n\n`jalki ask` works without a daemon too — it falls back to a knowledge base analysis showing which probes to deploy and what to look for.\n\n---\n\n## CLI\n\n| Command | What it does |\n|---------|-------------|\n| `jalki` (no subcommand) | Daemon mode — load eBPF, attach probes, emit events |\n| `jalki ask \"question\"` | KB search → auto-deploy → collect → interpret → answer |\n| `jalki watch \u003cfunction\u003e` | Deploy probe, collect for N seconds, print events |\n| `jalki stream [function]` | Live ndjson event stream |\n| `jalki list [--layer tcp]` | Browse the knowledge base |\n| `jalki status` | Show attached probes, event counts, drops |\n\n---\n\n## What you get\n\nEvery kernel function you care about becomes a structured event:\n\n```json\n{\n  \"source\": \"jalki/tcp_retransmit\",\n  \"type\": \"kernel.tcp.retransmit\",\n  \"severity\": \"warning\",\n  \"correlation_keys\": [\"10.42.1.15:48210-\u003e10.42.2.8:5432\"],\n  \"network_data\": {\n    \"src_ip\": \"10.42.1.15\",\n    \"dst_ip\": \"10.42.2.8\",\n    \"dst_port\": 5432,\n    \"protocol\": \"tcp\"\n  },\n  \"process_data\": {\n    \"pid\": 1847,\n    \"command\": \"api-server\"\n  }\n}\n```\n\nYour API server is retransmitting to Postgres. The kernel knows this. 
Now you know it too.\n\n---\n\n## The knowledge base\n\njälki ships a built-in knowledge base of kernel functions — which function to hook for a given question, what fields matter, and how to interpret the events.\n\nThe TCP state field on `tcp_retransmit_skb` is the most important signal:\n\n| State | Value | What it means |\n|-------|-------|---------------|\n| SYN_SENT | 2 | Handshake failing — remote unreachable, firewall, host down |\n| ESTABLISHED | 1 | Active connection losing packets — network congestion |\n| CLOSE_WAIT | 8 | Application hung, not reading from socket |\n\n**SYN_SENT retransmit = not an application problem.** The connection never established.\n\n**ESTABLISHED retransmit = network problem, not application.** The packets are being lost in transit.\n\nDifferent problems, different fixes. The kernel knows which one it is.\n\n---\n\n## The framework\n\njälki is a framework, not just a tool. The three TCP probes are batteries-included. Adding your own probe is one trait:\n\n```rust\nimpl Probe for MyProbe {\n    fn name(\u0026self) -\u003e \u0026str { \"my_probe\" }\n    fn program_name(\u0026self) -\u003e \u0026str { \"jalki_my_probe\" }\n    fn attachments(\u0026self) -\u003e \u0026[Attachment] {\n        \u0026[Attachment::Fentry { function: \"some_kernel_function\" }]\n    }\n    fn ring_buffer_map(\u0026self) -\u003e \u0026str { \"MY_EVENTS\" }\n    fn to_occurrence(\u0026self, raw: \u0026[u8], cluster: \u0026str) -\u003e Result\u003cOccurrence, ProbeError\u003e {\n        // convert raw ring buffer bytes to a FALSE Protocol Occurrence\n    }\n}\n```\n\njälki handles eBPF loading, BTF attachment, ring buffer management, self-filtering, sampling, batching, and emission. You describe what to observe and how to interpret it. 
The framework does the rest.\n\n---\n\n## MCP server\n\n`jalki-mcp` exposes kernel observability to AI agents via the Model Context Protocol:\n\n```\njalki_find_probe(\"why are connections slow\")  → tcp_retransmit_skb, tcp_connect\njalki_deploy_probe(\"tcp_retransmit_skb\")      → probe_001\njalki_get_events(\"probe_001\", filter={...})   → [Occurrence, ...]\njalki_explain_event(function, tcp_state=1)    → \"network problem, not application\"\njalki_probe_status()                          → attached probes + counts\n```\n\nAn agent asks the knowledge base before guessing. Deploys probes. Reads events. Gets interpretations. No eBPF expertise required.\n\n---\n\n## Python SDK\n\nFor agents and rapid iteration. `pip install jalki` and ask the kernel from Python:\n\n```python\nimport jalki\n\n# one call: find → deploy → collect → interpret\nresult = await jalki.ask(\"why are connections failing\")\nprint(result.interpretation, result.action)\n\n# or control each step\nmatches = jalki.find(\"packet loss\")                         # local KB, no daemon\nhandle  = await jalki.deploy(\"tcp_retransmit_skb\")          # attach probe\nasync for event in jalki.stream(handle, interpreted=True):  # live events\n    print(event.net.dst, event.severity, event.interp)\n```\n\n`find()` works offline — the knowledge base ships in the wheel. 
`ask()` falls back to KB-only analysis when no daemon is running, so it never raises.\n\n---\n\n## Built-in probes\n\n| Probe | Hook | What it gives you |\n|-------|------|-------------------|\n| `TcpConnect` | `fexit/tcp_connect` | Connection attempts — 4-tuple, success/failure, errno |\n| `TcpClose` | `fexit/tcp_close` | Connection teardown — 4-tuple, process info |\n| `TcpRetransmit` | `fentry/tcp_retransmit_skb` | Retransmissions — 4-tuple, TCP state |\n\nThese three, joined on the 4-tuple, answer: which backends are being connected to, which connections are failing, which are retransmitting, and what the TCP state was when it happened.\n\n---\n\n## Kubernetes\n\nHelm chart in `helm/jalki/`. Deploys as a DaemonSet with `hostPID`, `hostNetwork`, and privileged access for eBPF.\n\n```bash\nhelm install jalki helm/jalki/ --set cluster=prod-east-1 --set emit=stdout\n```\n\n---\n\n## Requirements\n\n- Linux kernel 5.5+ x86, 6.0+ ARM64\n- `CONFIG_DEBUG_INFO_BTF=y`, `CONFIG_BPF_JIT=y`\n- BTF at `/sys/kernel/btf/vmlinux`\n- Root or `CAP_BPF` + `CAP_PERFMON`\n\n---\n\n## Testing\n\njälki uses requirement-based testing. Specs define what must be true. The oracle validates it.\n\n```\nspecs/                          ← requirements (natural language markdown)\n  protocol/find.md                \"find must return tcp_connect for connection questions\"\n  protocol/ask.md                 \"ESTABLISHED retransmit must say network problem\"\n  knowledge/knowledge-base.md     \"at least 20 probes across 5 layers\"\n       │\n       │  each requirement maps to an oracle test case\n       ▼\neval/oracle/                    ← standalone Rust binary, reads JSON from disk\n  case_014_retransmit_established_says_network_problem\n  case_080_econnrefused_says_not_listening\n  case_060_at_least_20_probes\n```\n\nThe oracle never imports jälki code. It reads knowledge base JSON and generated SDK files, then asserts they match the spec. 
When a case fails, fix the system — not the test.\n\n```bash\n# Run all 50 oracle cases\ncargo test --manifest-path eval/oracle/Cargo.toml\n\n# Run workspace tests (probes, codegen, store, SDK meta)\ncargo test --workspace\n\n# Python SDK conformance (no daemon needed)\ncd jalki-sdk-python \u0026\u0026 .venv/bin/pytest tests/ -m \"not daemon\"\n```\n\n---\n\n## Known limitations\n\n- **dst_ip 0.0.0.0 on Cilium-managed connections** — `skc_daddr` reads 0 when Cilium drops the packet before destination resolution (policy denial), when the conntrack table has no entry for the connection, or during loopback SNAT where the address is temporarily 0.0.0.0. Not fixable from jälki — requires Cilium debug monitor logs (`cilium monitor --type drop`) to diagnose the specific cause.\n- **src_port 0 on tcp_close events** — the kernel clears `skc_num` before `tcp_close` returns, so fexit sees 0. This is correct kernel behavior. Use the `tcp_connect` event's `src_port` and correlate by 4-tuple to get the full picture.\n- **IPv4 only** — IPv6 in v0.2.\n- **bytes_sent/bytes_received emit 0** — requires `tcp_sock` offset walking not yet implemented.\n- **gRPC emitter is a stub** — use stdout or file.\n- **Privileged required** — `CAP_BPF` + `CAP_PERFMON` at minimum.\n\n---\n\n## Part of False Systems\n\n```\njälki     kernel observation (this)\nTAPIO     k8s observation\nRAUTA     L7 gateway\nPOLKU     event transport\nAHTI      causality correlation\nsyva      enforcement\nrauha     container runtime\n```\n\njälki is the deepest layer. It sees what the kernel sees.\n\n---\n\n\u003e *jälki* (Finnish) — footprint, trace, track.\n\n*false systems · berlin · 2026 · apache 2.0*\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffalse-systems%2Fjalki","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ffalse-systems%2Fjalki","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffalse-systems%2Fjalki/lists"}