https://github.com/puneethkumarck/prism

Prism: high-performance real-time Solana transaction indexer built with Java 25, Helidon 4 SE, Virtual Threads, pgjdbc COPY protocol, and hexagonal architecture. Streams via Yellowstone gRPC or free WebSocket. No Spring Boot.
blockchain grpc helidon hexagonal-architecture indexer java postgresql real-time solana virtual-threads


![Build](https://github.com/Puneethkumarck/prism/actions/workflows/ci.yml/badge.svg)
![Java 25](https://img.shields.io/badge/Java-25_LTS-ED8B00?style=for-the-badge&logo=openjdk&logoColor=white)
![Helidon](https://img.shields.io/badge/Helidon-4.4_SE-00569C?style=for-the-badge&logo=oracle&logoColor=white)
![Virtual Threads](https://img.shields.io/badge/Virtual_Threads-Project_Loom-7B1FA2?style=for-the-badge&logo=java&logoColor=white)
![PostgreSQL](https://img.shields.io/badge/PostgreSQL-16-4169E1?style=for-the-badge&logo=postgresql&logoColor=white)
![Solana](https://img.shields.io/badge/Solana-Mainnet-9945FF?style=for-the-badge&logo=solana&logoColor=white)
![Architecture](https://img.shields.io/badge/Architecture-Hexagonal-purple?style=for-the-badge)
![No Spring Boot](https://img.shields.io/badge/No-Spring_Boot-red?style=for-the-badge)

# 🔺 Prism

### Refract the Solana firehose into a queryable data stream.

**A zero-Spring, zero-JPA, real-time Solana transaction indexer built on Java 25 Virtual Threads and Helidon 4 SE.**
Streams confirmed transactions via Yellowstone gRPC (paid) or WebSocket `blockSubscribe` (free), persists them with the PostgreSQL `COPY` protocol, and serves them over a paginated REST API, all with sub-100 ms startup and a <50 MB resident footprint.

[Why Prism?](#-why-does-this-exist) · [Architecture](#-architecture) · [The Hot Path](#-the-hot-path-how-a-transaction-becomes-a-row) · [Quick Start](#-quick-start) · [API Reference](#-api-reference) · [Tech Stack](#-tech-stack)

---

## 🎬 The Problem

Solana produces a new block roughly every **400 milliseconds**. At any given moment, mainnet can push **50,000+ transactions per second** through the firehose. If you want to know what happened on chain (payments, memos, failed swaps, large transfers), you have exactly three options:

| Option | What Happens | 💀 Verdict |
|---|---|---|
| 🐢 **Poll `getBlock`** | `getBlock(slot)` → repeat, repeat, repeat | Falls behind in minutes. Burns RPC credits. Dies at 50K TPS. |
| 🏗️ **Use a hosted indexer** | Pay per query. Sit behind someone else's cache. Hope their schema fits | $$$, schema lock-in, no custom parsing |
| 🚀 **Stream + batch yourself** | Subscribe to Geyser/WebSocket, parse in-process, batch write to Postgres | Full control. Sub-second freshness. Your schema, your queries. |

Prism is a sharp, opinionated take on **option 3**, written in Java 25 with Virtual Threads, Helidon 4 SE, and raw pgjdbc. No Spring Boot. No JPA. No reflection. Just a tight hot path from socket to row.

## 💡 The Solution

```text
   🌊 Solana                    🔺 Prism                          🗄️ PostgreSQL
   ─────────                    ────────                          ─────────────
                                                                   transactions
   Yellowstone gRPC  ──►  stream adapter           ──► COPY ──►    failed_tx
   (or WebSocket)              │                        ─►         memos
                               ▼                        ─►         large_transfers
                     LinkedTransferQueue                ─►         accounts
                     (unbounded, lock-free)               ▲
                               │                          │
                               ▼                    ──────┘
                     TransactionBatchService         parallel writes
                     (200 tx / 100ms dual-trigger)   on virtual threads
                               │
                               ▼
                     TransactionProcessor
                     split → [success | failed | memo | transfer]
```

## 🎯 The Result

A real-time pipeline that stays caught up with mainnet, survives RPC flaps, flushes batches in ~100 ms, and exposes 8 paginated read endpoints, all inside a JVM that starts in under a second and sips heap.

| Metric | Value |
|--------|-------|
| **Throughput target** | ~99.5% indexing efficiency at mainnet velocity |
| **Write strategy** | PostgreSQL `COPY FROM STDIN` + staging merge, **5-10× faster** than `INSERT VALUES` |
| **Startup time** | < 100 ms (Helidon 4 SE, no classpath scanning) |
| **Thread model** | Virtual Threads, no reactive `.flatMap().subscribeOn()` gymnastics |
| **Backpressure** | Unbounded tx queue, bounded account queue: zero disconnects from Yellowstone |
| **Streaming modes** | 🆓 WebSocket `blockSubscribe` (free) · 💰 Yellowstone gRPC (paid) |
| **Stack size** | Helidon 4 SE + pgjdbc + Avaje Inject + Micrometer. **NO Spring Boot** |

---

## 📚 Table of Contents

- [🤔 Why Does This Exist?](#-why-does-this-exist)
- [🌈 Why "Prism"?](#-why-prism)
- [🎞️ A Day in the Life of a Solana Transaction](#%EF%B8%8F-a-day-in-the-life-of-a-solana-transaction)
- [⚡ The Hot Path: How a Transaction Becomes a Row](#-the-hot-path-how-a-transaction-becomes-a-row)
- [🏛️ Architecture](#%EF%B8%8F-architecture)
- [🧬 The COPY Protocol: Why We Bypass `INSERT VALUES`](#-the-copy-protocol-why-we-bypass-insert-values)
- [🧵 Virtual Threads: Why No Reactor, No WebFlux, No Spring Boot](#-virtual-threads-why-no-reactor-no-webflux-no-spring-boot)
- [🌊 Dual Streaming Modes: Free or Fast](#-dual-streaming-modes-free-or-fast)
- [🪣 Dual-Trigger Batching: Size OR Time](#-dual-trigger-batching-size-or-time)
- [🔁 The Reconnect Dance](#-the-reconnect-dance)
- [🛠️ Tech Stack](#%EF%B8%8F-tech-stack)
- [🧱 Module Structure](#-module-structure)
- [🚀 Quick Start](#-quick-start)
- [🎛️ Make Targets](#%EF%B8%8F-make-targets)
- [🌐 API Reference](#-api-reference)
- [⚙️ Configuration Reference](#%EF%B8%8F-configuration-reference)
- [📊 Observability](#-observability)
- [🧪 Testing Strategy](#-testing-strategy)
- [🗂️ Database Schema](#%EF%B8%8F-database-schema)
- [🧠 Design Decisions, Quick Reference](#-design-decisions-quick-reference)
- [📜 License](#-license)

---

## 🤔 Why Does This Exist?

Because every on-chain product eventually asks the same questions:

- 💸 *"Did our transaction confirm?"*
- 💰 *"Which wallets just moved more than 1 SOL?"*
- 📝 *"Did the customer include a memo with that payment?"*
- 🚫 *"How many of our swaps failed in the last hour?"*
- 🏦 *"What's the current balance of every fee payer we've seen?"*

These questions need **fresh, queryable, relational** data, not JSON-RPC round-trips to a Solana RPC node, and not someone else's hosted indexer. You need *your* Postgres with *your* indexes and *your* schema, populated a few hundred milliseconds after confirmation.

Prism answers all five questions out of the box.

> **🎯 Design principle:** Hot path stays **synchronous and boring**. No reactive, no actors, no fancy schedulers. One virtual thread per job. One queue per concurrency boundary. The JVM does the rest.

---

## 🌈 Why "Prism"?

Because a prism takes one stream of white light and **splits it into the colors that were always there**. It doesn't create information, it reveals structure.

That's exactly what the indexer does to Solana's stream:

```text
                       ┌──────────────────────────────────────┐
                       │                                      │
   ✨ Solana stream  ──┤             🔺 Prism                 │
   (undifferentiated)  │                                      │
                       │   ┌─► 🟢 successful transactions     │
                       │   ├─► 🔴 failed transactions         │
                       │   ├─► 🟡 large transfers (>1 SOL)    │
                       │   ├─► 🟣 memos                       │
                       │   └─► 🔵 fee payer accounts          │
                       │                                      │
                       └──────────────────────────────────────┘
                           five queryable tables, refracted
                           out of one protobuf soup
```

Every incoming transaction is refracted into the projections you actually want to query. No more scanning JSON blobs. No more `getBlock` loops. Just `SELECT`.
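
As a rough illustration of that refraction, the sketch below sorts one batch into buckets. The `Tx` and `Buckets` records and their fields are hypothetical stand-ins, not Prism's actual domain model, and it assumes that only successful transactions count as large transfers and that a transaction may land in several buckets at once.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

// Hypothetical sketch: names and fields are illustrative, not Prism's real classes.
final class RefractSketch {
    // 1 SOL is 1,000,000,000 lamports.
    static final long LAMPORTS_PER_SOL = 1_000_000_000L;

    record Tx(String signature, long slot, boolean failed, long lamportsMoved, Optional<String> memo) {}

    record Buckets(List<Tx> successful, List<Tx> failed, List<Tx> largeTransfers, List<Tx> memos) {}

    // Splits one batch into the projections described above.
    static Buckets split(List<Tx> batch, long thresholdLamports) {
        List<Tx> successful = new ArrayList<>();
        List<Tx> failed = new ArrayList<>();
        List<Tx> largeTransfers = new ArrayList<>();
        List<Tx> memos = new ArrayList<>();
        for (Tx tx : batch) {
            (tx.failed() ? failed : successful).add(tx);
            // Assumption: failed transactions moved nothing, so they are never "large transfers".
            if (!tx.failed() && tx.lamportsMoved() > thresholdLamports) {
                largeTransfers.add(tx);
            }
            if (tx.memo().isPresent()) {
                memos.add(tx);
            }
        }
        return new Buckets(successful, failed, largeTransfers, memos);
    }
}
```

Calling `RefractSketch.split(batch, RefractSketch.LAMPORTS_PER_SOL)` applies the 1 SOL threshold shown in the diagram.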

---

## 🎞️ A Day in the Life of a Solana Transaction

> **🎬 Scene 1: Somewhere in a Solana validator, 400 ms ago**

```text
   ⚙️ validator                📡 Geyser plugin              🔺 Prism
   ────────────                ────────────────              ────────

   🧱 builds slot 312_701            │
   📝 includes tx 5Kx7aLm...         │
   ✉️ finalizes block                │
                               🚀 "new tx!" ──────────►  📥 arrives on gRPC
                                                              │
                                                         🔍 TransactionParser
                                                              │
                                                              │  signature: 5Kx7a...
                                                              │  slot:      312_701
                                                              │  amount:    4.2 SOL
                                                              │  from:      7xKX...h9Fz
                                                              │  to:        9vBM...n3Tr
                                                              │  memo:      "invoice #7341"
                                                              │  failed:    false
                                                              ▼
                                                  LinkedTransferQueue (unbounded)
                                                              │
                                                              ▼
                                                  📦 TransactionBatchService
                                                              │  (accumulating...)
                                                              │  199 tx + this one = 200 → 🚽 FLUSH
                                                              ▼
                                                  🔀 TransactionProcessor
                                                              │
                                   ┌──────────────────────────┼──────────────────────────┐
                                   ▼                          ▼                          ▼
                            📄 transactions               📝 memos               💰 large_transfers
                            (COPY FROM STDIN)          (batch INSERT)             (batch INSERT)
                            staging → merge          reWriteBatched=true       reWriteBatched=true
                                   │
                                   ▼
                             done in ~8 ms
                            total, parallel
```

**Timeline for one transaction:**

| Stage | Latency | What happens |
|---|---|---|
| 📡 Geyser publish | ~5 ms | Validator plugin flushes to wire |
| 🚚 Network hop | ~10-30 ms | HTTP/2 frame to Prism |
| 🔍 Parse | < 1 ms | Protobuf → domain record |
| 🪣 Queue | < 1 µs | `LinkedTransferQueue.offer()` |
| 📦 Batch wait | 0-100 ms | Dual-trigger: 200 tx OR 100 ms |
| 🖊️ COPY + merge | ~5-10 ms | 200 rows in one write |
| ✅ **End-to-end** | **< 200 ms** | From confirmation to queryable row |

> **Scene 2: A developer runs `curl localhost:3000/api/transfers?min_amount=4.0` and sees the transaction from Scene 1 in the results.** That's the whole movie.

---

## ⚡ The Hot Path: How a Transaction Becomes a Row

The write side is the most interesting part of the system. It's designed to be boring, fast, and **impossible to back-pressure**.

```mermaid
flowchart LR
    subgraph Source["🌊 Solana Source"]
        direction TB
        YS["Yellowstone gRPC<br/>(paid)"]
        WS["WebSocket<br/>blockSubscribe<br/>(free)"]
    end

    subgraph Parse["🔍 Parsing"]
        P1["TransactionParser<br/>protobuf / json"]
        P2["BlockNotificationParser<br/>shared logic"]
    end

    subgraph Queue["🪣 Concurrency Boundary"]
        direction TB
        TQ["LinkedTransferQueue<br/>unbounded<br/>tx stream"]
        AQ["ArrayBlockingQueue(10K)<br/>bounded, drop-if-full<br/>account stream"]
    end

    subgraph Batch["📦 Batching"]
        direction TB
        TB["TransactionBatchService<br/>200 tx / 100 ms"]
        AB["AccountBatchService<br/>200 acct / 2 s<br/>+ dedup by pubkey"]
    end

    subgraph Processor["🔀 Processor"]
        TP["TransactionProcessor<br/>split into 4 buckets"]
    end

    subgraph DB["🗄️ PostgreSQL"]
        direction TB
        T1["transactions<br/>COPY FROM STDIN"]
        T2["failed_transactions<br/>batch INSERT"]
        T3["memos<br/>batch INSERT"]
        T4["large_transfers<br/>batch INSERT"]
        T5["accounts<br/>UPSERT ON CONFLICT"]
    end

    YS --> P1 --> TQ
    WS --> P2 --> TQ
    P1 --> AQ
    P2 --> AQ
    TQ --> TB --> TP
    AQ --> AB --> T5
    TP --> T1
    TP --> T2
    TP --> T3
    TP --> T4

    style T1 fill:#4caf50,color:#fff
    style YS fill:#9945FF,color:#fff
    style WS fill:#00D18C,color:#fff
```

**Two concurrency boundaries, two queue strategies:**

| Queue | Type | Capacity | Policy | Why |
|---|---|---|---|---|
| 🪣 **Transaction** | `LinkedTransferQueue` | **Unbounded** | Never blocks producer | If this queue blocks, Yellowstone hangs up with a `lagged` error. Losing a transaction is worse than using heap. |
| 🪣 **Account** | `ArrayBlockingQueue` | **10,000** | Non-blocking `offer()`, drop if full | Accounts are less critical and dedup-friendly. Dropping the occasional fee payer snapshot is fine. |

This asymmetry is the whole trick. Transactions get backpressure protection; accounts get memory protection. Nobody wins both fights at once.
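
The two policies map onto standard `java.util.concurrent` calls. The following is a minimal sketch under assumptions: the class name, the `String` payloads, and the drop counter are illustrative and not taken from Prism's code.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedTransferQueue;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch of the two concurrency boundaries; not Prism's actual classes.
final class QueueBoundarySketch {
    // Unbounded: offer() always succeeds, so the stream reader never blocks.
    final BlockingQueue<String> txQueue = new LinkedTransferQueue<>();
    // Bounded: offer() returns false immediately when the queue is full.
    final BlockingQueue<String> accountQueue;
    final AtomicLong droppedAccounts = new AtomicLong();

    QueueBoundarySketch(int accountCapacity) {
        this.accountQueue = new ArrayBlockingQueue<>(accountCapacity);
    }

    void onTransaction(String tx) {
        txQueue.offer(tx);
    }

    boolean onAccount(String accountSnapshot) {
        boolean accepted = accountQueue.offer(accountSnapshot);
        if (!accepted) {
            // Drop-if-full: count the loss instead of blocking the producer.
            droppedAccounts.incrementAndGet();
        }
        return accepted;
    }
}
```

Counting drops rather than silently discarding them gives a metric to watch if the account consumer ever falls behind.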

---

## 🏛️ Architecture

Prism follows strict **hexagonal architecture (ports & adapters)** with DDD tactical patterns. Dependencies always point inward.

```text
┌────────────────────────────────────────────────┐
│                                                │
│   🏛️ application/                              │
│   ┌────────────────────────────────────┐       │
│   │  Helidon 4 SE functional routes    │       │
│   │  IndexerApplication (main)         │       │
│   │  IndexerConfig (env parsing)       │       │
│   │  GlobalErrorHandler                │       │
│   │  MapStruct mappers                 │       │
│   └────────────────────────────────────┘       │
│                    │                           │
│                    ▼ delegates to              │
│   ┌────────────────────────────────────┐       │
│   │  🧠 domain/                        │       │
│   │                                    │       │
│   │   ┌──── model ────┐                │       │
│   │   │ Signature     │                │       │
│   │   │ Pubkey        │                │       │
│   │   │ Slot          │                │       │
│   │   │ SolanaTx      │                │       │
│   │   │ Account       │                │       │
│   │   └───────────────┘                │       │
│   │                                    │       │
│   │   ┌──── service ──┐                │       │
│   │   │ BatchService  │                │       │
│   │   │ Processor     │                │       │
│   │   │ LargeTransfer │                │       │
│   │   │ Filter        │                │       │
│   │   └───────────────┘                │       │
│   │                                    │       │
│   │   ┌──── port ─────┐                │       │
│   │   │ TxStream      │◄── implemented by      │
│   │   │ TxRepo        │                │       │
│   │   │ MemoRepo      │                │       │
│   │   │ ...           │                │       │
│   │   └───────────────┘                │       │
│   │                                    │       │
│   │   ZERO framework imports.          │       │
│   │   Only Lombok + java.*             │       │
│   └────────────────────────────────────┘       │
│                    ▲                           │
│                    │ implements ports          │
│   ┌────────────────────────────────────┐       │
│   │  🔌 infrastructure/                │       │
│   │                                    │       │
│   │   grpc/         Yellowstone        │       │
│   │   websocket/    blockSubscribe     │       │
│   │   persistence/  pgjdbc + COPY      │       │
│   │   metrics/      Micrometer         │       │
│   │   console/      ANSI formatter     │       │
│   │   solana/       Base58, balance    │       │
│   └────────────────────────────────────┘       │
│                                                │
└────────────────────────────────────────────────┘

   🛡️ ArchUnit enforces these rules at build time
```

**The rules (enforced by ArchUnit):**

| Rule | What It Stops |
|---|---|
| `domain` ⊥ `infrastructure` | Prevents domain from leaking JDBC/gRPC types |
| `domain` ⊥ `application` | Prevents domain from reaching up into routes |
| `domain` has **zero** Helidon/Jakarta imports | Keeps domain framework-free (Lombok + `java.*` only) |
| `domain` has **zero** `java.sql.*` imports | No DB types in business logic |
| `infrastructure` ⊥ `application.routing` | Infra adapters can't call routes directly |

Break any rule and the build fails. No social contracts, only a failing ArchUnit test.

---

## 🧬 The COPY Protocol: Why We Bypass `INSERT VALUES`

PostgreSQL has two fundamentally different write paths. Most ORMs use the slower one. Prism uses the faster one.

> **🎬 The 5× speedup you get by ignoring your instincts**

```text
❌ INSERT VALUES (what JPA / Hibernate / Spring Data give you)
─────────────────────────────────────────────────────────────
INSERT INTO transactions VALUES ($1, $2, $3);
INSERT INTO transactions VALUES ($4, $5, $6);
INSERT INTO transactions VALUES ($7, $8, $9);
... × 200

Each row:
  🔄 parse SQL
  📋 plan query
  🔒 acquire lock
  💾 write WAL
  📝 update index
  ✅ commit row

200 rows × overhead = 💀 slow

✅ COPY FROM STDIN (what pgjdbc's CopyManager gives you)
──────────────────────────────────────────────────────
COPY staging_transactions (signature, slot, success) FROM STDIN (FORMAT TEXT);
5Kx7a...    312701    t
6Lm8b...    312701    t
7Nv9c...    312701    t
... × 200
\.

INSERT INTO transactions SELECT * FROM staging_transactions
  ON CONFLICT (signature) DO NOTHING;
TRUNCATE staging_transactions;

One batch:
  🔄 parse SQL once
  📋 plan query once
  🚀 stream 200 rows over STDIN
  💾 one WAL flush
  📝 index update once
  ✅ commit batch

5-10× faster on the hottest table 🔥
```

**Why a staging table?** `COPY` doesn't support `ON CONFLICT`. So we:

1. `COPY` into `staging_transactions` (no constraints, no indexes, pure speed)
2. `INSERT ... SELECT ... ON CONFLICT (signature) DO NOTHING` from staging → main
3. `TRUNCATE staging_transactions` and repeat

The staging merge costs an extra statement, but `COPY` + merge is still ~5× faster than individual `INSERT`s because the expensive parts (parsing, planning, locking, WAL) amortize across 200 rows.

> **💡 Secondary tables** (`failed_transactions`, `memos`, `large_transfers`) are low volume, so they use plain `PreparedStatement.addBatch()` with `reWriteBatchedInserts=true` on the pgjdbc URL. The driver rewrites `INSERT ... VALUES ($1, $2)` batches into a single `INSERT ... VALUES ($1, $2), ($3, $4), ...` statement, giving nearly `COPY`-level throughput without the staging dance.
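
To make the `COPY` step concrete, here is a minimal sketch of building the text-format payload that would be streamed to `staging_transactions`. The `Row` record and class name are hypothetical; only the column list and the text-format escaping rules (tab-separated columns, newline-terminated rows, `\N` for NULL, `t`/`f` for booleans) come from the example above and from PostgreSQL's `COPY` text format. It stays on the standard library so it runs without a database.

```java
import java.util.List;

// Hypothetical sketch: builds a COPY ... (FORMAT TEXT) payload in memory.
final class CopyPayloadSketch {
    record Row(String signature, long slot, boolean success) {}

    // Escapes one text value for COPY text format.
    static String escape(String value) {
        if (value == null) {
            return "\\N"; // NULL marker
        }
        StringBuilder out = new StringBuilder(value.length());
        for (char c : value.toCharArray()) {
            switch (c) {
                case '\\' -> out.append("\\\\");
                case '\t' -> out.append("\\t");
                case '\n' -> out.append("\\n");
                case '\r' -> out.append("\\r");
                default -> out.append(c);
            }
        }
        return out.toString();
    }

    // One line per row, columns separated by tabs, in (signature, slot, success) order.
    static String toCopyText(List<Row> rows) {
        StringBuilder payload = new StringBuilder();
        for (Row row : rows) {
            payload.append(escape(row.signature())).append('\t')
                   .append(row.slot()).append('\t')
                   .append(row.success() ? 't' : 'f').append('\n');
        }
        return payload.toString();
    }
}
```

With pgjdbc on the classpath, a payload like this could be handed to `CopyManager.copyIn(String sql, Reader from)` wrapped in a `StringReader`, followed by the merge and `TRUNCATE` statements in the same transaction. How Prism's `CopyTransactionRepository` actually does this is not shown here.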

---

## 🧵 Virtual Threads: Why No Reactor, No WebFlux, No Spring Boot

Traditional Java servers tried to solve the C10K problem with reactive streams:

```java
// Reactive way: every I/O op is a callback in a chain
return webClient.get()
    .uri("/slot")
    .retrieve()
    .bodyToMono(Slot.class)
    .flatMap(slot -> repo.findBySlot(slot))
    .flatMap(txs -> Flux.fromIterable(txs)
        .parallel()
        .runOn(Schedulers.boundedElastic())
        .map(this::process)
        .sequential()
        .collectList())
    .onErrorResume(e -> Mono.error(new IndexerException(e)));
```

That's fine code. It's also impossible to debug, step through, or reason about at 3 AM during an incident.

**Virtual Threads (Project Loom, finalized in JDK 21) change the rules.** You can write plain blocking code and the JVM parks the virtual thread on any I/O wait: no OS thread is held, no carrier is pinned, no Schedulers, no operators.

```java
// Loom way: boring, blocking, testable
var slot = httpClient.get("/slot", Slot.class);
var txs = repo.findBySlot(slot);
for (var tx : txs) {
    process(tx);
}
```

One virtual thread per job. The JVM multiplexes millions onto a handful of carrier threads. Helidon 4 SE was built from the ground up on this model: it has no Netty event loop, no Servlet container, no CDI graph to warm up. Startup is under 100 ms and p99.999 latency is under 7 ms.

**Prism's one rule:** never use `synchronized`, always use `ReentrantLock`. On JDK 21 through 23, blocking inside a `synchronized` block pins the virtual thread to its carrier and kills throughput. JDK 24 (JEP 491) removed most of that pinning, so on Java 25 the rule is mainly a guard for consistency and for older runtimes. ArchUnit enforces it at build time.
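
A small, self-contained sketch of the pattern, assuming nothing about Prism's own classes: each job runs on its own virtual thread, blocks freely, and shared state is guarded by a `ReentrantLock` instead of `synchronized`. The class and method names are illustrative.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical sketch: one virtual thread per job, ReentrantLock for shared state.
final class VirtualThreadSketch {
    private final ReentrantLock lock = new ReentrantLock();
    private long processed;

    void recordOne() {
        lock.lock();
        try {
            processed++;
        } finally {
            lock.unlock();
        }
    }

    long processed() {
        lock.lock();
        try {
            return processed;
        } finally {
            lock.unlock();
        }
    }

    // Runs `jobs` blocking tasks, each on its own virtual thread, and returns how many completed.
    static long runJobs(int jobs) {
        VirtualThreadSketch counter = new VirtualThreadSketch();
        // close() at the end of try-with-resources waits for all submitted tasks.
        try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < jobs; i++) {
                executor.submit(() -> {
                    try {
                        Thread.sleep(10); // stands in for blocking I/O; the virtual thread parks
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                        return;
                    }
                    counter.recordOne();
                });
            }
        }
        return counter.processed();
    }
}
```

`VirtualThreadSketch.runJobs(1_000)` starts a thousand sleeping tasks without needing a thousand OS threads.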

---

## 🌊 Dual Streaming Modes: Free or Fast

Prism can consume Solana's transaction stream in two ways (**same domain, same batching, same persistence**) through a pluggable `TransactionStream` port.

```text
               ┌─────────────────────────┐
               │ TransactionStream port  │
               │   (domain interface)    │
               └────────────┬────────────┘
                            │
         ┌──────────────────┴──────────────────┐
         │                                     │
┌────────┴─────────┐                  ┌────────┴────────┐
│ 🆓 WebSocket     │                  │ 💰 Yellowstone  │
│ blockSubscribe   │                  │ gRPC (Geyser)   │
└──────────────────┘                  └─────────────────┘
```

| | 🆓 WebSocket mode | 💰 gRPC mode |
|---|---|---|
| **Endpoint** | `wss://api.mainnet-beta.solana.com` | Paid Yellowstone provider |
| **Protocol** | JSON-RPC `blockSubscribe` over WS | Protobuf over HTTP/2 |
| **Cost** | **$0**, public RPC | **$300-500/mo** typical |
| **Latency** | Higher: JSON parse + `confirmed` commitment | Lower: native protobuf + direct Geyser |
| **Throughput** | Lower: JSON overhead | Higher: 8 MB HTTP/2 window |
| **Stability** | Public RPC can be flaky | Dedicated, SLA-backed |
| **Vote filtering** | Client-side (check Vote program) | Server-side (`vote: false` filter) |
| **Tx data** | `encoding: "jsonParsed"`, full | Raw protobuf (richer) |
| **When to use** | Dev, testnet, hobby projects, low TPS | Production, mainnet, real workloads |

Switch with a single env var: `STREAM_MODE=websocket` (default) or `STREAM_MODE=grpc`. The domain layer doesn't know or care.
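
A minimal sketch of how such a switch could be parsed fail-fast, assuming a hypothetical `StreamMode` enum. Prism's real `IndexerConfig` may be structured differently; only the variable name, the two values, and the `websocket` default come from the text above.

```java
import java.util.Locale;

// Hypothetical sketch: fail-fast parsing of the STREAM_MODE environment variable.
enum StreamMode {
    WEBSOCKET,
    GRPC;

    static StreamMode fromEnv(String raw) {
        if (raw == null || raw.isBlank()) {
            return WEBSOCKET; // documented default
        }
        return switch (raw.trim().toLowerCase(Locale.ROOT)) {
            case "websocket" -> WEBSOCKET;
            case "grpc" -> GRPC;
            // Reject typos at startup instead of silently falling back.
            default -> throw new IllegalArgumentException("Unsupported STREAM_MODE: " + raw);
        };
    }
}
```

At startup this would be called as `StreamMode.fromEnv(System.getenv("STREAM_MODE"))`, and the result used to pick which `TransactionStream` adapter to wire.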

> **⚠️ Known HTTP/2 limitation (Helidon 4.4):** The stream-level window is a configurable 8 MiB, but Helidon doesn't yet expose the connection-level window (defaults to 64 KiB per RFC 7540). In practice, Helidon emits `WINDOW_UPDATE` frames as data is consumed, so throughput is gated by consumption speed, not a static cap. Tracked in the `docs/implementation-plan.md` for a future revisit.

---

## 🪣 Dual-Trigger Batching: Size OR Time

The worst thing you can do to a write-heavy Postgres workload is flush one row at a time. The second-worst thing is to wait forever for a batch that never fills up.

Prism uses **dual-trigger batching**: flush when *either* threshold fires.

```text
┌──────────────────────────────────────┐
│  📦 TransactionBatchService          │
│                                      │
│  Buffer: [ 🟦 🟦 🟦 🟦 🟦 ... ]      │
│                                      │
│  Trigger A: size ≥ 200 txs           │
│  Trigger B: elapsed ≥ 100 ms         │
│                                      │
│  Whichever fires first → FLUSH 🚽    │
└──────────────────────────────────────┘

⚡ High TPS (40K/s)         🪶 Low TPS (100/s)         💤 Idle (0/s)
──────────────────         ──────────────────         ────────────
200 txs in 5 ms            200 txs in 2 sec           0 txs
→ size triggers            → time triggers            → no flush
→ flush every 5 ms         → flush every 100 ms       → buffer stays empty
~200× fewer round-trips    bounded max latency        no wasted writes
```

**The numbers:**

| Scenario | TPS in | Batches/s | DB round-trips/s | Max write latency |
|---|---|---|---|---|
| Mainnet burst | 40,000 | ~200 | ~200 | 5 ms |
| Typical mainnet | 4,000 | ~20 | ~20 | 50 ms |
| Quiet dev chain | 100 | 10 | 10 | 100 ms |

Compare that to naive per-row writes at 40K TPS: **40,000 round-trips per second**. The database would melt.
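
The dual trigger can be sketched with a timed `poll` on the queue. This is an illustrative version, not Prism's `TransactionBatchService`: it assumes the time window starts when collection begins, and the caller is expected to skip empty batches so an idle stream produces no writes.

```java
import java.time.Duration;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch of size-or-time batching.
final class DualTriggerSketch {

    // Collects up to maxSize items, but never waits longer than maxWait in total.
    static <T> List<T> nextBatch(BlockingQueue<T> queue, int maxSize, Duration maxWait) {
        List<T> batch = new ArrayList<>(maxSize);
        long deadline = System.nanoTime() + maxWait.toNanos();
        try {
            while (batch.size() < maxSize) {            // Trigger A: size
                long remaining = deadline - System.nanoTime();
                if (remaining <= 0) {
                    break;                              // Trigger B: time
                }
                T item = queue.poll(remaining, TimeUnit.NANOSECONDS);
                if (item == null) {
                    break;                              // timed out while waiting
                }
                batch.add(item);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();         // flush what we have and let the caller stop
        }
        return batch;
    }
}
```

A consumer loop on a virtual thread would call `nextBatch(queue, 200, Duration.ofMillis(100))` repeatedly and hand each non-empty batch to the processor.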

**Account batching uses the same pattern with different thresholds** (200 / 2,000 ms) because account upserts are less latency-sensitive and dedup well: the same `pubkey` often appears multiple times in a 2-second window, and we keep the one with the highest slot in memory before sending a single `UPSERT`.
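
The in-memory dedup step might look like the sketch below. `AccountSnapshot` and its fields are hypothetical; the only rule taken from the text is "keep the snapshot with the highest slot per `pubkey`". On equal slots this version keeps the later arrival, which is an assumption.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of per-pubkey deduplication before the UPSERT.
final class AccountDedupSketch {
    record AccountSnapshot(String pubkey, long slot, long lamports) {}

    static List<AccountSnapshot> latestPerPubkey(List<AccountSnapshot> window) {
        Map<String, AccountSnapshot> latest = new LinkedHashMap<>();
        for (AccountSnapshot snapshot : window) {
            // merge() keeps whichever of (kept, candidate) has the higher slot.
            latest.merge(snapshot.pubkey(), snapshot,
                    (kept, candidate) -> candidate.slot() >= kept.slot() ? candidate : kept);
        }
        return List.copyOf(latest.values());
    }
}
```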

---

## 🔁 The Reconnect Dance

Solana RPC endpoints, whether free public or paid Yellowstone, will drop you. Count on it. Here's what happens when they do:

```text
T+0s     🌊 stream is flowing... transactions pouring in
T+120s   💥 stream ends unexpectedly (TCP reset / GOAWAY / network blip)
            │
            │  🧮 attempt 1: delay = 2 × 2¹ = 4 s
            ▼
T+124s   🔄 retry → connected! streaming resumes
T+184s   ✅ 60 s of stable flow → attempt counter resets to 0
            │
            │  🎉 next disconnect will start at 4 s again
            │
            ▼
         ...
T+900s   💥 another disconnect
            │  attempt 1:  4 s → fails
            │  attempt 2:  8 s → fails
            │  attempt 3: 16 s → fails
            │  attempt 4: 30 s (capped) → connects
            ▼
T+958s   ✅ back online, counter resets after 60 s stable
```

**Formula:** `delay = base × 2^min(attempt, 4)` where `base = 2 s`, capped at `30 s`.

| Attempt | Computed | Actual Delay |
|---|---|---|
| 1 | 4 s | **4 s** |
| 2 | 8 s | **8 s** |
| 3 | 16 s | **16 s** |
| 4 | 32 s | **30 s** (capped) |
| 5+ | 32 s | **30 s** (capped) |

**Reset rule:** after **60 seconds** of stable connection, the attempt counter resets to 0, so transient blips don't accumulate into slow restarts.

The same `ReconnectHandler` is shared by both the gRPC and WebSocket adapters. One strategy, two transports.
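
The formula and reset rule can be written down as two small pure functions. This is a plain-Java sketch with hypothetical names; it is not the project's `ReconnectHandler` and does not use Resilience4j.

```java
// Hypothetical sketch of the backoff arithmetic described above.
final class ReconnectBackoffSketch {
    static final long BASE_SECONDS = 2;
    static final long CAP_SECONDS = 30;
    static final int MAX_EXPONENT = 4;
    static final long STABLE_RESET_SECONDS = 60;

    // delay = base × 2^min(attempt, 4), capped at 30 s; attempt starts at 1.
    static long delaySeconds(int attempt) {
        if (attempt < 1) {
            throw new IllegalArgumentException("attempt must be >= 1");
        }
        long computed = BASE_SECONDS << Math.min(attempt, MAX_EXPONENT);
        return Math.min(computed, CAP_SECONDS);
    }

    // After 60 s of stable connection the counter goes back to 0.
    static int attemptsAfterConnection(int attempts, long stableSeconds) {
        return stableSeconds >= STABLE_RESET_SECONDS ? 0 : attempts;
    }
}
```

Attempts 1 through 5 yield 4, 8, 16, 30, and 30 seconds, matching the table.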

---

## 🛠️ Tech Stack

| Component | Choice | Version | Why |
|-----------|--------|---------|-----|
| **Runtime** | Java + Virtual Threads | 25 LTS | Scoped Values finalized, +291% VT throughput vs JDK 21 |
| **HTTP server** | Helidon 4 SE | 4.4.0 | Built on VTs from the ground up, <7 ms p99.999, <50 MB RSS, <100 ms startup, no CDI/reflection |
| **gRPC client** | Helidon 4 SE gRPC | 4.4.0 | Built-in HTTP/2 engine, VT-native, no grpc-java |
| **DI** | Avaje Inject | latest | Compile-time codegen, JSR-330 (`@Singleton`), zero reflection |
| **DB driver** | pgjdbc | 42.7+ | `CopyManager` + `reWriteBatchedInserts=true` |
| **Connection pool** | HikariCP × 2 | 7.x | Dual pools: write (20) + read (20) |
| **JSON** | Jackson | 2.18+ | Helidon native media support |
| **Migrations** | Flyway (standalone) | 12.x | No Spring integration, runs in `main()` |
| **Resilience** | Resilience4j | 2.3+ | Reconnect backoff strategy |
| **Metrics** | Micrometer + Prometheus | 1.14+ | Native Helidon integration |
| **Mapping** | MapStruct | 1.6.3 | Compile-time, `componentModel = "jsr330"` |
| **Architecture tests** | ArchUnit | 1.4.1 | Hexagonal rules enforced at build time |
| **Logging** | SLF4J + Logback | n/a | Structured via `@Slf4j` |
| **Testing** | JUnit 5 + Mockito BDD + AssertJ + Testcontainers + Awaitility | n/a | Three source sets: unit, integration, fixtures |
| **Build** | Gradle (Kotlin DSL) + convention plugins | 9.0 | `prism.service` + `prism.library` in `buildSrc/` |

### ❌ What Prism Explicitly Does Not Use

| Avoided | Replacement | Why |
|---|---|---|
| **Spring Boot** | Helidon 4 SE + `public static void main` | No classpath scanning, no reflection, <100 ms startup |
| **Spring Data JPA** | Raw pgjdbc + `CopyManager` | `COPY FROM STDIN` is 5-10× faster than `saveAll()` |
| **`@Autowired`** | Avaje `@Singleton` + Lombok `@RequiredArgsConstructor` | Constructor injection only |
| **`@ConfigurationProperties`** | `IndexerConfig` record + `System.getenv()` | Fail-fast parsing, no magic binding |
| **`@RestController`** | Helidon SE `HttpService` functional routing | No annotations, pure function composition |
| **`synchronized`** | `java.util.concurrent.locks.ReentrantLock` | `synchronized` pinned virtual threads to carrier threads before JDK 24 (JEP 491); `ReentrantLock` avoids the issue on every JDK |
| **`System.out`/`println`** | `@Slf4j` everywhere | Structured logs only |
| **Comments/Javadoc** | Self-documenting code | If a method needs a comment, rename it |

---

## 🧱 Module Structure

```text
prism/                                          ← root project
│
├── buildSrc/                                   ← Gradle convention plugins
│   └── src/main/kotlin/
│       ├── prism.service.gradle.kts            ← applied to main service
│       └── prism.library.gradle.kts            ← applied to shared libs
│
├── prism/                                      ← main service (Helidon 4 SE)
│   └── src/
│       ├── main/java/com/stablebridge/prism/
│       │   ├── application/                    ← inbound adapters
│       │   │   ├── IndexerApplication.java     ← main(), wiring
│       │   │   ├── IndexerLifecycle.java       ← shutdown hook
│       │   │   ├── config/IndexerConfig.java   ← env → record
│       │   │   ├── route/                      ← Helidon SE routes
│       │   │   │   ├── HealthRoutes.java
│       │   │   │   ├── StatsRoutes.java
│       │   │   │   ├── TransactionRoutes.java
│       │   │   │   ├── TransferRoutes.java
│       │   │   │   ├── MemoRoutes.java
│       │   │   │   ├── AccountRoutes.java
│       │   │   │   ├── CorsConfiguration.java
│       │   │   │   └── PaginationLimits.java
│       │   │   ├── mapper/                     ← MapStruct
│       │   │   └── error/GlobalErrorHandler.java
│       │   │
│       │   ├── domain/                         ← core, zero framework imports
│       │   │   ├── model/                      ← SolanaTransaction, Account, ...
│       │   │   ├── port/                       ← TransactionStream, TransactionRepository, ...
│       │   │   ├── service/                    ← BatchService, Processor, filters
│       │   │   ├── solana/                     ← Base58, balance math, programs
│       │   │   └── exception/
│       │   │
│       │   └── infrastructure/                 ← outbound adapters
│       │       ├── grpc/                       ← Yellowstone stream + parser
│       │       ├── websocket/                  ← blockSubscribe stream + parser
│       │       ├── persistence/                ← pgjdbc repositories
│       │       │   ├── DataSourceFactory.java  ← HikariCP × 2
│       │       │   ├── FlywayMigrator.java
│       │       │   ├── CopyTransactionRepository.java
│       │       │   ├── JdbcFailedTransactionRepository.java
│       │       │   ├── JdbcMemoRepository.java
│       │       │   ├── JdbcTransferRepository.java
│       │       │   ├── JdbcAccountRepository.java
│       │       │   └── JdbcStatsRepository.java
│       │       ├── solana/Base58.java
│       │       ├── metrics/
│       │       │   ├── MicrometerMetricsRecorder.java
│       │       │   └── BenchmarkLogReporter.java
│       │       └── console/ConsoleOutputFormatter.java
│       │
│       └── main/resources/
│           └── db/migration/                   ← Flyway V1, V2, V3 ...
│
├── prism-api/                                  ← shared DTOs (java-library)
│   └── src/main/java/.../api/model/            ← Page, TransactionResponse, ...
│
├── docs/
│   ├── functional-spec.md                      ← single source of truth
│   ├── implementation-plan.md                  ← phases 0-7
│   ├── CODING_STANDARDS.md
│   └── TESTING_STANDARDS.md
│
├── infra/
│   └── prometheus.yml                          ← scrape config
│
├── docker-compose.yml                          ← Postgres + Prometheus + Grafana + app
├── Makefile                                    ← developer workflow automation
├── build.gradle.kts / settings.gradle.kts
└── CLAUDE.md                                   ← agent instructions
```

---

## 🚀 Quick Start

### Prerequisites

- 🐘 **Docker & Docker Compose** (for PostgreSQL, Prometheus, Grafana)
- ☕ **Java 25** (for local builds)
- 🛠️ **Make** (optional, just thin wrappers around `./gradlew` and `docker compose`)

### 60-Second Onboarding

```bash
# 1️⃣ Clone
git clone https://github.com/Puneethkumarck/prism.git
cd prism

# 2️⃣ Boot the infrastructure (Postgres + Prometheus + Grafana)
make infra-up

# 3️⃣ Run Prism against the free public Solana WebSocket endpoint
#    Defaults: STREAM_MODE=websocket, RPC_WS_ENDPOINT=wss://api.mainnet-beta.solana.com
DATABASE_URL=postgresql://indexer:indexer@localhost:5432/indexer \
  make run

# 4️⃣ In another terminal, watch it work
curl -s http://localhost:3000/health | jq
curl -s http://localhost:3000/api/stats | jq
curl -s "http://localhost:3000/api/transactions?limit=5" | jq
curl -s "http://localhost:3000/api/transfers?min_amount=1.0&limit=10" | jq
```

Within a few seconds you'll see `[SLOT]`, `[TX]`, `[MEMO]`, and `[TRANSFER]` events streaming to stdout, and rows accumulating in Postgres.

### Switching to Paid gRPC Mode

```bash
STREAM_MODE=grpc \
GRPC_ENDPOINT=https://.com \
X_TOKEN= \
DATABASE_URL=postgresql://indexer:indexer@localhost:5432/indexer \
make run
```

### Run Everything in Docker

```bash
make up # builds image via Jib + starts Postgres + Prometheus + Grafana + Prism
make down # stops it all
```

---

## 🎛️ Make Targets

| Target | Description |
|---|---|
| `make build` | Compile + Spotless + unit + integration + ArchUnit |
| `make test` | Unit tests only |
| `make integration-test` | Integration tests (requires Docker for Testcontainers) |
| `make clean` | Remove all build artifacts |
| `make format` | Auto-format with Spotless |
| `make lint` | Spotless check + ArchUnit (matches pre-commit hook) |
| `make run` | Run Prism locally via Gradle |
| `make infra-up` | Start Postgres + Prometheus + Grafana |
| `make infra-down` | Stop infrastructure |
| `make infra-clean` | Stop + delete volumes |
| `make infra-status` | Show infra container status |
| `make infra-logs` | Tail infrastructure logs |
| `make docker-build` | Build Docker image via Jib (no Dockerfile) |
| `make up` | Start infra + app container |
| `make down` | Stop everything |
| `make setup-hooks` | Point git at `.githooks/` |
| `make help` | List all targets |

---

## 🌐 API Reference

Base URL: `http://localhost:3000` (no authentication in v1).
Metrics on `http://localhost:9090/metrics` (Prometheus format).

### 🩺 Health

```http
GET /health
```

```json
{ "status": "ok", "uptime_secs": 3600 }
```

### 📊 Stats

```http
GET /api/stats
```

Uses `pg_stat_user_tables.n_live_tup` for **O(1) approximate counts**, which is far faster than `COUNT(*)` on million-row tables. The values are PostgreSQL's live-row estimates, so they can lag the exact row count slightly.

```json
{
  "total_transactions": 4812344,
  "total_failed": 1203111,
  "total_transfers": 38201,
  "total_memos": 914102,
  "total_accounts": 87433
}
```
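
For reference, a minimal JDBC sketch of reading those estimates. The class name `ApproxCounts`, the `public` schema filter, and the returned map shape are assumptions made for illustration; this is not Prism's actual query code.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper: reads approximate row counts from PostgreSQL statistics. */
final class ApproxCounts {

    // n_live_tup is the estimated number of live rows per table.
    // The schema filter is an assumption; adjust it to wherever the tables live.
    static final String SQL =
        "SELECT relname, n_live_tup FROM pg_stat_user_tables WHERE schemaname = 'public'";

    /** Returns table name -> estimated live rows, without scanning any table. */
    static Map<String, Long> read(Connection conn) throws SQLException {
        Map<String, Long> counts = new LinkedHashMap<>();
        try (PreparedStatement ps = conn.prepareStatement(SQL);
             ResultSet rs = ps.executeQuery()) {
            while (rs.next()) {
                counts.put(rs.getString("relname"), rs.getLong("n_live_tup"));
            }
        }
        return counts;
    }
}
```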

### 🧾 Transactions

| Method | Path | Query Params | Notes |
|---|---|---|---|
| `GET` | `/api/transactions` | `limit` (default 50, max 500), `offset`, `success` (optional bool) | Paginated, `created_at DESC` |
| `GET` | `/api/transactions/{signature}` | none | Returns `TxRow` or `404` |
| `GET` | `/api/slots/{slot}` | none | Array, not paginated, `created_at ASC` |

```bash
# List the latest 10 successful transactions
curl -s "http://localhost:3000/api/transactions?limit=10&success=true" | jq

# Look up one by signature
curl -s "http://localhost:3000/api/transactions/5Kx7aLm..." | jq

# All transactions in slot 312_701_542
curl -s "http://localhost:3000/api/slots/312701542" | jq
```
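
The pagination limits in the table above (default 50, max 500) could be enforced with a small value type like the one below. This is a sketch under assumptions: the name `PageRequest`, the lower bound of 1 for `limit`, and the default `offset` of 0 are illustrative and not taken from Prism's source. Rejecting out-of-range values, rather than clamping them, matches the `400` validation error described under Error Response Format.

```java
/** Hypothetical pagination parameters; not Prism's actual type. */
record PageRequest(int limit, int offset) {

    static final int DEFAULT_LIMIT = 50;  // documented default
    static final int MAX_LIMIT = 500;     // documented maximum

    // Compact constructor: reject out-of-range values instead of clamping them.
    PageRequest {
        if (limit < 1 || limit > MAX_LIMIT) {
            throw new IllegalArgumentException("limit must be between 1 and " + MAX_LIMIT);
        }
        if (offset < 0) {
            throw new IllegalArgumentException("offset must be >= 0");
        }
    }

    /** Parses raw query-string values; null or blank means the parameter was absent. */
    static PageRequest parse(String limitParam, String offsetParam) {
        return new PageRequest(
            parseOr(limitParam, DEFAULT_LIMIT, "limit"),
            parseOr(offsetParam, 0, "offset"));
    }

    private static int parseOr(String raw, int fallback, String name) {
        if (raw == null || raw.isBlank()) {
            return fallback;
        }
        try {
            return Integer.parseInt(raw.trim());
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException(name + " must be an integer", e);
        }
    }
}
```

A handler would catch the `IllegalArgumentException` and map it to the `400` error body.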

### 💰 Large Transfers

```http
GET /api/transfers?limit=50&offset=0&min_amount=10.0
```

Paginated, ordered by `amount DESC`. Threshold is configurable; default `1.0 SOL`.

### 📝 Memos

```http
GET /api/memos?limit=50&offset=0
```

Paginated, ordered by `created_at DESC`.

### 🏧 Accounts

```http
GET /api/accounts/{pubkey}
```

Returns the most recent balance snapshot for a fee payer, or `404`.

### 🚨 Error Response Format

```json
{
  "error": "transaction not found",
  "status": 404
}
```

| Status | Meaning |
|---|---|
| `400` | Validation error (invalid base58, out-of-range pagination) |
| `404` | Resource not found (signature / pubkey) |
| `500` | Internal error (DB unreachable, etc.) |

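The "invalid base58" check behind a `400` can be sketched as follows. It relies on two Solana facts: a public key is 32 bytes and a signature is 64 bytes, both encoded with the Bitcoin-style base58 alphabet. The class and method names are hypothetical, and Prism may validate differently.

```java
import java.math.BigInteger;

/** Hypothetical validator for base58-encoded Solana keys and signatures. */
final class Base58Check {

    // Bitcoin-style alphabet: no 0, O, I, or l.
    private static final String ALPHABET =
        "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz";

    /** Decodes base58 text to bytes; throws on characters outside the alphabet. */
    static byte[] decode(String s) {
        BigInteger value = BigInteger.ZERO;
        BigInteger base = BigInteger.valueOf(58);
        for (int i = 0; i < s.length(); i++) {
            int digit = ALPHABET.indexOf(s.charAt(i));
            if (digit < 0) {
                throw new IllegalArgumentException("invalid base58 character at index " + i);
            }
            value = value.multiply(base).add(BigInteger.valueOf(digit));
        }
        // Each leading '1' encodes one leading zero byte.
        int leadingZeros = 0;
        while (leadingZeros < s.length() && s.charAt(leadingZeros) == '1') {
            leadingZeros++;
        }
        byte[] raw = value.signum() == 0 ? new byte[0] : value.toByteArray();
        // BigInteger may prepend a zero sign byte; drop it.
        int skip = (raw.length > 1 && raw[0] == 0) ? 1 : 0;
        byte[] out = new byte[leadingZeros + raw.length - skip];
        System.arraycopy(raw, skip, out, leadingZeros, raw.length - skip);
        return out;
    }

    static boolean isPubkey(String s) {
        return decodesTo(s, 32);
    }

    static boolean isSignature(String s) {
        return decodesTo(s, 64);
    }

    private static boolean decodesTo(String s, int expectedBytes) {
        // 88 characters is the longest base58 encoding of 64 bytes.
        if (s == null || s.isEmpty() || s.length() > 88) {
            return false;
        }
        try {
            return decode(s).length == expectedBytes;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }
}
```
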
---

## ⚙️ Configuration Reference

Every setting is an environment variable. `IndexerConfig` is a Java record parsed via `System.getenv()`: fail-fast on missing required vars, no binding magic, no `application.yml`.

| Variable | Required | Default | Description |
|---|---|---|---|
| `DATABASE_URL` | ✅ Yes | none | `postgresql://user:pass@host:port/db` |
| `STREAM_MODE` | No | `websocket` | `websocket` (free) or `grpc` (paid) |
| `RPC_WS_ENDPOINT` | if `STREAM_MODE=websocket` | `wss://api.mainnet-beta.solana.com` | Solana WebSocket RPC URL |
| `GRPC_ENDPOINT` | if `STREAM_MODE=grpc` | none | Yellowstone gRPC endpoint (`https` required except for localhost) |
| `X_TOKEN` | No | none | Auth token for Yellowstone, injected as `x-token` metadata |
| `API_PORT` | No | `3000` | Helidon HTTP port |
| `CONSOLE_LOG` | No | `true` | `false` or `0` suppresses `[TX]`/`[MEMO]`/`[TRANSFER]` output |
| `BENCH_LOG` | No | `benchmark.log` | Path for 5-minute benchmark summary file |

**Fail-fast validation** (in `IndexerConfig.fromEnv()`):
- `DATABASE_URL` is always required.
- `STREAM_MODE` must be `websocket` or `grpc` (case-insensitive).
- `GRPC_ENDPOINT` must be a valid URI with `https` scheme (or `http` for localhost).
- `API_PORT` must be a non-negative integer ≤ 65535.

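The validation rules above can be sketched as a record with a static factory. This is not the real `IndexerConfig`: the name `ConfigSketch`, the exception type, the error messages, and treating `127.0.0.1` as localhost are all assumptions. The factory takes a `Map` so it can be tested without touching the process environment; in an application you would pass `System.getenv()`.

```java
import java.net.URI;
import java.util.Locale;
import java.util.Map;

/** Hypothetical fail-fast config record; mirrors the documented rules only. */
record ConfigSketch(String databaseUrl, String streamMode, String rpcWsEndpoint,
                    String grpcEndpoint, String xToken, int apiPort) {

    static ConfigSketch fromEnv(Map<String, String> env) {
        String databaseUrl = env.get("DATABASE_URL");
        if (databaseUrl == null || databaseUrl.isBlank()) {
            throw new IllegalStateException("DATABASE_URL is required");
        }
        // Case-insensitive match, defaulting to the free WebSocket mode.
        String mode = env.getOrDefault("STREAM_MODE", "websocket").toLowerCase(Locale.ROOT);
        if (!mode.equals("websocket") && !mode.equals("grpc")) {
            throw new IllegalStateException("STREAM_MODE must be websocket or grpc");
        }
        String ws = env.getOrDefault("RPC_WS_ENDPOINT", "wss://api.mainnet-beta.solana.com");
        String grpc = env.get("GRPC_ENDPOINT");
        if (mode.equals("grpc")) {
            validateGrpcEndpoint(grpc);
        }
        int port = parsePort(env.getOrDefault("API_PORT", "3000"));
        return new ConfigSketch(databaseUrl, mode, ws, grpc, env.get("X_TOKEN"), port);
    }

    private static void validateGrpcEndpoint(String endpoint) {
        if (endpoint == null || endpoint.isBlank()) {
            throw new IllegalStateException("GRPC_ENDPOINT is required when STREAM_MODE=grpc");
        }
        URI uri;
        try {
            uri = URI.create(endpoint);
        } catch (IllegalArgumentException e) {
            throw new IllegalStateException("GRPC_ENDPOINT is not a valid URI", e);
        }
        // Assumption: 127.0.0.1 counts as localhost for the plain-http exception.
        boolean local = "localhost".equals(uri.getHost()) || "127.0.0.1".equals(uri.getHost());
        boolean ok = "https".equals(uri.getScheme()) || ("http".equals(uri.getScheme()) && local);
        if (!ok) {
            throw new IllegalStateException("GRPC_ENDPOINT must use https (http only for localhost)");
        }
    }

    private static int parsePort(String raw) {
        try {
            int port = Integer.parseInt(raw.trim());
            if (port < 0 || port > 65535) {
                throw new NumberFormatException("out of range");
            }
            return port;
        } catch (NumberFormatException e) {
            throw new IllegalStateException("API_PORT must be an integer in 0..65535", e);
        }
    }
}
```
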
---

## 📊 Observability

| Layer | Technology | Details |
|---|---|---|
| **Metrics** | Micrometer + Prometheus | 8 counters (`indexer_tx_received`, `..._written`, `..._failed`, `..._memo`, `..._transfer`, `..._accounts_written`, `..._slots`, `..._batches`) scraped at `:9090/metrics` |
| **Dashboards** | Grafana | Runs at `http://localhost:3001` (admin/admin), Prometheus at `http://localhost:9091` |
| **Benchmark log** | File appender | Every 5 minutes: `timestamp \| tps \| recv \| written \| failed \| failed% \| memos \| xfers \| accts \| batches \| slots` |
| **Console output** | ANSI color coded | `[SLOT]` cyan · `[TX]` white/red · `[MEMO]` magenta · `[TRANSFER]` yellow |
| **Health** | Helidon | `GET /health`: no DB call, uptime from process start |

Sample benchmark log line:

```text
2026-04-11T09:17:42Z | 412 | 4120 | 2581 | 1539 | 37% | 18 | 104 | 31200 | 42 | 12
```

Mainnet is a *chaotic* environment: a 37% failure rate is completely normal (bots, MEV, failed swaps). The indexer records all of it.

---

## 🧪 Testing Strategy

Three-tier pyramid, with conventions adapted from `stablebridge-tx-recovery`:

```text
    ┌───────────────┐
    │  Integration  │       Testcontainers PostgreSQL, real JDBC,
    │   (Docker)    │       end-to-end against both stream adapters
    ├───────────────┤
    │ Architecture  │       ArchUnit: hexagonal layer rules,
    │   (no deps)   │       no @Autowired, no System.out, no synchronized
┌───┴───────────────┴───┐
│      Unit Tests       │   BDD Mockito + AssertJ, no Spring context,
│  (single assertion)   │   fixture builders, no generic matchers
└───────────────────────┘
```

| Tier | Source Set | Frameworks | Docker? |
|---|---|---|---|
| Unit | `src/test/` | JUnit 5, BDD Mockito (`given`/`then`), AssertJ, Awaitility | No |
| Architecture | `src/test/` | ArchUnit 1.4 | No |
| Integration | `src/integration-test/` | JUnit 5, Testcontainers PostgreSQL, direct JDBC | ✅ Yes |

**Non-negotiable testing rules:**

- 🎯 **Single-assert pattern**: build the expected object, then `assertThat(actual).usingRecursiveComparison().isEqualTo(expected)`.
- 🗣️ **BDD Mockito only**: `given().willReturn()` / `then().should()`, never `when()/verify()`.
- 🚫 **No generic matchers**: no `any()`, `anyString()`, `eq()`. Pass actual values.
- 💬 **`// given` / `// when` / `// then` comments in every test**.
- 🏗️ **Fixture builders**: `SOME_*` constants and `Builder()` in `src/testFixtures/`.
- ⏱️ **Awaitility over `Thread.sleep`**: polling with timeout, not arbitrary waits.

Run them:

```bash
./gradlew test # unit + architecture (~5 s)
./gradlew integrationTest # integration tests (requires Docker, ~30 s)
./gradlew build # everything + Spotless + ArchUnit
```

---

## 🗂️ Database Schema

Five tables, seven indexes, one staging table. All migrations live in `prism/src/main/resources/db/migration/`.

```text
📄 transactions            primary key: signature (varchar 88)
   signature   varchar(88)   PK
   slot        bigint        idx_transactions_slot
   success     boolean       idx_transactions_success
   created_at  timestamptz   idx_transactions_created_at DESC

❌ failed_transactions
   id          serial        PK
   signature   varchar(88)
   slot        bigint
   error       text
   created_at  timestamptz   idx_failed_tx_created_at DESC

💰 large_transfers
   id          serial        PK
   signature   varchar(88)
   slot        bigint
   amount      numeric       idx_large_transfers_amount DESC
   created_at  timestamptz   idx_large_transfers_created_at DESC

📝 memos
   id          serial        PK
   signature   varchar(88)
   memo        text
   created_at  timestamptz   idx_memos_created_at DESC

🏧 accounts                unique: pubkey
   id          serial        PK
   pubkey      varchar(88)   UNIQUE
   lamports    bigint
   slot        bigint
   executable  boolean
   rent_epoch  bigint
   created_at  timestamptz

🛠️ staging_transactions    (no constraints, no indexes)
   signature   varchar(88)
   slot        bigint
   success     boolean       ← COPY target, truncated per flush
```
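
Each flush into `staging_transactions` is driven by the dual trigger listed in the design decisions table: 200 transactions or 100 ms, whichever comes first. A minimal sketch of that loop follows. The class and method names are hypothetical, and starting the timer when the call begins (rather than when the first item arrives) is an assumption; Prism's actual batcher may differ.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;

/** Hypothetical size-or-time batcher; illustrates the dual trigger only. */
final class DualTriggerBatcher {

    /**
     * Collects up to maxSize items, returning early once maxWaitMillis has elapsed.
     * The returned batch may be empty if nothing arrived in time.
     */
    static <T> List<T> nextBatch(BlockingQueue<T> queue, int maxSize, long maxWaitMillis) {
        List<T> batch = new ArrayList<>(maxSize);
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(maxWaitMillis);
        try {
            while (batch.size() < maxSize) {           // trigger 1: batch is full
                long remaining = deadline - System.nanoTime();
                if (remaining <= 0) {
                    break;                             // trigger 2: time is up
                }
                T item = queue.poll(remaining, TimeUnit.NANOSECONDS);
                if (item == null) {
                    break;                             // timed out while waiting
                }
                batch.add(item);
            }
        } catch (InterruptedException e) {
            // Preserve the interrupt and flush whatever was collected so far.
            Thread.currentThread().interrupt();
        }
        return batch;
    }
}
```

With a `LinkedTransferQueue` holding 450 items, three calls to `nextBatch(queue, 200, 100)` return batches of 200, 200, and 50; the last one returns after waiting out the 100 ms window.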

### Dual Connection Pools

```text
HikariCP Write Pool          HikariCP Read Pool
───────────────────          ──────────────────
max: 20                      max: 20
min: 5                       min: 5
usage:                       usage:
  COPY   staging_tx            GET /api/transactions
  INSERT failed_tx             GET /api/transfers
  INSERT memos                 GET /api/memos
  INSERT large_transfers       GET /api/accounts/:pubkey
  UPSERT accounts              GET /api/stats
```

**Why two pools?** During a mainnet burst the write pool can saturate all 20 connections for 50-100 ms. If the API shared that pool, every `GET` would queue behind the writes. With dedicated pools, API latency is independent of ingest load, which removes the primary complaint about every "indexer plus API on one DB" setup.

---

## 🧠 Design Decisions, Quick Reference

| # | Decision | Problem It Solves | Impact |
|---|---|---|---|
| 1 | **Unbounded tx queue** (`LinkedTransferQueue`) | Bounded queues block the producer → Yellowstone `lagged` disconnect | Zero dropped transactions from backpressure |
| 2 | **COPY FROM STDIN + staging merge** | `INSERT VALUES` is 5-10× slower for high-volume tables | 5-10× write throughput on the hottest path |
| 3 | **200 tx / 100 ms dual-trigger batch** | Per-row writes create ~200× more DB round-trips | ~200× fewer round-trips, bounded max latency |
| 4 | **200 acct / 2 s batch with dedup** | Per-tx account upserts spawn thousands of tasks/sec | Eliminates task churn, reduces DB pressure |
| 5 | **Exponential reconnect** (4s→8s→16s→30s cap, 60s reset) | Thundering herd against a flapping endpoint | Progressive delay, fast recovery after stability |
| 6 | **Dual read/write HikariCP pools** | Write bursts starve API read queries | API latency independent of ingest load |
| 7 | **`pg_stat_user_tables` for `/api/stats`** | `COUNT(*)` on million-row tables is O(N) | O(1) approximate counts for dashboard |
| 8 | **4 parallel writes per flush** | Sequential writes to 4 tables multiply flush latency | All 4 writes (COPY + 3 INSERTs) run concurrently on virtual threads |
| 9 | **Helidon 4 SE (not Spring Boot)** | Spring classpath scan, reflection, CDI → slow startup | <100 ms startup, <50 MB RSS, <7 ms p99.999 |
| 10 | **`ReentrantLock` (not `synchronized`)** | `synchronized` pins virtual threads to carrier | No carrier pinning, full VT throughput |
| 11 | **MapStruct at layer boundaries** | Hand-written copy loops are bug-prone and ugly | Compile-time, type-safe, zero runtime cost |
| 12 | **ArchUnit at build time** | Architectural rules degrade without enforcement | Hexagonal rules fail the build if violated |

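Decision 5 in the table above can be sketched as a small state holder. The reading of "60s reset" used here, that the delay returns to 4 s once a connection has stayed up for at least 60 s, is an assumption, as are the class and method names.

```java
/** Hypothetical reconnect backoff: 4s, 8s, 16s, then capped at 30s. */
final class ReconnectBackoff {

    static final long INITIAL_SECONDS = 4;
    static final long CAP_SECONDS = 30;
    // Assumption: a connection that lasted this long counts as stable.
    static final long STABLE_RESET_SECONDS = 60;

    private long nextDelay = INITIAL_SECONDS;

    /** Returns the delay before the next attempt, then doubles it up to the cap. */
    long nextDelaySeconds() {
        long delay = nextDelay;
        nextDelay = Math.min(nextDelay * 2, CAP_SECONDS);
        return delay;
    }

    /** Call when a connection drops; resets the delay if it had been stable. */
    void onConnectionEnded(long connectedForSeconds) {
        if (connectedForSeconds >= STABLE_RESET_SECONDS) {
            nextDelay = INITIAL_SECONDS;
        }
    }
}
```
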
---

## 📜 License

Released under the **MIT License**. See [`LICENSE`](LICENSE) for full text.

---

### 🔺 Prism: Refract the Solana firehose into a queryable data stream.

Built on **Java 25 · Virtual Threads · Helidon 4 SE · pgjdbc · PostgreSQL 16**
No Spring Boot · No JPA · No reflection · No apologies.

*Every block. Every signature. Every memo. Every time.*