{"id":34208737,"url":"https://github.com/rosscartlidge/ssql","last_synced_at":"2026-04-27T10:01:14.738Z","repository":{"id":345695789,"uuid":"1186647619","full_name":"rosscartlidge/ssql","owner":"rosscartlidge","description":"SQL-style stream processing for the command line and Go","archived":false,"fork":false,"pushed_at":"2026-04-21T07:37:15.000Z","size":431802,"stargazers_count":3,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-04-21T09:37:54.332Z","etag":null,"topics":["cli","csv","data-processing","duckdb","go","golang","pipeline","sql","stream-processing","unix","wasm"],"latest_commit_sha":null,"homepage":null,"language":"Go","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/rosscartlidge.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-03-19T21:06:34.000Z","updated_at":"2026-04-21T07:37:18.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/rosscartlidge/ssql","commit_stats":null,"previous_names":["rosscartlidge/ssql"],"tags_count":171,"template":false,"template_full_name":null,"purl":"pkg:github/rosscartlidge/ssql","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rosscartlidge%2Fssql","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rosscartlidge%2Fssql/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rosscartlidge%2Fssql/releases","man
ifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rosscartlidge%2Fssql/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/rosscartlidge","download_url":"https://codeload.github.com/rosscartlidge/ssql/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rosscartlidge%2Fssql/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":32331305,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-26T23:26:28.701Z","status":"online","status_checked_at":"2026-04-27T02:00:06.769Z","response_time":128,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["cli","csv","data-processing","duckdb","go","golang","pipeline","sql","stream-processing","unix","wasm"],"created_at":"2025-12-15T20:24:40.428Z","updated_at":"2026-04-27T10:01:14.727Z","avatar_url":"https://github.com/rosscartlidge.png","language":"Go","readme":"# ssql 🚀\n\n**Modern Go stream processing made simple** - Transform data with intuitive operations, create interactive visualizations, and even generate code from natural language descriptions.\n\nBuilt on Go 1.23+ with first-class support for iterators, generics, and functional composition.\n\n\u003e **⚠️ Important:** ssql v4 requires the `/v4` import path:\n\u003e ```go\n\u003e import \"github.com/rosscartlidge/ssql/v4\"\n\u003e ```\n\n## ✨ What Makes ssql Special\n\n### 🎯 **Simple Yet Powerful**\n\n**Go 
Library:**\n```go\n// Read data, filter, group, and visualize - all type-safe\nsales, err := ssql.ReadCSV(\"sales.csv\")\nif err != nil {\n    log.Fatal(err)\n}\n\ntopRegions := ssql.Chain(\n    ssql.GroupByFields(\"sales\", \"region\"),\n    ssql.Aggregate(\"sales\", map[string]ssql.AggregateFunc{\n        \"total_revenue\": ssql.Sum(\"amount\"),\n    }),\n    ssql.SortBy(func(r ssql.Record) float64 {\n        return -ssql.GetOr(r, \"total_revenue\", 0.0) // Descending\n    }),\n    ssql.Limit[ssql.Record](5),\n)(sales)\n\nssql.QuickChart(topRegions, \"region\", \"total_revenue\", \"top_regions.html\")\n```\n\n\u003cdetails\u003e\n\u003csummary\u003e💡 \u003cb\u003eClick for complete, runnable code with sample data\u003c/b\u003e\u003c/summary\u003e\n\n```go\npackage main\n\nimport (\n    \"log\"\n    \"os\"\n    \"github.com/rosscartlidge/ssql/v4\"\n)\n\nfunc main() {\n    // Create sample sales data in /tmp/sales.csv\n    csvData := `region,product,amount\nNorth,Widget,1500\nSouth,Gadget,2300\nEast,Widget,1800\nWest,Gadget,2100\nNorth,Gadget,3200\nSouth,Widget,1200\nEast,Gadget,2800\nWest,Widget,1600\nNorth,Widget,2500\nSouth,Gadget,1900\nEast,Widget,2200\nWest,Gadget,3100`\n\n    if err := os.WriteFile(\"/tmp/sales.csv\", []byte(csvData), 0644); err != nil {\n        log.Fatalf(\"Failed to create sample data: %v\", err)\n    }\n\n    // Read data, filter, group, and visualize - all type-safe\n    sales, err := ssql.ReadCSV(\"/tmp/sales.csv\")\n    if err != nil {\n        log.Fatal(err)\n    }\n\n    topRegions := ssql.Chain(\n        ssql.GroupByFields(\"sales\", \"region\"),\n        ssql.Aggregate(\"sales\", map[string]ssql.AggregateFunc{\n            \"total_revenue\": ssql.Sum(\"amount\"),\n        }),\n        ssql.SortBy(func(r ssql.Record) float64 {\n            return -ssql.GetOr(r, \"total_revenue\", 0.0) // Descending\n        }),\n        ssql.Limit[ssql.Record](5),\n    )(sales)\n\n    if err := ssql.QuickChart(topRegions, \"region\", 
\"total_revenue\", \"/tmp/top_regions.html\"); err != nil {\n        log.Fatalf(\"Failed to create chart: %v\", err)\n    }\n\n    log.Println(\"Chart created: /tmp/top_regions.html\")\n    log.Println(\"Sample data: /tmp/sales.csv\")\n}\n```\n\n\u003c/details\u003e\n\n**Or use the CLI:**\n\n![ssql demo](doc/demo.gif)\n```bash\n# Prototype with Unix-style pipelines, then generate production Go code\nssql from employees.csv | \\\n  ssql group-by dept -count n -avg salary avg_sal | \\\n  ssql to chart -x dept -y avg_sal -output chart.html\n\n# Window functions — rankings, running totals, lag/lead without collapsing rows\nssql from employees.csv | ssql window -row-number rn -partition dept -order salary -desc\n\n# Read multiple files at once (shell expands *.csv)\nssql from csv *.csv -source file | ssql group-by file -count n | ssql to table\n\n# Multi-file pushdown — filter per file in parallel, then merge (4x faster)\nssql from csv *.csv -- where -if age gt 25 | ssql to table\n\n# Schema headers are automatic - preserves field order through pipelines\nssql from data.csv | ssql where -if age gt 30 | ssql to csv output.csv\n\n# High-performance Arrow format (10-20x faster I/O)\nssql from data.arrow | ssql where -if age gt 30 | ssql to arrow output.arrow\n\n# Excel files — read and write .xlsx directly\nssql from xlsx sales.xlsx -sheet \"Q4 Results\" | ssql where -if revenue gt 50000 | ssql to xlsx top.xlsx\n\n# Distributed processing: read remote files via SSH\nssql from ssh myserver /data/events.csv -- where -if status eq error | ssql to table\n\n# Read multiple shards from a catalog, with partition pruning\nssql from catalog shards.csv -if date ge 2025-03-01 | ssql group-by service -count n\n\n# Optimize a pipeline — push filters into SSH, collapse sort+limit to top\n(export SSQLGO=1; ssql from ssh node1 /data/events.csv \\\n  | ssql where -if status ge 500 \\\n  | ssql sort -desc cnt | ssql limit 10 \\\n  | ssql to table) | ssql generate ssql\n# → ssql from ssh 
node1 /data/events.csv -- where -if status ge 500 | ssql top 10 -field cnt | ssql to table\n\n# Chain: optimize → then compile to Go\n(export SSQLGO=1; ...) | ssql generate ssql | ssql generate go\n\n# Debug pipelines with jq (JSONL streaming format)\nssql from data.csv | jq '.' | head -5  # Inspect data\nssql from data.csv | ssql where -if age gt 30 | jq -s 'length'  # Count results\n```\n\n**Optimize and compile to Go:**\n\n![ssql optimize demo](doc/demo-optimize.gif)\n\n[**Try the CLI →**](doc/cli-codelab.md) | [**Debug with jq →**](doc/cli-debugging.md)\n\n### ⚡ **High-Performance Typed Pipelines** — `ssql/typed`\n\nWhen the schema is known at compile time and the pipeline is hot, the\n`ssql/typed` subpackage gives you a struct-based fast path with the same\nshape as the main API. Measured against the same 10M row × 3 chained\njoin workload:\n\n| Implementation | Time | Memory | Allocations |\n|---|---:|---:|---:|\n| `ssql.Record` (current) | 74.8 s | 37.7 GB | 544 M |\n| **`ssql/typed`** | **4.94 s** | **1.10 GB** | **20 M** |\n| DuckDB v1.5 CLI | 0.42 s | — | — |\n\n**15× faster, 34× less memory** vs the Record API — within an order of\nmagnitude of DuckDB, in pure Go with zero CGO and ~600 LOC on the data\npath. Same iter.Seq[T] composition shape as the main API:\n\n```go\ntype Employee struct {\n    Name   string\n    DeptID string `ssql:\"dept_id\"`\n    Years  int64\n}\n\ntype Department struct {\n    DeptID   string `ssql:\"dept_id\"`\n    DeptName string `ssql:\"dept_name\"`\n}\n\nemployees := typed.ReadCSV[Employee](\"employees.csv\")\ndepts     := typed.ReadCSV[Department](\"departments.csv\")\n\nseniors := typed.Where(func(e Employee) bool {\n    return e.Years \u003e= 5\n})(employees)\n\njoined := typed.HashJoin(seniors, depts,\n    func(e Employee) string   { return e.DeptID },\n    func(d Department) string { return d.DeptID },\n    func(e Employee, d Department) Senior { ... 
})\n```\n\nUse `ssql.Record` for prototyping and dynamic schemas; switch to\n`ssql/typed` when you know your schema and the pipeline is hot.\n\n**Or skip the rewrite entirely** — `ssql generate go -typed` translates a\nshell pipeline directly into a typed Go program with auto-derived struct\ntypes. The same prototype pipeline you'd run interactively becomes a\nself-contained, compiled, schema-safe binary:\n\n```bash\nSSQLGO=typed ssql from employees.csv \\\n    | ssql where -if years ge 5 \\\n    | ssql join departments.csv -using dept_id \\\n    | ssql to csv seniors.csv \\\n    | ssql generate go \u003e pipeline.go\ngo run pipeline.go\n```\n\nMeasured against the same shell pipeline run three ways (1M rows ×\n1 join, see `cmd/ssql/codegen_bench_test.go`):\n\n| Mode | Wall time | Peak RSS |\n|---|---:|---:|\n| CLI pipeline (interactive) | 3.08 s | 33 MB |\n| `SSQLGO=1` codegen (Record) | 2.69 s | 910 MB |\n| **`SSQLGO=typed` codegen** | **0.77 s** | **8.7 MB** |\n| Typed vs CLI | **4.0× faster** | — |\n| Typed vs Record codegen | **3.5× faster** | **104× less memory** |\n\n**Need more speed? `SSQLGO=parallel` runs the same pipeline across\nall cores.** Drop-in replacement for `SSQLGO=typed`; emits Go code\nthat uses a shard-partitioned `Stream[T]` runtime with parallel\nCSV read, parallel `Where`, parallel hash-join, parallel group-by\n(Sink/Combine/Finalize), and per-shard CSV output buffers. 
Measured\non a 32-core machine, 10 M-row corpus:\n\n| Workload | typed-serial | **typed-parallel** | DuckDB |\n|---|---:|---:|---:|\n| Filter + write 7.25 M-row CSV | 5.7 s | **1.3 s (4.4× faster)** | 0.7 s |\n| Group-by 1 000 dept_ids, count + sum + avg + min + max | 3.80 s | **0.95 s (4.0× faster)** | 0.39 s |\n\n```bash\nSSQLGO=parallel ssql from data.csv \\\n    | ssql group-by dept_id -count n -sum salary total -avg salary mean \\\n    | ssql to csv | ssql generate go \u003e pipeline.go\ngo run pipeline.go\n```\n\nUse `SSQLGO=parallel` when the host has spare cores and the\npipeline fits the supported subset (`from`, `where`, `join`,\n`group-by`, `to csv`, `to table`); fall back to `SSQLGO=typed` for\noutput-too-large-for-RAM cases or when you need strict input-order\noutput.\n\n[**Codelab →**](doc/typed-codelab.md) | [**Reference →**](doc/typed-reference.md) | [**Codegen design →**](doc/research/typed-codegen-proposal.md) | [**GroupByParallel design →**](doc/research/typed-groupby-parallel-proposal.md)\n\n### 🌐 **Browser Playground**\n\nTry ssql without installing anything — the full CLI runs in your browser via WebAssembly:\n\n**[Launch Playground →](https://rosscartlidge.github.io/ssql/playground.html)** *(instant — optimized WASM, ~13MB)*\n\n**[Launch Full Terminal →](https://rosscartlidge.github.io/ssql-terminal/)** *(real Linux with bash, tab completion, pipes — boots in ~20s)*\n\nOr build the playground locally:\n```bash\nmake playground\ncd cmd/ssql-playground \u0026\u0026 python3 -m http.server 8080\n# Open http://localhost:8080/playground.html\n```\n\nFeatures:\n- Type real ssql pipelines and see results instantly\n- **Optimize** — see the pipeline optimizer rewrite your commands with `-explain`\n- **Generate Go** — compile pipelines to standalone Go code\n- **Generate SQL** — convert to DuckDB-compatible SQL\n- **Process substitution** — `\u003c(ssql from ... 
| ssql where ...)` works in joins\n- Sample datasets included (employees, orders, customers)\n- Upload your own CSV files\n\n\u003e **Note:** SSH and catalog commands require network access and are not available in the browser. Use Optimize or Generate Go to see how those pipelines would be rewritten.\n\n### 🤖 **AI-Powered Code Generation**\nDescribe what you want in plain English, get working ssql code:\n\n\u003e *\"Read customer data, find high-value customers, group by region, create a chart\"*\n\n→ **Generates clean, readable Go code automatically**\n\n[**Try the AI Assistant →**](doc/ai-human-guide.md)\n\n### 📊 **Interactive Visualizations**\nCreate modern, responsive charts with zoom, pan, and filtering capabilities:\n\n```bash\nssql from data.csv | ssql group-by dept -avg salary avg_sal | ssql to chart -x dept -y avg_sal -type bar\n```\n\n![ssql chart](doc/chart-screenshot.png)\n\nCharts are self-contained HTML files with Chart.js — interactive controls, trend lines, export to PNG. 
Also supports animated visualizations (`to animate`) for time-series and frequency spectra.\n\n[**Try charts in the playground →**](https://rosscartlidge.github.io/ssql/playground.html)\n\n## 🚀 Quick Start\n\n### Prerequisites\n- **Go 1.23+** required for iterator support\n\n**Don't have Go installed?**\n- macOS: `brew install go`\n- Linux/Windows: [Download from go.dev](https://go.dev/dl/)\n- Verify: `go version` (should show 1.23+)\n\n### Installation\n\n#### Option 1: Homebrew (macOS \u0026 Linux)\n\n```bash\nbrew tap rosscartlidge/ssql\nbrew install ssql\nssql version\n```\n\n#### Option 2: Go Install\n\n```bash\ngo install github.com/rosscartlidge/ssql/v4/cmd/ssql@latest\n\n# Verify installation\nssql version\n\n# Try it out\necho \"name,age,salary\nAlice,30,95000\nBob,25,65000\" | ssql from csv | ssql where -if age gt 28\n```\n\n[**See CLI Tutorial →**](doc/cli-codelab.md)\n\n#### Option 3: Download Binary\n\nPre-built binaries for all platforms are available on [GitHub Releases](https://github.com/rosscartlidge/ssql/releases). Download the archive for your OS/architecture, extract, and add to your PATH.\n\n#### Option 4: WASI (run anywhere)\n\nA single `.wasm` binary that runs on any platform with a WASI runtime ([wasmtime](https://wasmtime.dev/), wasmer, Docker+WASM):\n\n```bash\n# Download from GitHub Releases\ncurl -LO https://github.com/rosscartlidge/ssql/releases/latest/download/ssql_wasi.tar.gz\ntar xzf ssql_wasi.tar.gz\n\n# Run with wasmtime\nwasmtime ssql.wasm version\nwasmtime --dir=. ssql.wasm from data.csv | wasmtime ssql.wasm where -if age gt 25 | wasmtime ssql.wasm to table\n```\n\nNo Go, no cross-compilation — one binary for every platform. 
14MB slim build.\n\n#### Option 5: GPU Acceleration (optional)\n\nFor 10-50x faster FFT, convolution, and correlation on large signals:\n\n**Requirements:**\n- NVIDIA GPU with CUDA support\n- Docker with nvidia-container-toolkit, OR CUDA Toolkit installed locally\n\n**Method 1: Docker Build (Recommended - no local CUDA needed)**\n\n```bash\n# Clone the repository\ngit clone https://github.com/rosscartlidge/ssql.git\ncd ssql\n\n# Build and extract the GPU-enabled binary\nmake docker-gpu-extract\n\n# Install the library system-wide\nsudo cp libssqlgpu.so /usr/local/lib \u0026\u0026 sudo ldconfig\n\n# Install the binary\ncp ssql_gpu ~/go/bin/\n\n# Verify GPU is detected\nssql_gpu version\n# Output: ssql vX.Y.Z (gpu: yes)\n```\n\n**Method 2: Local CUDA Toolkit Build**\n\n```bash\n# Clone the repository\ngit clone https://github.com/rosscartlidge/ssql.git\ncd ssql\n\n# Build the CUDA library\ncd gpu \u0026\u0026 make \u0026\u0026 cd ..\n\n# Build ssql with GPU support\ngo build -tags gpu -o ssql_gpu ./cmd/ssql\n\n# Install to your Go bin directory\nsudo make install-gpu  # Installs libssqlgpu.so to /usr/local/lib\ncp ssql_gpu ~/go/bin/\n\n# Verify GPU is detected\nssql_gpu version\n```\n\n**Note:** The GPU version falls back to CPU automatically when GPU is unavailable or for small datasets where CPU is faster.\n\n#### Option 6: Debian Packages\n\nPre-built `.deb` packages are available for amd64 Linux systems:\n\n**Standard version (no GPU dependencies):**\n```bash\ncurl -LO https://github.com/rosscartlidge/ssql/raw/main/ssql_4.34.0_amd64.deb\nsudo dpkg -i ssql_4.34.0_amd64.deb\nssql version\n```\n\n**GPU-accelerated version (requires NVIDIA CUDA runtime):**\n```bash\ncurl -LO https://github.com/rosscartlidge/ssql/raw/main/ssql-gpu_4.34.0_amd64.deb\nsudo dpkg -i ssql-gpu_4.34.0_amd64.deb\nssql version\n```\n\nThe GPU package requires `libcudart` (CUDA runtime) which is typically installed with NVIDIA drivers.\n\n#### Option 7: Go Library (for application 
development)\n\n**Step 1: Create a new project**\n```bash\nmkdir my-project\ncd my-project\ngo mod init myproject  # Initialize Go module (required!)\n```\n\n**Step 2: Install ssql v4**\n```bash\ngo get github.com/rosscartlidge/ssql/v4\n```\n\n### Hello ssql\n```go\npackage main\n\nimport (\n    \"fmt\"\n    \"slices\"\n    \"github.com/rosscartlidge/ssql/v4\"\n)\n\nfunc main() {\n    numbers := slices.Values([]int{1, 2, 3, 4, 5, 6, 7, 8, 9, 10})\n\n    evenNumbers := ssql.Where(func(x int) bool {\n        return x%2 == 0\n    })(numbers)\n\n    first3 := ssql.Limit[int](3)(evenNumbers)\n\n    fmt.Println(\"First 3 even numbers:\")\n    for num := range first3 {\n        fmt.Println(num) // 2, 4, 6\n    }\n}\n```\n\n### Your First Chart\n```go\npackage main\n\nimport (\n    \"slices\"\n    \"github.com/rosscartlidge/ssql/v4\"\n)\n\nfunc main() {\n    // Create sample data\n    monthlyRevenue := []ssql.Record{\n        ssql.MakeMutableRecord().String(\"month\", \"Jan\").Float(\"revenue\", 120000).Freeze(),\n        ssql.MakeMutableRecord().String(\"month\", \"Feb\").Float(\"revenue\", 135000).Freeze(),\n        ssql.MakeMutableRecord().String(\"month\", \"Mar\").Float(\"revenue\", 118000).Freeze(),\n    }\n\n    data := slices.Values(monthlyRevenue)\n\n    // Generate interactive chart\n    ssql.QuickChart(data, \"month\", \"revenue\", \"revenue_chart.html\")\n    // Opens in browser with zoom, pan, and export features\n}\n```\n\n## 🎓 Learning Path\n\n**New to ssql?** We've got you covered with step-by-step guides:\n\n### 1. ⚡ **[CLI Tutorial](doc/cli-codelab.md)**\n*Prototype fast with Unix-style pipelines, generate production code*\n- Quick data exploration with command-line tools\n- Process system commands (ps, df, etc.)\n- Create visualizations with one command\n- Generate Go code from CLI pipelines\n- **Debug pipelines with jq** - [See debugging guide →](doc/cli-debugging.md)\n- **Perfect for rapid prototyping!**\n\n### 2. 
📚 **[Getting Started Guide](doc/codelab-intro.md)**\n*Learn the Go library fundamentals with hands-on examples*\n- Basic operations (Select, Where, Limit)\n- Working with CSV/JSON/Arrow/XLSX data\n  - **⚠️ Note**: CSV auto-parses `\"25\"` → `int64(25)`, use correct types with `GetOr()`\n- Creating your first visualizations\n- Real-world examples\n\n### 2b. 📊 **[Signal Processing Guide](doc/cli-signal-processing.md)**\n*FFT, filtering, and GPU-accelerated analysis*\n- Frequency analysis with FFT/IFFT\n- Convolution for smoothing and edge detection\n- Cross-correlation for pattern matching\n- Optional GPU acceleration (10-100x speedup)\n\n### 3. 📖 **[API Reference](doc/api-reference.md)**\n*Complete function documentation with examples*\n- All operations organized by category\n- Transform, Filter, Aggregate, Join operations\n- Window processing for real-time data\n- Chart and visualization options\n\n### 4. 🎯 **[Advanced Tutorial](doc/advanced-tutorial.md)**\n*Master complex patterns and production techniques*\n- Stream joins and complex aggregations\n- Real-time processing with windowing\n- Infinite stream handling\n- Performance optimization\n\n### 5. 
🤖 **[AI Code Generation](doc/ai-human-guide.md)**\n*Generate ssql code from natural language*\n- Use any AI assistant (Claude, ChatGPT, Gemini)\n- Describe what you want, get working code\n- Human-readable, verifiable results\n- Perfect for rapid prototyping\n- **For LLMs**: Copy [ai-code-generation.md](doc/ai-code-generation.md) into your LLM\n\n## 🔧 Core Capabilities\n\n### **SQL-Style Data Processing**\n\n**Quick view:**\n```go\n// Group sales by region, calculate totals, get top 5\ntopRegions := ssql.Chain(\n    ssql.GroupByFields(\"sales\", \"region\"),\n    ssql.Aggregate(\"sales\", aggregations),\n    ssql.SortBy(keyFunc),\n    ssql.Limit[ssql.Record](5),\n)(salesData)\n```\n\n\u003cdetails\u003e\n\u003csummary\u003e📋 \u003cb\u003eClick for complete, runnable code\u003c/b\u003e\u003c/summary\u003e\n\n```go\npackage main\n\nimport (\n    \"fmt\"\n    \"log\"\n    \"github.com/rosscartlidge/ssql/v4\"\n)\n\nfunc main() {\n    // Read sales data\n    salesData, err := ssql.ReadCSV(\"sales.csv\")\n    if err != nil {\n        log.Fatal(err)\n    }\n\n    // Define aggregations\n    aggregations := map[string]ssql.AggregateFunc{\n        \"total_revenue\": ssql.Sum(\"amount\"),\n        \"sale_count\":    ssql.Count(),\n    }\n\n    // Define sort key function\n    keyFunc := func(r ssql.Record) float64 {\n        return -ssql.GetOr(r, \"total_revenue\", 0.0) // Negative for descending\n    }\n\n    // Group sales by region, calculate totals, get top 5\n    topRegions := ssql.Chain(\n        ssql.GroupByFields(\"sales\", \"region\"),\n        ssql.Aggregate(\"sales\", aggregations),\n        ssql.SortBy(keyFunc),\n        ssql.Limit[ssql.Record](5),\n    )(salesData)\n\n    // Display results\n    fmt.Println(\"Top 5 Regions by Revenue:\")\n    for region := range topRegions {\n        name := ssql.GetOr(region, \"region\", \"\")\n        revenue := ssql.GetOr(region, \"total_revenue\", 0.0)\n        count := ssql.GetOr(region, \"sale_count\", int64(0))\n        
fmt.Printf(\"%s: $%.2f (%d sales)\\n\", name, revenue, count)\n    }\n}\n```\n\n\u003c/details\u003e\n\n### **Real-Time Stream Processing**\n\n**Quick view:**\n```go\n// Process sensor data in 5-minute windows\nwindowed := ssql.TimeWindow[ssql.Record](5*time.Minute, \"timestamp\")(sensorStream)\nfor window := range windowed {\n    // Analyze each time window\n}\n```\n\n\u003cdetails\u003e\n\u003csummary\u003e📋 \u003cb\u003eClick for complete, runnable code\u003c/b\u003e\u003c/summary\u003e\n\n```go\npackage main\n\nimport (\n    \"fmt\"\n    \"log\"\n    \"time\"\n    \"github.com/rosscartlidge/ssql/v4\"\n)\n\nfunc main() {\n    // Read sensor data\n    sensorStream, err := ssql.ReadCSV(\"sensor_data.csv\")\n    if err != nil {\n        log.Fatal(err)\n    }\n\n    // Process sensor data in 5-minute windows\n    windowed := ssql.TimeWindow[ssql.Record](5*time.Minute, \"timestamp\")(sensorStream)\n\n    fmt.Println(\"Processing 5-minute windows:\")\n    for window := range windowed {\n        // Analyze each time window\n        count := len(window)\n\n        // Calculate average temperature\n        var totalTemp float64\n        for _, record := range window {\n            temp := ssql.GetOr(record, \"temperature\", 0.0)\n            totalTemp += temp\n        }\n        avgTemp := totalTemp / float64(count)\n\n        fmt.Printf(\"Window: %d readings, avg temp: %.2f°C\\n\", count, avgTemp)\n    }\n}\n```\n\n\u003c/details\u003e\n\n### **Interactive Dashboards**\n\n**Quick view:**\n```go\nconfig := ssql.DefaultChartConfig()\nconfig.Title = \"Sales Dashboard\"\nconfig.ChartType = \"line\"\nssql.InteractiveChart(data, \"dashboard.html\", config)\n```\n\n\u003cdetails\u003e\n\u003csummary\u003e📋 \u003cb\u003eClick for complete, runnable code\u003c/b\u003e\u003c/summary\u003e\n\n```go\npackage main\n\nimport (\n    \"log\"\n    \"slices\"\n    \"github.com/rosscartlidge/ssql/v4\"\n)\n\nfunc main() {\n    // Create sample sales data\n    salesData := []ssql.Record{\n  
      ssql.MakeMutableRecord().String(\"month\", \"Jan\").Float(\"revenue\", 120000).Freeze(),\n        ssql.MakeMutableRecord().String(\"month\", \"Feb\").Float(\"revenue\", 135000).Freeze(),\n        ssql.MakeMutableRecord().String(\"month\", \"Mar\").Float(\"revenue\", 145000).Freeze(),\n        ssql.MakeMutableRecord().String(\"month\", \"Apr\").Float(\"revenue\", 132000).Freeze(),\n    }\n\n    data := slices.Values(salesData)\n\n    // Create interactive dashboard\n    config := ssql.DefaultChartConfig()\n    config.Title = \"Sales Dashboard\"\n    config.ChartType = \"line\"\n    config.Width = 1200\n    config.Height = 600\n    config.EnableZoom = true\n    config.EnablePan = true\n\n    err := ssql.InteractiveChart(data, \"dashboard.html\", config)\n    if err != nil {\n        log.Fatalf(\"Failed to create chart: %v\", err)\n    }\n\n    log.Println(\"Dashboard created: dashboard.html\")\n}\n```\n\n\u003c/details\u003e\n\n### **Signal Processing**\n\n**Quick view:**\n```go\n// FFT analysis, filtering, and reconstruction\nspectrum, _ := ssql.FFTWithPhase(signal)\nreconstructed, _ := ssql.IFFT(spectrum.Magnitude, spectrum.Phase)\nsmoothed, _ := ssql.Convolve(signal, ssql.GaussianKernel(11, 2.0))\ncorr, _ := ssql.Correlate(signal1, signal2)  // Find pattern matches\n```\n\n\u003cdetails\u003e\n\u003csummary\u003e📋 \u003cb\u003eClick for complete, runnable code\u003c/b\u003e\u003c/summary\u003e\n\n```go\npackage main\n\nimport (\n    \"fmt\"\n    \"math\"\n    \"github.com/rosscartlidge/ssql/v4\"\n)\n\nfunc main() {\n    // Create sample signal: 10Hz + 25Hz sine waves\n    sampleRate := 100.0 // 100 samples per second\n    signal := make(ssql.Signal, 256)\n    for i := range signal {\n        t := float64(i) / sampleRate\n        signal[i] = math.Sin(2*math.Pi*10*t) + 0.5*math.Sin(2*math.Pi*25*t)\n    }\n\n    // FFT to find frequency components\n    spectrum, err := ssql.FFT(signal)\n    if err != nil {\n        panic(err)\n    }\n\n    // Find peak 
frequencies\n    fmt.Println(\"Top frequencies:\")\n    for i, mag := range spectrum.Magnitude {\n        if mag \u003e 50 { // Threshold for significant peaks\n            freq := spectrum.FrequencyBin(i, sampleRate)\n            fmt.Printf(\"  %.1f Hz: magnitude %.1f\\n\", freq, mag)\n        }\n    }\n\n    // Smooth with Gaussian kernel\n    smoothed, err := ssql.ConvolveSame(signal, ssql.GaussianKernel(11, 2.0))\n    if err != nil {\n        panic(err)\n    }\n    fmt.Printf(\"\\nSmoothed signal: %d points\\n\", len(smoothed))\n}\n```\n\n\u003c/details\u003e\n\n**CLI Usage:**\n```bash\n# FFT analysis\nssql from audio.csv | ssql fft -field amplitude -rate 44100 | ssql to table\n\n# Inverse FFT for signal reconstruction\nssql from spectrum.csv | ssql ifft -magnitude mag -phase phase | ssql to csv filtered.csv\n\n# Smoothing with convolution\nssql from sensor.csv | ssql convolve -field reading -kernel gaussian -size 11 -same\n\n# Cross-correlation to find patterns\nssql from signal.csv | ssql correlate -field reading -with template.csv\n```\n\n**Features:**\n- **FFT/IFFT** - Forward and inverse FFT for frequency analysis and signal reconstruction\n- **Convolution** - Signal filtering with built-in kernels (avg, gaussian, diff, laplacian, sobel)\n- **Correlation** - Cross-correlation and autocorrelation for pattern matching\n- **Pipeline Integration** - Works with ssql's record-based pipelines\n- **Works everywhere** - CPU implementations included, no special setup required\n\n**GPU Acceleration (optional):**\nSignal processing works out of the box using CPU. For large datasets, optional CUDA GPU acceleration provides 10-100x speedup. 
See [GPU installation instructions](#option-5-gpu-acceleration-optional) for setup via Docker (recommended) or local CUDA toolkit.\n\nGPU is used automatically when available for FFT \u003e= 1024 points or convolution kernels \u003e= 64 points.\n\n### **Data Integration**\n\n**Quick view:**\n```go\n// Join customer and order data\ncustomerOrders := ssql.InnerJoin(\n    orderStream,\n    ssql.OnFields(\"customer_id\")\n)(customerStream)\n```\n\n\u003cdetails\u003e\n\u003csummary\u003e📋 \u003cb\u003eClick for complete, runnable code\u003c/b\u003e\u003c/summary\u003e\n\n```go\npackage main\n\nimport (\n    \"fmt\"\n    \"log\"\n    \"github.com/rosscartlidge/ssql/v4\"\n)\n\nfunc main() {\n    // Read customer data\n    customerStream, err := ssql.ReadCSV(\"customers.csv\")\n    if err != nil {\n        log.Fatal(err)\n    }\n\n    // Read order data\n    orderStream, err := ssql.ReadCSV(\"orders.csv\")\n    if err != nil {\n        log.Fatal(err)\n    }\n\n    // Join customer and order data\n    customerOrders := ssql.InnerJoin(\n        orderStream,\n        ssql.OnFields(\"customer_id\"),\n    )(customerStream)\n\n    // Display joined results\n    fmt.Println(\"Customer Orders:\")\n    for record := range customerOrders {\n        custName := ssql.GetOr(record, \"customer_name\", \"\")\n        orderID := ssql.GetOr(record, \"order_id\", \"\")\n        amount := ssql.GetOr(record, \"amount\", 0.0)\n        fmt.Printf(\"%s - Order %s: $%.2f\\n\", custName, orderID, amount)\n    }\n}\n```\n\n\u003c/details\u003e\n\n### **Distributed Processing**\n\n**Quick view:**\n```bash\n# Read a remote file via SSH with push-down filtering\nssql from ssh myserver /data/events.csv -- where -if status eq error | ssql to table\n\n# Read multiple shards from a catalog CSV with partition pruning\nssql from catalog shards.csv -if date ge 2025-03-01 | ssql group-by service -count n\n```\n\n\u003cdetails\u003e\n\u003csummary\u003eClick for more 
examples\u003c/summary\u003e\n\n```bash\n# Multi-step push-down: filter and aggregate on each remote shard\nssql from ssh myserver /data/events.csv \\\n  -- where -if status ge 400 + group-by service -count cnt | \\\n  ssql to table\n\n# Catalog with range pruning and two-level aggregation\nssql from catalog shards.csv -if date ge 2025-02-01 \\\n  -- where -if status ge 400 + group-by service -count cnt | \\\n  ssql group-by service -sum cnt total_errors | \\\n  ssql to table\n\n# Add provenance to track which shard each record came from\nssql from catalog shards.csv -shard-field _shard | ssql to table\n\n# Use ssql_gpu on remote hosts\nssql from ssh myserver /data/events.csv -gpu | ssql to table\n```\n\n**Features:**\n- **`from ssh`** - Read remote files via SSH, push-down filters to reduce transfer\n- **`from catalog`** - Read multiple shards from a catalog CSV mapping hosts to file paths\n- **Partition pruning** - Skip irrelevant shards using range (`X_from`/`X_to`) or exact-value metadata\n- **Push-down** - Send filter and aggregation stages to remote hosts with `--` separator\n- **Local shards** - Catalog entries with `host=local` or `host=localhost` are read directly\n- **Code generation** - `from ssh` supports `-generate` / `SSQLGO=1`\n- **Pipeline optimizer** - `generate ssql` automatically pushes filters into SSH/catalog, collapses sort+limit to top, prunes Parquet columns, and more (12 optimization rules)\n\n\u003c/details\u003e\n\n### **Expression Support** ⚡\n\n**Quick view:**\n```bash\n# Calculate derived fields with expressions\nssql update -set-expr total 'price * qty'\nssql update -set-expr tier 'revenue \u003e 10000 ? 
### **Expression Support** ⚡

**Quick view:**
```bash
# Calculate derived fields with expressions
ssql update -set-expr total 'price * qty'
ssql update -set-expr tier 'revenue > 10000 ? "gold" : "silver"'

# Complex filtering with boolean expressions
ssql where -expr 'age >= 18 and status == "active"'
```

<details>
<summary>📋 <b>Click for complete, runnable code and features</b></summary>

ssql supports powerful expression evaluation for computed fields and complex filters, using the [expr-lang](https://expr-lang.org/) library.

**CLI Examples:**
```bash
# Calculated fields
echo 'name,price,qty
Widget,10.50,3
Gadget,25.00,2' | ssql from | \
  ssql update -set-expr total 'price * qty' | \
  ssql update -set-expr discount 'total > 50 ? total * 0.1 : 0'

# Complex filtering
echo 'name,age,email,status
Alice,30,alice@example.com,active
Bob,17,bob@example.com,pending
Carol,25,carol@example.com,active' | ssql from | \
  ssql where -expr 'age >= 18 and status == "active" and has("email")'

# String manipulation
echo 'email
  ALICE@EXAMPLE.COM
bob@test.com' | ssql from | \
  ssql update -set-expr email 'lower(trim(email))'
```

**Library Examples:**
```go
package main

import (
    "fmt"
    "log"

    "github.com/rosscartlidge/ssql/v4"
    "github.com/rosscartlidge/ssql/v4/cmd/ssql/lib/runtime"
)

func main() {
    // Read sales data
    sales, err := ssql.ReadCSV("sales.csv")
    if err != nil {
        log.Fatal(err)
    }

    // Compile the expression once, outside the per-record loop
    calcTotal := runtime.MustCompileExpr("price * qty")

    // Apply it to every record
    updated := ssql.Update(func(mut ssql.MutableRecord) ssql.MutableRecord {
        frozen := mut.Freeze()
        result, _ := calcTotal(frozen)
        if total, ok := result.(float64); ok {
            return mut.Float("total", total)
        }
        return mut
    })(sales)

    // Process the results
    for record := range updated {
        total := ssql.GetOr(record, "total", 0.0)
        fmt.Printf("Total: $%.2f\n", total)
    }
}
```
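The `MustCompileExpr` call above follows the compile-once, evaluate-many pattern: parsing and type-checking happen a single time, and the returned function is cheap to call per record. The shape of that pattern can be sketched with a plain closure (a hand-rolled stand-in for illustration only, not expr-lang or ssql's API):

```go
package main

import "fmt"

// row is a hypothetical flat record type, used only for this sketch.
type row map[string]float64

// compileTotal stands in for the expensive one-time step (parse and
// type-check); the returned closure is the cheap per-record evaluation.
func compileTotal() func(row) float64 {
	return func(r row) float64 {
		return r["price"] * r["qty"]
	}
}

func main() {
	calc := compileTotal() // compile once, before the stream loop
	for _, r := range []row{{"price": 10.50, "qty": 3}, {"price": 25.00, "qty": 2}} {
		fmt.Printf("Total: $%.2f\n", calc(r)) // evaluate many
	}
}
```

Keeping compilation out of the loop is what makes the ~100µs compile cost negligible against ~1-2µs per-record evaluation.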
**Features:**
- **30+ built-in functions** - Math (`round`, `abs`, `min`, `max`), string (`upper`, `lower`, `trim`, `split`), array (`filter`, `map`, `sum`), and type conversion
- **All operators** - Arithmetic (`+`, `-`, `*`, `/`, `%`, `**`), comparison (`==`, `!=`, `<`, `>`, `<=`, `>=`), logical (`and`, `or`, `not`)
- **Advanced syntax** - Ternary operator (`? :`), nil coalescing (`??`), membership (`in`), pipe (`|`)
- **Helper functions** - `has(field)` checks field existence; `getOr(field, default)` gives safe access with a default
- **High performance** - Compile once (~100µs), evaluate many (~1-2µs per record)
- **Type safety** - Boolean expressions are type-checked at compile time
- **Code generation** - Expressions are pre-compiled in generated Go programs

**Use Cases:**
- **Data validation** - `where -expr 'age >= 0 and age <= 120 and has("email")'`
- **Data cleaning** - `update -set-expr email 'lower(trim(email))'`
- **Calculations** - `update -set-expr total 'round(price * qty * (1 - discount / 100))'`
- **Categorization** - `update -set-expr tier 'revenue > 10000 ? "gold" : "silver"'`
- **Complex filters** - `where -expr '(age >= 18 and status == "active") or role == "admin"'`
**Performance:**
```bash
# CLI execution (~1ms overhead for 1M records)
ssql from huge.csv | ssql where -expr 'price * qty > 1000'

# Code generation (10-100x faster, zero compilation overhead)
export SSQLGO=1
ssql from huge.csv | \
  ssql where -expr 'price * qty > 1000' | \
  ssql update -set-expr total 'price * qty' | \
  ssql generate go > optimized.go
go run optimized.go
```

**Full documentation:** [Expression Language Reference](doc/EXPRESSIONS.md)

</details>

## 🎨 Try the Examples

Run these to see ssql in action:

```bash
# Interactive chart showcase
go run examples/chart_demo.go

# Data analysis pipeline
go run examples/functional_example.go

# Real-time processing
go run examples/early_termination_example.go
```

## 🌟 Why Choose ssql?

- **🎯 Simple API** - If you know SQL, you know ssql
- **🔒 Type Safe** - Go generics catch errors at compile time
- **📊 Visual** - Create charts as easily as processing data
- **🤖 AI Ready** - Generate code from descriptions
- **⚡ Performance** - Lazy evaluation and memory efficiency
- **🔄 Composable** - Build complex pipelines from simple operations
- **🔍 Debuggable** - JSONL streaming works with jq and Unix tools

## 🎯 Perfect For

- **Data Scientists** - Analyze CSV/JSON/Arrow/XLSX files with ease
- **DevOps Engineers** - Monitor systems and create dashboards
- **Business Analysts** - Generate reports and visualizations
- **Developers** - Build ETL pipelines and data processing tools
- **Anyone** - who wants to turn data descriptions into working code
## 🚀 What's Next?

1. **[Install ssql](#installation)** and try the quick start
2. **[Try the CLI](doc/cli-codelab.md)** for rapid prototyping *(in development)*
3. **[Follow the Getting Started Guide](doc/codelab-intro.md)** for library fundamentals
4. **[Try the AI Assistant](doc/ai-human-guide.md)** for code generation
5. **[Explore Advanced Patterns](doc/advanced-tutorial.md)** for production use

## 📚 Documentation

**[All documentation →](doc/README.md)** | **[Research & design docs →](doc/research/README.md)**

- **[CLI Tutorial](doc/cli-codelab.md)** - Complete command-line guide
- **[API Reference](doc/api-reference.md)** - Go library documentation
- **[Typed Codelab](doc/typed-codelab.md)** - Hands-on tutorial for the `ssql/typed` package
- **[Typed Reference](doc/typed-reference.md)** - `ssql/typed` high-performance struct API (15× faster, 34× less memory)
- **[Debugging Pipelines](doc/cli-debugging.md)** - Debug with jq, inspect data, profile performance
- **[Troubleshooting Guide](doc/cli-troubleshooting.md)** - Common issues and quick solutions
- **[AI Code Generation](doc/ai-human-guide.md)** - Natural language to code

## 🤝 Community

ssql is production-ready and actively maintained. Questions, issues, and contributions are welcome!

- 📖 **Documentation**: Complete guides and API reference
- 🤖 **AI Integration**: Generate code from natural language
- 📊 **Visualization**: Interactive charts and dashboards
- 🔧 **Examples**: Real-world usage patterns
- 🔍 **Debugging**: jq integration for pipeline inspection

---

**Ready to transform how you process data?** [Get started now →](doc/codelab-intro.md)

*ssql: Where data processing meets AI-powered development* ✨