{"id":22677924,"url":"https://github.com/puchaczov/musoq","last_synced_at":"2026-03-13T21:04:01.421Z","repository":{"id":28819818,"uuid":"119426213","full_name":"Puchaczov/Musoq","owner":"Puchaczov","description":"SQL Syntax without any database","archived":false,"fork":false,"pushed_at":"2025-05-07T15:29:00.000Z","size":16482,"stargazers_count":481,"open_issues_count":0,"forks_count":21,"subscribers_count":8,"default_branch":"master","last_synced_at":"2025-05-16T14:06:28.919Z","etag":null,"topics":["ai-assisted-queries","cross-platform","csharp","csv","data-analysis-sql","data-exploration","data-processing","dotnet","dotnet-core","dotnetcore","file-system","plugin-architecture","query-language","sql","text-processing"],"latest_commit_sha":null,"homepage":"https://puchaczov.github.io/Musoq/","language":"C#","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Puchaczov.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2018-01-29T18:55:08.000Z","updated_at":"2025-05-07T15:29:05.000Z","dependencies_parsed_at":"2024-02-04T00:22:46.303Z","dependency_job_id":"697cda13-e2a7-4157-941a-6ff6a44520f9","html_url":"https://github.com/Puchaczov/Musoq","commit_stats":{"total_commits":366,"total_committers":7,"mean_commits":"52.285714285714285","dds":"0.29508196721311475","last_synced_commit":"f8eeed7e5d8b78208fc6396d40f9700dfe536e44"},"previous_names":[],"tags_count":1,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Puchaczov%2FMusoq","tags_url":"https
://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Puchaczov%2FMusoq/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Puchaczov%2FMusoq/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Puchaczov%2FMusoq/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Puchaczov","download_url":"https://codeload.github.com/Puchaczov/Musoq/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":254544146,"owners_count":22088807,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ai-assisted-queries","cross-platform","csharp","csv","data-analysis-sql","data-exploration","data-processing","dotnet","dotnet-core","dotnetcore","file-system","plugin-architecture","query-language","sql","text-processing"],"created_at":"2024-12-09T18:13:12.966Z","updated_at":"2026-03-10T00:06:44.324Z","avatar_url":"https://github.com/Puchaczov.png","language":"C#","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003c!-- \n(Best practice: if you have a logo, place it here centered)\n\u003cp align=\"center\"\u003e\n  \u003cimg src=\"images/musoq-logo.png\" alt=\"Musoq Logo\" width=\"200\"/\u003e\n\u003c/p\u003e \n--\u003e\n\n```text\n  ███╗   ███╗██╗   ██╗███████╗ ██████╗  ██████╗ \n  ████╗ ████║██║   ██║██╔════╝██╔═══██╗██╔═══██╗\n  ██╔████╔██║██║   ██║███████╗██║   ██║██║   ██║\n  ██║╚██╔╝██║██║   ██║╚════██║██║   ██║██║▄▄ ██║\n  ██║ ╚═╝ ██║╚██████╔╝███████║╚██████╔╝╚██████╔╝\n  ╚═╝     ╚═╝ ╚═════╝ ╚══════╝ ╚═════╝  ╚══▀▀═╝ \n        SQL 
Superpowers for Developers\n```\n\n# Musoq: SQL Superpowers for Developers\n\n[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)\n[![Maintenance](https://img.shields.io/badge/Maintained%3F-yes-green.svg)](https://github.com/Puchaczov/Musoq/graphs/code-frequency)\n[![Nuget](https://img.shields.io/badge/Nuget%3F-yes-green.svg)](https://www.nuget.org/packages?q=musoq)\n![Tests](https://raw.githubusercontent.com/puchaczov/musoq/badges/docs/assets/tests-badge.svg)\n\n\n**Ad-hoc SQL queries against files, logs, processes, and more — with zero data ingestion or intermediate storage.**\n\n\u003cp align=\"center\"\u003e\n  \u003cimg src=\"images/musoq-demo.gif\" alt=\"Musoq CLI animated demo showing querying OS files, CSV data, and Git history\" width=\"800\"/\u003e\n\u003c/p\u003e\n\nMusoq is built for **ad-hoc querying and investigation** — the moments when you want to ask a quick question about data that isn't already in a database: a log file, a git history, a binary dump, a CSV, a running process. The kind of question where writing a script feels like too much overhead, but `grep` alone isn't enough.\n\nInstead of a script, you write a query.\n\n*Musoq extends standard SQL with declarative inline **text parsing**, **binary decoding**, and cross-source data joins — defined directly inside the query.*\n\n📚 **[Read the Full Documentation](https://github.com/Puchaczov/Musoq/wiki)** *(Or run `Musoq --help` in your terminal)*\n\n## Table of Contents\n\n- [The Motivation: Bash vs. 
SQL](#-the-motivation-bash-vs-sql)\n- [Quick Start \u0026 Installation](#-quick-start--installation)\n- [Beyond Standard SQL](#-beyond-standard-sql)\n  - [Inline Binary Decoding](#1-inline-binary-decoding-binary-schemas)\n  - [Declarative Text Log Parsing](#2-declarative-text-log-parsing-text-schemas)\n  - [Strong Typing for Dynamic Data](#3-strong-typing-for-dynamic-data-table--couple)\n- [The Developer Toolbox](#-the-developer-toolbox-supporting-ad-hoc-workflows)\n- [How Musoq Fits in the Ecosystem](#-how-does-musoq-fit-into-the-sql-tooling-ecosystem)\n- [A Universe of Data Sources](#-available-data-sources)\n- [Ecosystem Architecture](#-the-musoq-ecosystem)\n\n---\n\n## 💡 The Motivation: Bash vs. SQL\n\nInstead of maintaining a fragile chain of Bash commands:\n```bash\nfind . -name \"*.js\" -exec wc -l {} \\; | awk '{sum+=$1} END {print sum}'\n```\n\nWrite declarative, readable SQL:\n```sql\nselect Sum(Length(f.GetFileContent())) as TotalLines\nfrom os.files('.', true) f\nwhere f.Extension = '.js'\n```\n\n---\n\n## 🚀 Quick Start \u0026 Installation\n\nTo execute Musoq queries locally, you need the CLI application. Since Musoq by itself is just the engine, the CLI and server handle compiling your query and returning formatted results (as tables, JSON, CSV, YAML, etc.).\n\n### 1. Install CLI\n*(no additional dependencies required)*\n\n**PowerShell (Windows)**\n```powershell\nirm https://raw.githubusercontent.com/Puchaczov/Musoq.CLI/refs/heads/main/scripts/powershell/install.ps1 | iex\n```\n\n**Shell using curl (Linux / macOS)**\n```shell\ncurl -fsSL https://raw.githubusercontent.com/Puchaczov/Musoq.CLI/refs/heads/main/scripts/bash/install.sh | sudo bash\n```\n\n*(Prefer a manual install? Download the standalone binary from our [Releases](https://github.com/Puchaczov/Musoq.CLI/releases) page.)*\n\n### 2. Install Data Sources\nMusoq is highly modular. 
You install data sources via the built-in registry to unlock new tables and schemas.\n\n```bash\nMusoq datasource install os --registry\nMusoq datasource install git --registry\nMusoq datasource install separatedvalues --registry\n```\n\n### 3. Run your first queries\nOpen a terminal, start a background server, and fire away!\n```bash\n# 1. Start the local agent server\nMusoq serve\n\n# 2. Who is consuming all the space?\nMusoq run \"select Name, Length from os.files('/home', true) order by Length desc take 10\"\n\n# 3. Look at recent commits\nMusoq run \"select c.Sha, c.Message, c.Author from git.repository('.') r cross apply r.Commits c\"\n\n# 4. Stop the server when done\nMusoq quit\n```\n\n---\n\n## ✨ Beyond Standard SQL\n\nMusoq doesn't just read tables; it **understands raw data formats inline**. You don't need a custom plugin to query weird file formats if you can describe them.\n\n*Querying standard CSVs and JSON is easy, but Musoq's real power is understanding raw data formats...*\n\n### 1. Inline Binary Decoding (`binary` schemas)\nReading a custom binary file usually means opening a hex editor or writing a C# `BinaryReader` wrapper. With Musoq, you declare the binary layout right above your query:\n\n```sql\n-- Declare your binary struct right in the script!\nbinary GameSaveHeader {\n    Magic:    int le,\n    Version:  short le,\n    PlayerId: byte[16],\n    Score:    int le\n}\n\n-- Query the raw bytes from the file using the declaration\nselect \n    h.Version, \n    ToHex(h.PlayerId) as UID, \n    h.Score \nfrom os.file('/saves/save1.dat') f\ncross apply Interpret(f.GetBytes(), GameSaveHeader) h\nwhere h.Magic = 0x454D4147 -- bytes 'GAME' on disk, read as a little-endian int\n```\n\n### 2. Declarative Text Log Parsing (`text` schemas)\nParsing a badly formatted application log without Musoq usually means chaining regex patterns that are hard to read and harder to maintain. 
Musoq lets you describe the structure inline instead:\n\n```sql\n-- Describe what the log looks like inline\ntext LogEntry {\n    Timestamp: between '[' ']',\n    _:         literal ' ',\n    Level:     until ':',\n    _:         literal ': ',\n    Message:   rest\n}\n\n-- Stream it, parse it, query it!\nselect log.Timestamp, log.Level, log.Message\nfrom os.file('/var/logs/app.log') f\ncross apply Lines(f.GetContent()) line\ncross apply Parse(line.Value, LogEntry) log\nwhere log.Level = 'ERROR'\n```\n\n### 3. Strong Typing for Dynamic Data (`table` \u0026 `couple`)\nCSV, JSON, and LLMs often return untyped string data. Musoq lets you define strong types and \"couple\" them with dynamic datasources to enforce sanity:\n\n```sql\ntable Receipt {\n    Shop: string,\n    ProductName: string,\n    Price: decimal\n};\n\n-- Bind untyped AI-vision extraction output to strict SQL Types\ncouple stdin.LlmExtractFromImage() with table Receipt as SourceOfReceipts;\n\nselect s.Shop, s.ProductName, s.Price \nfrom SourceOfReceipts('OpenAi', 'gpt-4o') s\nwhere s.Price \u003e 100.00\n```\n\n---\n\n## 🧰 The Developer Toolbox: Supporting Ad-Hoc Workflows\n\nThe CLI is designed around the ad-hoc investigation workflow — quickly reaching for data, shaping it, and moving on. Beyond one-liners, it also supports saving frequent queries as tools and exposing them to AI agents.\n\n### 1. First-Class `stdin` Piping\nYou don't always want to query files on disk. 
Musoq has native support for intercepting streamed `stdin` data and structuring it on the fly using zero-copy memory-mapped buffers:\n\n```bash\n# Query JSON output from other CLI tools instantly\nkubectl get pods -o json | musoq run \"select * from stdin.JsonFlat() where path like '%status.phase' and value = 'Running'\"\n\n# Apply regex directly to a live command stream\ncat app.log | musoq run \"select * from stdin.Regex('(?\u003ctimestamp\u003e.*?)\\\\s+(?\u003clevel\u003e.*?)\\\\s+(?\u003cmessage\u003e.*)') where level = 'ERROR'\"\n```\n\n### 2. Parameterized Tools\nInstead of retyping complex queries, you can save them as **Tools** using YAML and Scriban templates. \n\n```bash\n# Execute a saved tool with dynamic arguments\nMusoq tool execute search_commits --author \"John Doe\" --since \"2024-01-01\"\n```\n\n### 3. Native Model Context Protocol (MCP) Server\nBy enabling the built-in MCP server (`musoq set mcp-enabled true`), Musoq exposes your parameterized tools as callable functions to AI agents like Claude, Cursor, or GitHub Copilot. \n\nYou can create isolated \"Contexts\" so your AI assistant can safely query your active git history, search local file hierarchies, or parse your API responses using SQL, without writing any integration code.\n\n---\n\n## 🆚 How does Musoq fit into the SQL tooling ecosystem?\n\nThere are several excellent tools that allow you to use SQL outside of traditional databases. While they share a similar syntax, they are fundamentally designed to solve different classes of problems:\n\n| Tool | Primary Focus | Best Suited For |\n|---|---|---|\n| **DuckDB** | Analytical Workloads (OLAP) | Aggregating and analyzing large, structured datasets (Parquet, CSV, JSON) at extremely high speeds. |\n| **Steampipe** | Cloud Infrastructure | Querying cloud APIs (AWS, Azure, GitHub) as foreign tables for compliance, auditing, and DevSecOps. 
|\n| **osquery** | Endpoint Monitoring | Tracking the state, metrics, and security configurations across fleets of operating systems. |\n| **Musoq** | Ad-hoc Querying \u0026 Investigation | One-off queries, debugging sessions, and local investigations against files, logs, binary data, and `stdin` — without importing or storing anything. |\n\nWhile tools like DuckDB and Steampipe excel when data is already naturally structured or API-driven, Musoq is built for the investigative, exploratory side of development — when you don't know the shape of the data yet and you want to ask questions first. It gives you the primitives (inline `text` matchers, `binary` schemas, and AI `couple` statements) to define structure *during* the query, not before it.\n\nImportantly, **Musoq does not use an underlying database engine** (like SQLite or Postgres FDWs). There is no \"import\" step, no data ingestion, and no intermediate storage. Musoq is a pure runtime that streams and transforms data exactly where it resides—whether that's a file on disk, an API response, or `stdin`—and outputs the result directly.\n\n---\n\n## 🔌 Available Data Sources\n\nYou can query APIs, files, and services as logical tables using our growing library of [Musoq Data Sources](https://github.com/Puchaczov/Musoq.DataSources):\n\n- **Development**: C# Code Analysis (Roslyn), Git (tags, diffs, line history)\n- **Infrastructure**: Docker (containers, images, logs), Kubernetes, System OS\n- **Files**: JSON, CSV, Archives (Zip/Tar), Flat files\n- **AI \u0026 Integrations**: OpenAI/Ollama (Unstructured extractions!), Airtable, CANBus\n- **Databases**: Postgres, SQLite\n\n*(Tip: Just run `desc schema` or `desc schema.table(args)` inside Musoq to explore what is queryable.)*\n\n---\n\n## 🧩 The Musoq Ecosystem\n\nMusoq is highly modular and built with extensibility at its core. 
Here is how the components interact:\n\n```mermaid\nflowchart TD\n    User([User / Terminal]) --\u003e CLI\n    \n    subgraph Musoq.AgentLocal [Musoq Server \u0026 CLI ecosystem]\n        CLI[Musoq CLI]\n        LocalHost[(Local Server)]\n        CLI \u003c--\u003e|JSON / Pipes| LocalHost\n    end\n\n    subgraph Core [Engine]\n        Engine[Musoq Engine]\n        LocalHost --\u003e|Compiles Query| Engine\n    end\n    \n    subgraph Plugins [Musoq.DataSources]\n        DS_OS[OS Files]\n        DS_Git[Git Repos]\n        DS_AI[OpenAI / LLMs]\n        Engine --\u003e|Requests Data| DS_OS\n        Engine --\u003e|Requests Data| DS_Git\n        Engine --\u003e|Requests Data| DS_AI\n    end\n```\n\nThe ecosystem is divided into three key projects:\n\n1. **[Musoq](https://github.com/Puchaczov/Musoq)** (You are here): The core MIT-licensed engine: the SQL query language and its AST runtime, designed to be extended with new data sources.\n2. **[Musoq.DataSources](https://github.com/Puchaczov/Musoq.DataSources)**: The MIT-licensed repository containing all the plugins (Git, OS, Postgres, OpenAI, Archives).\n3. **[Musoq.CLI \u0026 Musoq.AgentLocal](https://github.com/Puchaczov/Musoq.CLI)**: A lightweight background server \u0026 CLI that executes Musoq queries locally. 
Not yet open sourced, but free to use.\n\n### Deep Dive: Engine Architecture\n\nWhen a query enters the core **Musoq Engine**, it goes through the following pipeline:\n\n```mermaid\nflowchart TD\n    SQL[/SQL Query String/] --\u003e Parser\n    \n    subgraph Engine [Core Engine Internal Pipeline]\n        direction TB\n        Parser[Lexer \u0026 Parser] --\u003e AST[Abstract Syntax Tree]\n        AST --\u003e Visitors[AST Visitors \u0026 Rewriters]\n        Visitors --\u003e Semantic[Type Inference \u0026 Semantic Analysis]\n        Semantic --\u003e Compiler[C# Code Generator \u0026 Compiler]\n        Compiler --\u003e Runtime[Execution Runtime VM]\n    end\n    \n    Registry[(Plugin / Schema Registry)] -.-\u003e|Injects types \u0026 methods| Semantic\n    Runtime \u003c--\u003e|Streams Data Row-by-Row| DataSource[(Data Source Plugin)]\n    Runtime ===\u003e Results[/Tabular Result Set/]\n```\n\n## 🤖 Extensibility \u0026 AI-Driven Agent Plugins\nYou can write C# or Python plugins manually, or point an AI agent at the plugin development guide and have it build one for you.\n\nWe provide a dedicated, self-contained guide designed explicitly for Autonomous Coding Agents (like GitHub Copilot, Cursor, or Claude) to build, test, package, and deploy complete .NET plugins without human intervention. Just point your agent at the docs and tell it what data source you want!\n\nCheck out the [🤖 Autonomous Plugin Development Guide (in Musoq.DataSources)](https://github.com/Puchaczov/Musoq.DataSources/blob/main/MusoqAutonomousPluginDevelopment.md) to bootstrap your first AI-generated plugin.\n\n---\n\n*\"Why write loops, when you can write queries?\"*\n\n---\n\n## 📜 License\n\nThis project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details. 
This means Musoq is free for both non-commercial and commercial use.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpuchaczov%2Fmusoq","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fpuchaczov%2Fmusoq","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpuchaczov%2Fmusoq/lists"}