{"id":15111461,"url":"https://github.com/clflushopt/eocene","last_synced_at":"2026-02-19T04:31:35.485Z","repository":{"id":254531094,"uuid":"846799024","full_name":"clflushopt/eocene","owner":"clflushopt","description":"Demo of the Volcano iterator model in Rust","archived":false,"fork":false,"pushed_at":"2024-08-25T20:07:59.000Z","size":29,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-10-10T22:46:39.190Z","etag":null,"topics":["databases","query-engine","query-execution-plan","volcano-model"],"latest_commit_sha":null,"homepage":"","language":"Rust","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/clflushopt.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-08-24T01:52:33.000Z","updated_at":"2025-02-02T20:26:31.000Z","dependencies_parsed_at":"2024-08-24T05:23:49.755Z","dependency_job_id":"6bd043a4-c9b4-4c94-bcfd-97f6c6885748","html_url":"https://github.com/clflushopt/eocene","commit_stats":{"total_commits":7,"total_committers":1,"mean_commits":7.0,"dds":0.0,"last_synced_commit":"d386df5db48ca4945109af92f51805e409a35e44"},"previous_names":["clflushopt/eocene"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/clflushopt/eocene","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/clflushopt%2Feocene","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/clflushopt%2Feocene/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/clflushopt%2Feocene/releases","
manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/clflushopt%2Feocene/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/clflushopt","download_url":"https://codeload.github.com/clflushopt/eocene/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/clflushopt%2Feocene/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":29603053,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-02-19T04:01:40.476Z","status":"ssl_error","status_checked_at":"2026-02-19T04:01:12.960Z","response_time":117,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["databases","query-engine","query-execution-plan","volcano-model"],"created_at":"2024-09-26T00:20:22.314Z","updated_at":"2026-02-19T04:31:35.468Z","avatar_url":"https://github.com/clflushopt.png","language":"Rust","readme":"# Minimally viable query engine with the Volcano model\n\nThis is an implementation of a minimal query engine capable of executing\na subset of your usual SQL operators by following the Volcano model.\n\nThe Volcano model, often described as *the classical iterator model* and\nfirst presented in [Volcano - An Extensible and Parallel Query Evaluation System](https://dl.acm.org/doi/10.1109/69.273032),\nis a pipelined execution model that describes query execution as a pipeline\nof pull-based operators, 
where each operator *pulls* rows from its input operator by\ncalling a `next() -\u003e Row` method.\nWith this uniform interface for all operators, Volcano effectively decouples\ninputs from operators.\n\nThe core idea is described beautifully in the `Query Processing` section of\nthe original paper:\n\n```\nIn Volcano, all algebra operators are implemented as iterators i.e. they support\na simple open-next-close protocol.\n\nBasically, iterators provide the iteration component of a loop, i.e. initialization,\nincrement, loop termination condition, and final housekeeping.\n```\n\nAdrian Colyer has a well-written article that summarizes the key points of\nthe original paper on his blog [the morning paper](https://blog.acolyer.org/2015/02/11/encapsulation-of-parallelism-in-the-volcano-query-processing-system/).\n\nThe pull-based, or iterator-based, model is not without issues. Neumann argues in [Efficiently Compiling Efficient Query Plans\nfor Modern Hardware](https://www.vldb.org/pvldb/vol4/p539-neumann.pdf) that\nwhile the iterator model simplifies analysis and implementation, its clean\ninterface comes at the cost of performance.\n\nThe case for mechanical sympathy becomes clear when processing millions of rows:\nevery pull on an operator incurs a function call, either via dynamic dispatch\nor through a function pointer in a table. That overhead compounds across all\nof those rows, especially once branch mispredictions are taken into account.\n\nFor demonstration purposes, I experimented with code generation through\na small pass that compiles each operator into assembly using a runtime\nassembler for x86. 
The code, which is largely non-functional, can be found in [PR #1](https://github.com/clflushopt/eocene/pull/1).\n\n# Example\n\nThe code implements a small query engine with a SQL tokenizer and parser capable\nof representing very simple queries. The AST is then passed to the query engine,\nwhich creates a query plan in the form of a pipeline of operators and executes\nit.\n\nPlease note that the code is not thoroughly tested; most tests act only as sanity checks that\nthe base logic is fine. There is no error handling and no consideration for edge cases.\n\nWe currently implement the following operators:\n\n* Scan operator, which is the starting point of the pipeline.\n* Projection operator, which selects specific columns from each row.\n* Filter operator, which runs predicates on rows and returns only the ones that satisfy\n  the predicate.\n* Sort operator, which returns rows in sorted order.\n* Join operator, which implements *Nested Loop Join*.\n* Limit operator, which sets a cut-off on the number of returned rows.\n\nBelow is the code in `main.rs`, which runs some select queries.\n\n``` rust\n\nmacro_rules! 
query {\n    ($query_str:expr, $data:expr) =\u003e {{\n        let tokenizer = Tokenizer::new($query_str);\n        let q = Parser::new(tokenizer).parse();\n\n        let mut executor = QueryExecutor {};\n        let plan = executor.plan(q, $data);\n        QueryExecutor::execute(plan)\n    }};\n}\n\nfn main() {\n    // Example data\n    let data = vec![\n        Row::new(\u0026[\n            \"1\".to_string(),\n            \"Alice\".to_string(),\n            \"Manager\".to_string(),\n            \"12000\".to_string(),\n        ]),\n        Row::new(\u0026[\n            \"2\".to_string(),\n            \"Bob\".to_string(),\n            \"Developer\".to_string(),\n            \"10000\".to_string(),\n        ]),\n        Row::new(\u0026[\n            \"3\".to_string(),\n            \"Charlie\".to_string(),\n            \"Developer\".to_string(),\n            \"9000\".to_string(),\n        ]),\n        Row::new(\u0026[\n            \"4\".to_string(),\n            \"David\".to_string(),\n            \"Analyst\".to_string(),\n            \"11000\".to_string(),\n        ]),\n        Row::new(\u0026[\n            \"5\".to_string(),\n            \"Eve\".to_string(),\n            \"Manager\".to_string(),\n            \"13000\".to_string(),\n        ]),\n        Row::new(\u0026[\n            \"6\".to_string(),\n            \"Frank\".to_string(),\n            \"Developer\".to_string(),\n            \"9500\".to_string(),\n        ]),\n        Row::new(\u0026[\n            \"7\".to_string(),\n            \"Grace\".to_string(),\n            \"Analyst\".to_string(),\n            \"10500\".to_string(),\n        ]),\n        Row::new(\u0026[\n            \"8\".to_string(),\n            \"Hannah\".to_string(),\n            \"Developer\".to_string(),\n            \"9800\".to_string(),\n        ]),\n        Row::new(\u0026[\n            \"9\".to_string(),\n            \"Ivy\".to_string(),\n            \"Manager\".to_string(),\n            \"12500\".to_string(),\n        ]),\n        
Row::new(\u0026[\n            \"10\".to_string(),\n            \"Jack\".to_string(),\n            \"Analyst\".to_string(),\n            \"10200\".to_string(),\n        ]),\n    ];\n\n    let queries = vec![\n        (\n            \"SELECT id FROM example WHERE name = 'Ivy' LIMIT 1\",\n            vec![Row::new(\u0026[\"9\".to_string()])],\n        ),\n        (\n            \"SELECT name FROM employees WHERE role = 'Developer'\",\n            vec![\n                Row::new(\u0026[\"Bob\".to_string()]),\n                Row::new(\u0026[\"Charlie\".to_string()]),\n                Row::new(\u0026[\"Frank\".to_string()]),\n                Row::new(\u0026[\"Hannah\".to_string()]),\n            ],\n        ),\n        (\n            \"SELECT id FROM employees WHERE salary \u003e 9000 LIMIT 3\",\n            vec![\n                Row::new(\u0026[\"1\".to_string()]),\n                Row::new(\u0026[\"2\".to_string()]),\n                Row::new(\u0026[\"4\".to_string()]),\n            ],\n        ),\n    ];\n    for query in queries {\n        let results = query!(query.0, data.clone());\n        let expected = query.1;\n        assert_eq!(results, expected);\n    }\n}\n\n```\n\n\n# License\n\nThe code is under an [MIT License](LICENSE).\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fclflushopt%2Feocene","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fclflushopt%2Feocene","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fclflushopt%2Feocene/lists"}