{"id":13837775,"url":"https://github.com/tchajed/database-stream-processing-theory","last_synced_at":"2025-06-17T05:04:42.781Z","repository":{"id":104275098,"uuid":"531564589","full_name":"tchajed/database-stream-processing-theory","owner":"tchajed","description":"Formalization of DBSP","archived":false,"fork":false,"pushed_at":"2023-08-22T12:54:43.000Z","size":49,"stargazers_count":12,"open_issues_count":0,"forks_count":2,"subscribers_count":4,"default_branch":"main","last_synced_at":"2024-05-01T16:32:45.264Z","etag":null,"topics":["lean"],"latest_commit_sha":null,"homepage":"","language":"Lean","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/tchajed.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2022-09-01T14:52:07.000Z","updated_at":"2024-05-30T08:04:10.745Z","dependencies_parsed_at":null,"dependency_job_id":"89e3df2a-9e14-409c-9287-427b36bab68e","html_url":"https://github.com/tchajed/database-stream-processing-theory","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/tchajed/database-stream-processing-theory","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tchajed%2Fdatabase-stream-processing-theory","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tchajed%2Fdatabase-stream-processing-theory/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tchajed%2Fdatabase-stream-processing-theory/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/G
itHub/repositories/tchajed%2Fdatabase-stream-processing-theory/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/tchajed","download_url":"https://codeload.github.com/tchajed/database-stream-processing-theory/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tchajed%2Fdatabase-stream-processing-theory/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":260294455,"owners_count":22987622,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["lean"],"created_at":"2024-08-04T15:01:24.754Z","updated_at":"2025-06-17T05:04:42.754Z","avatar_url":"https://github.com/tchajed.png","language":"Lean","funding_links":[],"categories":["Lean"],"sub_categories":[],"readme":"# DBSP formalization\n\n[![CI](https://github.com/tchajed/database-stream-processing-theory/actions/workflows/lean_build.yml/badge.svg)](https://github.com/tchajed/database-stream-processing-theory/actions/workflows/lean_build.yml)\n\nLean formalization of the theory behind [DBSP](https://arxiv.org/abs/2203.16684), a\nlanguage for expressing incremental view maintenance for databases.\n\nDBSP can be divided into two parts: a general theory of operators over streams,\nand a specialization of that theory to implement relational algebra queries.\n\n## Defining the basic DBSP operators\n\n- [stream.lean](src/stream.lean) defines infinite streams over an arbitrary type `a` as `ℕ → a`.\n- [operators.lean](src/operators.lean) defines the notion of an operator (a\n  function between streams) and properties of operators 
(like causality,\n  strictness). It defines three core operators: the pointwise lifting of a\n  function, the delay operator `z⁻¹`, and a general fixpoint construction for\n  constructing a stream recursively.\n- [linear.lean](src/linear.lean) defines the differentiation and integration\n  operators for streams over an arbitrary group, and the associated property of\n  linearity.\n- [incremental.lean](src/incremental.lean) defines the core DBSP idea of the\n  incrementalization `Q^Δ` of an operator `Q`, defined as `D ∘ Q ∘ I`. It also\n  has proofs of an equational theory of incrementalization.\n- [circuits.lean](src/circuits.lean) defines a \"circuit\", which is a restricted\n  language for defining operators. We can define and prove correct a general\n  algorithm for incrementalizing and optimizing any circuit, and thus any operator\n  expressible as a circuit.\n\n## Relational algebra in DBSP\n\n- [zset.lean](src/zset.lean) defines Z-sets, a generalization of multisets used\n  to model relations and changes to relations. A Z-set `Z[A]` over a type `A` is\n  a function from `A` to integers `ℤ` with finite support: only finitely many\n  elements map to non-zero integers.\n- [relational.lean](src/relational.lean) defines versions of the basic\n  relational operators over Z-sets, and proves that they implement the set\n  versions of the operators in an appropriate sense. This includes the\n  DBSP-specific `distinct` operator, used to convert a Z-set to a set.\n- [relational_incremental.lean](src/relational_incremental.lean) proves some\n  rules for the incremental version of relational operators (perhaps most\n  interestingly, of the lifted distinct operator).\n- [relational_example.lean](src/relational_example.lean) is a self-contained\n  file that works through an example of optimizing a relational query and its\n  incremental version.\n- [stream_elim.lean](src/stream_elim.lean) defines stream introduction and\n  elimination functions `δ0` and `∫`. 
Stream elimination is complicated because\n  it is only computable for streams that are zero almost everywhere.\n- [recursive.lean](src/recursive.lean) defines and proves the correctness of a\n  circuit that implements the recursive version of a relational query. This uses\n  the stream introduction and elimination functions to create a new time domain,\n  which we can think of as successive iterations of the recursion (rather than\n  the usual notion of time).\n- [aggregation.lean](src/aggregation.lean) defines the count and sum\n  aggregations, which go from a Z-set to a number.\n\n## Contributing\n\nThe DBSP team welcomes contributions from the community. Before you start working on this project, please\nread our [Developer Certificate of Origin](https://cla.vmware.com/dco). All contributions to this repository must be\nsigned as described on that page. Your signature certifies that you wrote the patch or have the right to pass it on\nas an open-source patch. For more detailed information, refer to [CONTRIBUTING.md](CONTRIBUTING.md).\n\n## License\n\nCopyright 2022-2023 VMware, Inc.\n\nSPDX-License-Identifier: BSD-2-Clause\n\nSee [NOTICE](NOTICE) and [LICENSE](LICENSE).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftchajed%2Fdatabase-stream-processing-theory","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ftchajed%2Fdatabase-stream-processing-theory","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftchajed%2Fdatabase-stream-processing-theory/lists"}