{"id":24269790,"url":"https://github.com/darkleaf/hazel","last_synced_at":"2025-09-13T15:43:33.350Z","repository":{"id":271599923,"uuid":"902997012","full_name":"darkleaf/hazel","owner":"darkleaf","description":"POC exploring adaptation of Datomic principles for the frontend 🤯","archived":false,"fork":false,"pushed_at":"2025-01-28T15:19:24.000Z","size":97,"stargazers_count":31,"open_issues_count":1,"forks_count":1,"subscribers_count":3,"default_branch":"master","last_synced_at":"2025-04-06T11:24:15.049Z","etag":null,"topics":["datascript","datomic","local-first"],"latest_commit_sha":null,"homepage":"","language":"JavaScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/darkleaf.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-12-13T17:51:08.000Z","updated_at":"2025-03-22T21:46:40.000Z","dependencies_parsed_at":"2025-01-08T18:47:13.634Z","dependency_job_id":"8e448602-9b44-4db8-b174-35ba2ec10925","html_url":"https://github.com/darkleaf/hazel","commit_stats":null,"previous_names":["darkleaf/hazel"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/darkleaf/hazel","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/darkleaf%2Fhazel","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/darkleaf%2Fhazel/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/darkleaf%2Fhazel/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/darkleaf%2Fhazel/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHu
b/owners/darkleaf","download_url":"https://codeload.github.com/darkleaf/hazel/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/darkleaf%2Fhazel/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":264222002,"owners_count":23575151,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["datascript","datomic","local-first"],"created_at":"2025-01-15T15:08:12.033Z","updated_at":"2025-07-08T07:34:01.875Z","avatar_url":"https://github.com/darkleaf.png","language":"JavaScript","readme":"# Hazel 🌳\n\nThis proof of concept (POC) investigates how [Datomic](https://www.datomic.com/) principles can be adapted for the frontend environment.\nIt introduces a peer library, written in JavaScript, that is capable of navigating a [DataScript](https://github.com/tonsky/datascript/) database and storing its segments in the browser's cache.\n\nThis approach is useful for productivity tools like Asana, Jira, Slack, and Notion,\nespecially for applications that work with relatively large databases in the browser.\nIt is necessary to make fast queries on that data without access to the backend.\n\nThe project is built upon the React TodoMVC framework.\n\nThe project's capabilities include:\n\n- Range queries.\n- Lazy iteration over databases by fetching database segments on-demand from the backend.\n- Long-term storage of these segments within the browser cache.\n- Getting a consistent snapshot of a database.\n\nFor those unfamiliar with Datomic-like databases, here's an illustrative example of its core 
concept:\n\nConsider a traditional database such as PostgreSQL or MySQL,\nwhere data resides on the server's disk and your application's queries are processed on the server.\nIf you decide to cache a slow query's result within your application,\nyou essentially forgo the query engine, reducing the database to key-value (KV) storage.\nIn contrast, a Datomic-like system enables you to execute queries within your application using its cache.\n\nMore information can be found in the [Datomic Introduction](https://docs.datomic.com/datomic-overview.html).\n\n# How does it work?\n\n***Hazel*** is designed to read indexes built by [**DataScript**](https://github.com/tonsky/datascript/).\nHowever, unlike DataScript, it provides an asynchronous API for data querying and loads storage segments on demand.\n\nYou should first be familiar with the **DataScript** or **Datomic** data model. If not, please refer to the following resources:\n\n- [Information Model](https://docs.datomic.com/datomic-overview.html#information-model)\n- [Indexes](https://docs.datomic.com/datomic-overview.html#indexes)\n\n## Datoms and Indexes\n\nThe **DataScript database** operates on **datoms**, which are atomic units of data. Each datom is represented as a tuple `(E, A, V)`, where:\n\n- `E` stands for the **entity ID**,\n- `A` for the **attribute**, and\n- `V` for the **value**.\n\nThese elements form the core structure of a datom. To efficiently organize datoms, DataScript uses three indexes: **EAV**, **AEV**, and **AVE**. 
The name of each index reflects the order in which datoms are sorted:\n\n- **EAV**: Sorted by entity ID, then attribute, then value.\n- **AEV** and **AVE**: Follow analogous patterns.\n\nWhile datoms in **Datomic** and **DataScript** also include an additional element `T` (Transaction ID), Hazel simplifies the model by excluding this component.\n\n## Index Implementation\n\nThe **indexes in DataScript** are implemented as **Persistent Sorted Sets**, a type of immutable data structure based on **B+ trees**. These structures are optimized for storing elements in sorted order and enable efficient operations such as lookups, insertions, and deletions, with a time complexity of $$O(\\log n)$$. Functional immutability is achieved through **structural sharing**, ensuring that updates reuse existing data whenever possible. A detailed explanation of B-trees, including the B+ tree variant, can be found in the paper [\"The Ubiquitous B-Tree\"](https://carlosproal.com/ir/papers/p121-comer.pdf) by Douglas Comer.\n\nEach node of the tree corresponds to a **storage segment**, serialized and stored persistently. **Branch nodes** contain keys and addresses for navigation, while **leaf nodes** store ordered sequences of keys (datoms).\n\n## Database Implementation\n\nIn DataScript, changes are made using transactions, which are represented as [structured data](https://docs.datomic.com/transactions/transaction-data-reference.html#tx-data). While a comprehensive understanding of the entire transaction process is not required, it’s important to note that transactions are represented as collections of **datoms**. Each datom in a transaction includes a flag that indicates whether it will be **added** to or **removed** from the database.\n\nSince **persistent data structures** can lead to high overhead when updating the entire tree for every transaction, DataScript employs an optimization mechanism that relies on an append-only \"tail\" for managing updates:\n\n1. 
Changes are stored in the \"tail\".\n2. Once the size of the tail becomes comparable to the size of a tree node, the \"tail\" is \"flushed\" into the tree.\n   For implementation details, see the [source code](https://github.com/tonsky/datascript/blob/fa222f7b1b05d4382414022ede011c88f3bad462/src/datascript/conn.cljc#L98).\n\n## Hazel's Peer\n\nIn Datomic and DataScript, separate APIs are used for querying and mutating data. The Peer library is responsible for querying data. Moreover, it executes queries using a local cache.\n\nUltimately, Datomic and DataScript provide a low-level API for querying data:\n\n- [`seek-datoms`](https://docs.datomic.com/clojure/index.html#datomic.api/seek-datoms)\n- [`datoms`](https://docs.datomic.com/clojure/index.html#datomic.api/datoms)\n\n*Hazel* implements a similar low-level API.\n\nFirst, let's consider the Datomic and DataScript implementations for the JVM.\nWhen querying data, they access storage segments stored remotely or in a local cache.\nThis access uses blocking I/O, and the result of a query is a lazy sequence.\n\nHere are the advantages of this approach:\n\n  1. It allows processing data that exceeds the size of RAM.\n  2. It allows stopping lazy sequence consumption, which prevents further loading of the next segments.\n\nSecond, let's examine the DataScript implementation for ClojureScript (JS).\nIt shares the same codebase as the JVM implementation and, as a result, has the same API. However, in JavaScript, blocking I/O cannot be used for retrieving segments, unlike in the JVM. This limitation means that DataScript in JavaScript can operate only with data stored in RAM.\n\n*Hazel* is designed to overcome this limitation.\nIn JavaScript, the equivalent of lazy sequences is a [**Generator function**](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Statements/function*#description) (`function*/yield`). 
However, since segments are requested asynchronously over the network, Hazel uses [**AsyncGenerator**](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Statements/async_function*#description) to manage this process.\n\nHere are some examples:\n\n**A range query:**\n```javascript\n// `for await` is only valid inside an async function or at the top level of an ES module.\nfor await (const [e, _a, _v] of db.ave.datoms('task/completed', true)) {\n  // Retrieve datoms with the attribute `task/completed` and value `true`.\n  // ...\n}\n```\n\n**Retrieving all attributes of an entity:**\n```javascript\n// `e` is the entity ID, for example one obtained from the range query above.\nconst todo = {\n  id: e,\n};\nfor await (const [_e, a, v] of db.eav.datoms(e)) {\n  todo[a] = v;\n}\n```\n\n**NOTE:** The index name matters. In the first example, the **AVE** index is used, while in the second example, the **EAV** index is used.\n\nFinally, the Cache API is used to cache segments.\nFor more details, see the documentation [here](https://developer.mozilla.org/en-US/docs/Web/API/CacheStorage).\n\n## Learning by Example\n\nA motivated reader can easily grasp these concepts by reviewing the provided tests, which offer clear examples and practical insights. 
Additional details on persistent B+ trees and storage mechanisms in DataScript can be found in the following references:\n\n- [Persistent Sorted Set Documentation](https://github.com/tonsky/persistent-sorted-set)\n- [DataScript Storage](https://github.com/tonsky/datascript/blob/master/docs/storage.md)\n- [Datomic Data Model](https://docs.datomic.com/whatis/data-model.html#datalog)\n\n# Limitations\n\nI've only implemented the methods (r)seek and datoms over indexes.\n\nThe main drawback I've found is access control.\nFor instance, consider an Asana project; it has tasks and collaborators, and each collaborator can read all tasks of the project.\nI think data should be stored in one database per project.\nHowever, the problem is that we cannot just add an external collaborator to a single task of this project:\nbecause the database is cached in the browser, this collaborator would have access to all the data in the database.\n\nI have two solutions for this problem:\n\n+ Having an additional API for data access that runs queries on the backend, where we can easily handle access control, although we would lose caching.\n+ Replicating copies of a task across many databases (projects).\n\nIf you have any thoughts, feel free to open an issue in this repository.\n\n# Future work\n\nIt is possible to implement [Datalog](https://docs.datomic.com/query/query-executing.html)\nand other Datomic/DataScript APIs.\n\nIt may be possible to adopt this approach for the local-first paradigm.\nFor example, we can implement Conflict-free Replicated Data Types (CRDT)\nby writing database functions in JavaScript and optimistically transacting them on the frontend side.\n\n\n# How to run\n\n+ `docker build -t hazel . 
\u0026\u0026 docker run --rm -p 8080:8080 hazel`\n+ open http://localhost:8080\n\n\n# Dev\n\n```\nbun run test\n```\n\nBun builder is currently in beta and lacks some features:\n\n+ When using `build.js` for configuration, Bun reloads only the `build.js` file, not the other project files.\n+ Automatic generation of manifest.json is not yet implemented.\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdarkleaf%2Fhazel","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdarkleaf%2Fhazel","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdarkleaf%2Fhazel/lists"}