{"id":16349119,"url":"https://github.com/filipesilva/fdb","last_synced_at":"2025-09-17T13:30:51.980Z","repository":{"id":206742460,"uuid":"716822405","full_name":"filipesilva/fdb","owner":"filipesilva","description":"Reactive database environment for your files.","archived":false,"fork":false,"pushed_at":"2024-09-07T12:44:08.000Z","size":3691,"stargazers_count":91,"open_issues_count":1,"forks_count":2,"subscribers_count":6,"default_branch":"master","last_synced_at":"2025-01-02T21:12:26.534Z","etag":null,"topics":["automation","clojure","database","datalog","pkm","watcher"],"latest_commit_sha":null,"homepage":"","language":"Clojure","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/filipesilva.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-11-10T00:18:56.000Z","updated_at":"2025-01-02T09:20:16.000Z","dependencies_parsed_at":"2023-11-12T01:25:54.215Z","dependency_job_id":"6cd5abdc-f247-47ec-883f-f3eaed653d26","html_url":"https://github.com/filipesilva/fdb","commit_stats":null,"previous_names":["filipesilva/fdb"],"tags_count":2,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/filipesilva%2Ffdb","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/filipesilva%2Ffdb/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/filipesilva%2Ffdb/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/filipesilva%2Ffdb/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/o
wners/filipesilva","download_url":"https://codeload.github.com/filipesilva/fdb/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":233383418,"owners_count":18668157,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["automation","clojure","database","datalog","pkm","watcher"],"created_at":"2024-10-11T00:57:46.187Z","updated_at":"2025-09-17T13:30:45.893Z","avatar_url":"https://github.com/filipesilva.png","language":"Clojure","funding_links":[],"categories":[],"sub_categories":[],"readme":"# FileDB\n\n\u003e Do not go gentle into that good SaaS,  \nUsers should burn and rave at close of data;  \nRage, rage against the dying of the file.\n\nFileDB is a reactive database environment for your files.\nIts main purpose is to give you an easy way to take control of your data.\n\nIt watches files on disk and loads their data into a database.\nYou use [Clojure](https://clojure.org) and [XTDB](https://xtdb.com) to interact with this data, and add reactive triggers for automation.\n\nCheck the [Demo](#demo) and [Reference](#reference) to see what interacting with FileDB looks like.\n[More Demos](#more-demos) has examples of cool things I do with it.\n\nThe database is determined by the files on disk, so you can replicate it wholly or partially by syncing these files.\nAnything that syncs files to your disk can trigger your automations, so it's easy to save a file on your phone to trigger computation on your laptop.\n\nFileDB is for all those things you know you could do with code, but it's never worth the effort to set 
everything up.\nI use it to hack together code and automations for my own use cases.\nI like using markdown files on [Obsidian](https://obsidian.md) as my main readable files, but I don't think that matters.\nFileDB should let you hack your own setup.\n\n\n## Videos\n\n- [London Clojurians on 2024-05-28](https://www.youtube.com/watch?v=EvAFEC6n7NI)\n\n\n## Demo\n\nThis is Clojure heavy so it's probably best if you're a [Clojure](https://clojure.org) dev with [Datalog](https://en.wikipedia.org/wiki/Datalog) chops.\n\nBut if you're not, it's a great time to start.\nClojure and Datalog are awesome!\nCourage Wolf offers great advice here: bite off more than you can chew, then chew it.\n\nFirst step is to clone and start watching.\nMake sure to have [Clojure](https://clojure.org/guides/install_clojure) and [Babashka](https://github.com/babashka/babashka#installation) installed first.\n\n```sh\n# go into a folder where you can clone fdb into\ngit clone https://github.com/filipesilva/fdb\ncd fdb\n./symlink-fdb.sh\nfdb init\nfdb watch\n```\n\n[CLI](#cli) explains what these commands do.\n\nThen in another terminal go to `~/fdb/user`, and run the code in the Data, Code, and Network sections below.\nRun the `cat` commands separately, since it will take some ms for FileDB to act on file changes.\nSome `echo` commands are using `\u003e\u003e` instead of `\u003e` to append so existing content isn't lost.\n\nThe `~/fdb/user` folder is created by FileDB and automatically watched.\nIt's where you can play with code and data without thinking too much about it.\n\n```sh\ncd ~/fdb/user\n```\n\nYou can use an editor instead of `echo`/`cat`.\nI'm using shell commands here for demo brevity.\n\n\n### Data\n\nFileDB loads data on watched files that it can [read](#readers). 
[EDN](https://en.wikipedia.org/wiki/Clojure#Extensible_Data_Notation) map data in watched folders is automatically added to the database.\n\n```sh\necho '{:tags #{\"demo\"}}' \u003e data.edn\n```\n\nYou can query for the data in it with a query file.\nQuery files end with `query.fdb.edn`.\nThe result of the query will be in `query-out.fdb.edn`.\nQueries are in [XTDB datalog](https://v1-docs.xtdb.com/language-reference/datalog-queries/), and this one means \"get me all data for files with demo in the tags property\".\n\n```sh\necho '\n{:find [(pull ?e [*])]\n :where [[?e :tags \"demo\"]]}\n' \u003e query.fdb.edn\n\ncat query-out.fdb.edn\n```\n\n```edn\n#{[{:tags #{\"demo\"}\n    :fdb/modified #inst \"2024-04-23T22:49:27.271946426Z\"\n    :fdb/parent \"/user\"\n    :xt/id \"/user/data.edn\"}]}\n```\n\nData in JSON maps and Markdown [yml properties](https://help.obsidian.md/Editing+and+formatting/Properties#Property+format) is also automatically loaded.\n\n```sh\necho '{\"tags\": [\"demo\"]}' \u003e data.json\necho '---\ntags:\n  - demo\n---\nMarkdown body is not loaded\n' \u003e data.md\ntouch query.fdb.edn\n\ncat query-out.fdb.edn\n```\n\n```edn\n#{[{:tags [\"demo\"]\n    :fdb/modified #inst \"2024-04-23T22:50:12.791335391Z\"\n    :fdb/parent \"/user\"\n    :xt/id \"/user/data.md\"}]\n  [{:tags [\"demo\"]\n    :fdb/modified #inst \"2024-04-23T22:50:12.791511140Z\"\n    :fdb/parent \"/user\"\n    :xt/id \"/user/data.json\"}]\n  [{:tags #{\"demo\"}\n    :fdb/modified #inst \"2024-04-23T22:49:27.271946426Z\"\n    :fdb/parent \"/user\"\n    :xt/id \"/user/data.edn\"}]}\n```\n\nFiles without a reader only get `:xt/id`, `:fdb/modified`, and `:fdb/parent` in the db.\n\n```sh\necho 'just a txt' \u003e file.txt\necho '\n{:find [(pull ?e [*])] \n :where [[?e :xt/id \"/user/file.txt\"]]}\n' \u003e file-query.fdb.edn\n\ncat file-query-out.fdb.edn\n```\n\n```edn\n#{[{:fdb/modified #inst \"2024-04-23T22:50:32.953604479Z\"\n    :fdb/parent \"/user\"\n    :xt/id 
\"/user/file.txt\"}]}\n```\n\nBut you can add metadata to any file via a sibling file that ends in `.meta.edn`.\n\n```sh\necho '{:tags #{\"demo\"}}' \u003e file.txt.meta.edn\ntouch query.fdb.edn\n\ncat query-out.fdb.edn\n```\n\n```edn\n#{[{:tags [\"demo\"]\n    :fdb/modified #inst \"2024-04-23T22:50:12.791335391Z\"\n    :fdb/parent \"/user\"\n    :xt/id \"/user/data.md\"}]\n  [{:tags [\"demo\"]\n    :fdb/modified #inst \"2024-04-23T22:50:12.791511140Z\"\n    :fdb/parent \"/user\"\n    :xt/id \"/user/data.json\"}]\n  [{:tags #{\"demo\"}\n    :fdb/modified #inst \"2024-04-23T22:49:27.271946426Z\"\n    :fdb/parent \"/user\"\n    :xt/id \"/user/data.edn\"}]\n  [{:tags #{\"demo\"}\n    :fdb/modified #inst \"2024-04-23T22:51:32.639683791Z\"\n    :fdb/parent \"/user\"\n    :xt/id \"/user/file.txt\"}]}\n```\n\n\n### Code\n\nClojure code in repl files is evaluated in the Clojure process under the `user` namespace.\nREPL files end with `repl.fdb.clj`.\nThe result of the execution will be in a comment in `repl-out.fdb.clj`, together with the executed code.\n\nFileDB starts a [nREPL server](https://nrepl.org/nrepl/index.html) on port 2525 that you can connect to.\nIn this demo we're going to focus on the file watcher use, but feel free to use both ways.\n\n```sh\necho '(inc 1)' \u003e repl.fdb.clj\n\ncat repl-out.fdb.clj\n```\n\n```clojure\n(inc 1)\n\n;; 2\n```\n\nSince the code is evaluated into a persistent process, you can define a function in one file and use it in another.\n\n```sh\necho '(defn my-inc [x] (inc x))' \u003e repl.fdb.clj\n\ncat repl-out.fdb.clj \n```\n\n```clojure\n(defn my-inc [x] (inc x))\n\n;; =\u003e #'user/my-inc\n```\n\nYou can call this function by it's namespaced name (`user/my-inc`) or by `my-inc`, since all repl files start in the `user` namespace.\n\n``` sh\necho '(+ (user/my-inc 1) (my-inc 1))' \u003e another-repl.fdb.clj\n\ncat another-repl-out.fdb.clj\n```\n\n``` clojure\n(+ (user/my-inc 1) (my-inc 1))\n\n;; =\u003e 4\n```\n\nCode in a repl 
file was loaded into the process but if you kill the process it's gone.\nRepl files aren't automatically loaded on startup because then you'd have to watch out what you write in them to avoid slowing down startup.\n\nIn `fdbconfig.edn` there's a load vector where you can put `.clj` files that will be loaded on startup.\nThere's `[\"load-repl.fdb.clj\" \"server-repl.fdb.clj\"]` in there.\nThey are repl files so any changes are immediately loaded.\n\nYou have full access to the database from code files, so you can use the [XTDB API](https://v1-docs.xtdb.com/language-reference/datalog-queries/) directly.\nYou can get the current node via `(fdb.db/node)`.\n\n``` sh\necho '\n(xtdb.api/q\n (xtdb.api/db (fdb.db/node))\n \\'{:find [(pull ?e [*])]\n    :where [[?e :tags \"demo\"]]})\n' \u003e repl.fdb.clj\n\ncat repl-out.fdb.clj\n```\n\n```clojure\n(xtdb.api/q\n (xtdb.api/db (fdb.db/node))\n '{:find [(pull ?e [*])]\n   :where [[?e :tags \"demo\"]]})\n\n;; =\u003e #{[{:tags #{\"demo\"},\n;;        :fdb/modified #time/instant \"2024-05-16T14:52:32.352959185Z\",\n;;        :fdb/parent \"/user\",\n;;        :xt/id \"/user/data.edn\"}]\n;;      [{:tags [\"demo\"],\n;;        :fdb/modified #time/instant \"2024-05-16T14:52:41.886827051Z\",\n;;        :fdb/parent \"/user\",\n;;        :xt/id \"/user/data.md\"}]\n;;      [{:tags [\"demo\"],\n;;        :fdb/modified #time/instant \"2024-05-16T14:52:41.886982426Z\",\n;;        :fdb/parent \"/user\",\n;;        :xt/id \"/user/data.json\"}]}\n```\n\nYou were able to call `xtdb.api/q` directly via its fully qualified name because the library was already loaded into the FileDB process.\nBut you can do `(require '[xtdb.api :as xt])` instead if you want.\n\nThe `fdb.db` namespace contains a convenience function `q` for `xtdb.api/q` that uses the current db so you don't have to call `(xtdb.api/db (fdb.db/node))` all the time.\n\n``` sh\necho '\n(fdb.db/q\n \\'{:find [(pull ?e [*])]\n    :where [[?e :tags \"demo\"]]})\n' \u003e repl.fdb.clj\n\ncat 
repl-out.fdb.clj\n```\n\n```clojure\n(fdb.db/q\n '{:find [(pull ?e [*])]\n    :where [[?e :tags \"demo\"]]})\n\n;; =\u003e #{[{:tags #{\"demo\"},\n;;        :fdb/modified #time/instant \"2024-05-16T14:52:32.352959185Z\",\n;;        :fdb/parent \"/user\",\n;;        :xt/id \"/user/data.edn\"}]\n;;      [{:tags [\"demo\"],\n;;        :fdb/modified #time/instant \"2024-05-16T14:52:41.886827051Z\",\n;;        :fdb/parent \"/user\",\n;;        :xt/id \"/user/data.md\"}]\n;;      [{:tags [\"demo\"],\n;;        :fdb/modified #time/instant \"2024-05-16T14:52:41.886982426Z\",\n;;        :fdb/parent \"/user\",\n;;        :xt/id \"/user/data.json\"}]}\n```\n\nYou have convenience functions for `pull`, `pull-many`, `entity` and `entity-history`.\n`entity-history` is a cool one because it gives you all past versions of that id, as long as the database wasn't deleted.\nPast content is in `xtdb.api/doc`.\n\n```sh\necho '{:tags #{\"demo\"} :new-k 1}' \u003e data.edn\necho '(fdb.db/entity-history \"/user/data.edn\" :asc :with-docs? true)' \u003e repl.fdb.clj\n\ncat repl-out.fdb.clj\n```\n\n```clojure\n(fdb.db/entity-history \"/user/data.edn\" :asc :with-docs? 
true)\n\n;; =\u003e [{:xtdb.api/tx-time #inst \"2024-05-16T14:52:32.889-00:00\",\n;;      :xtdb.api/tx-id 32,\n;;      :xtdb.api/valid-time #inst \"2024-05-16T14:52:32.889-00:00\",\n;;      :xtdb.api/content-hash\n;;      #xtdb/id \"1d10305d968b37f961fb664490fd69165c775440\",\n;;      :xtdb.api/doc\n;;      {:tags #{\"demo\"},\n;;       :fdb/modified #time/instant \"2024-05-16T14:52:32.352959185Z\",\n;;       :fdb/parent \"/user\",\n;;       :xt/id \"/user/data.edn\"}}\n;;     {:xtdb.api/tx-time #inst \"2024-05-16T15:31:59.275-00:00\",\n;;      :xtdb.api/tx-id 47,\n;;      :xtdb.api/valid-time #inst \"2024-05-16T15:31:59.275-00:00\",\n;;      :xtdb.api/content-hash\n;;      #xtdb/id \"d7bde02d1000fb3444558d26994b8ac30b93b295\",\n;;      :xtdb.api/doc\n;;      {:tags #{\"demo\"},\n;;       :new-k 1,\n;;       :fdb/modified #time/instant \"2024-05-16T15:31:58.744451751Z\",\n;;       :fdb/parent \"/user\",\n;;       :xt/id \"/user/data.edn\"}}]\n```\n\nYou can add code that will be run reactively on data and metadata.\nThese are called triggers, and you can read more about different triggers in [Metadata](#metadata).\nYou can read more about the function format in triggers and the arguments it takes in [call-spec and call-arg](#call-spec-and-call-arg).\n\nThis trigger will be called every time the file is modified and keep an audit log file of all modification dates.\n\n```sh\necho '\n{:tags #{\"demo\"}\n :fdb.on/modify (fn [{:keys [self-path tx]}]\n                  (spit (str self-path \".audit\")\n                        (-\u003e tx :xtdb.api/tx-time .toInstant (str \"\\n\"))\n                        :append true))}\n' \u003e file.txt.meta.edn\ntouch file.txt\ntouch file.txt\n\ncat file.txt.audit\n```\n\n```\n2024-05-16T20:52:50.845Z\n2024-05-16T20:52:59.819Z\n2024-05-16T20:53:00.820Z\n```\n\nYou don't have to code in metadata though.\nYou can make a function in a repl file and then use it in a trigger.\nAdd functions you want to use in triggers to a loaded file 
like `load-repl.fdb.clj` so that they are always available.\n\n```sh\necho '\n(defn audit [{:keys [self-path tx]}]\n  (spit (str self-path \".audit\")\n        (-\u003e tx :xtdb.api/tx-time .toInstant (str \"\\n\"))\n        :append true))\n' \u003e\u003e load-repl.fdb.clj\necho '\n{:tags #{\"demo\"}\n :fdb.on/modify user/audit}\n' \u003e file.txt.meta.edn\n```\n\n\n### Network\n\nAnything that can sync files over network can interact with FileDB.\nYou can sync data files from one machine to another that is running FileDB, and if that file is watched, it will be loaded.\nYou can sync repl files, which will cause them to be evaluated, and then sync back the `repl-out.fdb.clj` to see the result.\n\nFileDB has a built-in [http-kit](https://github.com/http-kit/http-kit) server that maps routes to functions.\nHandlers receive a [call-arg](#call-spec-and-call-arg) with `:req`.\n\n```sh\necho '\n(defn foo [{:keys [req]}]\n  {:body {:bar \"baz\"}})\n' \u003e\u003e server-repl.fdb.clj\n# set fdbconfig.edn :server :routes to {\"GET /foo\" user/foo}\n\ncurl localhost:80/foo\n```\n\n```json\n{\"bar\":\"baz\"}\n```\n\nContent is negotiated automatically via [Muuntaja](https://github.com/metosin/muuntaja).\nRoutes are order independent thanks to [clj-simple-router](https://github.com/tonsky/clj-simple-router).\n\nThere's a convenience function to render [Hiccup](https://github.com/escherize/huff) in `fdb.http/render` that you can use without having to import Hiccup.\nYou can use Hiccup together with [HTMX](https://htmx.org) to quickly whip up UI for FileDB.\n\n```sh\necho '\n(defn clicker [_]\n  {:body\n   (fdb.http/render\n    [:\u003c\u003e\n     [:script {:src \"https://unpkg.com/htmx.org@1.9.12\"}]\n     [:button {:hx-post \"/clicked\" :hx-swap \"outerHTML\"}\n      \"You know what they call a Quarter Pounder with Cheese in Paris?\"]])})\n\n(defn clicked [_]\n  {:body\n   (fdb.http/render\n    [:div \"They call it Royale with Cheese.\"])})\n' \u003e\u003e 
server-repl.fdb.clj\n# set fdbconfig.edn :server :routes to \n# {\"GET /\" user/clicker \"POST /clicked\" user/clicked}\n```\n\nGo to http://localhost:80 to learn about the little differences between the US and Europe.\nYou can use [ngrok](https://ngrok.com) for free to share this server with others. \nRun `ngrok http 80` after setting ngrok up, and share the link it gives you under `Forwarding`.\nIn the [ngrok dashboard](https://dashboard.ngrok.com/get-started/setup) you have the CLI args to use a static domain so your server is always up at the same address.\n\nThe `fdb.http` namespace has helpers to interact with existing APIs.\nThis code will get you geo data for the city of Lisbon, Portugal.\n\n```sh\necho '\n(-\u003e \"https://nominatim.openstreetmap.org/search\"\n    (fdb.http/add-params {:q \"Lisbon\" :limit 1 :format \"json\"})\n    fdb.http/json\n    first)\n' \u003e repl.fdb.clj\n\ncat repl-out.fdb.clj\n```\n\n```clojure\n(-\u003e \"https://nominatim.openstreetmap.org/search\"\n    (fdb.http/add-params {:q \"Lisbon\" :limit 1 :format \"json\"})\n    fdb.http/json\n    first)\n\n;; =\u003e {:osm_type \"relation\",\n;;     :boundingbox [\"38.6913994\" \"38.7967584\" \"-9.2298356\" \"-9.0863328\"],\n;;     :name \"Lisboa\",\n;;     :type \"administrative\",\n;;     :licence\n;;     \"Data © OpenStreetMap contributors, ODbL 1.0. 
http://osm.org/copyright\",\n;;     :place_id 256327888,\n;;     :class \"boundary\",\n;;     :lon \"-9.1365919\",\n;;     :lat \"38.7077507\",\n;;     :addresstype \"city\",\n;;     :display_name \"Lisboa, Portugal\",\n;;     :osm_id 5400890,\n;;     :place_rank 14,\n;;     :importance 0.7149698324141975}\n```\n\nYou can get and send email using FileDB.\nFirst you need to add an `:email` key in `fdbconfig.edn` with email settings.\nFor gmail you'll need to add an [app password](https://support.google.com/mail/answer/185833?hl=en).\n\n```edn\n{:email {:email    \"your@email.com\"\n         :password \"password123\"\n         ;; has defaults for gmail, for other email providers you'll have to fill it in\n         ;; :imap {:host \"imap.gmail.com\"}\n         ;; :smtp {:host \"smtp.gmail.com\" :port 587}\n         }}\n```\n\nSync your mail to disk as `.eml` files with `fdb.email/sync`.\nIt will take up to 50 mails each time it runs, but you can configure it with `:take-n`.\nSet `:since 0` to sync since the first email.\n\n```sh\necho '\n{:fdb.on/schedule {:every  [5 :minutes]\n                   :call   fdb.email/sync\n                   :folder :all ;; gmail only, for others use string folder name\n                   :since  #inst \"2024-05-15\"}}\n' \u003e email.meta.edn\n```\n\nFileDB has default readers for `.eml` that extract common fields to EDN.\n\n```sh\necho '\n{:find [(pull ?e [*])]\n :where [[?e :fdb/parent \"/user/email\"]]}\n' \u003e query.fdb.edn\n\ncat query-out.fdb.edn\n```\n\n```edn\n#{[{:date #inst \"2024-05-27T22:26:32.000-00:00\"\n    :from [\"filedb.demo@gmail.com\"]\n    :labels [\"Unread\" \"Inbox\" \"Sent\"]\n    :message-id \"\u003c375215791.1.1716848792154@Filipes-MacBook-Pro.local\u003e\"\n    :subject \"test subject\"\n    :text \"test body\\n\"\n    :thread-id \"1800246439955148017\"\n    :to [\"filedb.demo@gmail.com\"]\n    :fdb/modified #inst \"2024-05-27T22:37:36.822301525Z\"\n    :fdb/parent \"/user/email\"\n    :xt/id 
\"/user/email/2024-05-27T22.26.32Z 78706fb8 test subject.eml\"}]\n  [{:date #inst \"2024-05-27T22:28:45.000-00:00\"\n    :from [\"filedb.demo@gmail.com\"]\n    :labels [\"Unread\" \"Inbox\" \"Sent\"]\n    :message-id \"\u003c63529253.0.1716848925340@[192.168.64.1]\u003e\"\n    :subject \"you got\"\n    :text \"mail\\n\"\n    :thread-id \"1800246579702404960\"\n    :to [\"filedb.demo@gmail.com\"]\n    :fdb/modified #inst \"2024-05-27T22:41:12.895430986Z\"\n    :fdb/parent \"/user/email\"\n    :xt/id \"/user/email/2024-05-27T22.28.45Z 185a9ab9 you got.eml\"}]}\n```\n\nSend mails with `fdb.email/send`.\nYou can send an email to yourself by setting `:to :self`, which is handy for notifications.\n\n```sh\necho '\n(fdb.email/send\n  (fdb.call/arg)\n  {:to \"them@email.com\"\n   :subject \"you got\"\n   :text \"mail\"})\n' \u003e repl.fdb.clj\n```\n\nIf you want to sync a lot of mail, it's much faster to use a `.mbox` export.\nFileDB has a helper to split a `.mbox` into `.eml` files.\nGMail has export instructions [here](https://support.google.com/mail/answer/10016932?hl=en).\n\n```sh\necho '\n(require \\'[fdb.email :as email]\n         \\'[babashka.fs :as fs])\n(email/split-mbox \"/Users/filipesilva/work/fdb/resources/email/sample-crlf.mbox\"\n                  (fs/file (fs/home) \"fdb/user/email\"))\n' \u003e repl.fdb.clj\n```\n\n\n```clojure\n(require '[fdb.email :as email]\n         '[babashka.fs :as fs])\n(email/split-mbox \"/Users/filipesilva/work/fdb/resources/email/sample-crlf.mbox\"\n                  (fs/file (fs/home) \"fdb/user/email\"))\n\n;; 2024-05-27T21:33:40.166Z Filipes-MacBook-Pro.local INFO [fdb.email:182] - writing #1 /Users/filipesilva/work/fdb/resources/email/sample-crlf/1970-01-01T00.00.00Z 8d247ee6 Sample message 1.eml\n;; 2024-05-27T21:33:40.167Z Filipes-MacBook-Pro.local INFO [fdb.email:182] - writing #2 /Users/filipesilva/work/fdb/resources/email/sample-crlf/1970-01-01T00.00.00Z c2dfc80c Sample message 2.eml\n;; =\u003e nil\n```\n\n## What are 
the main ideas in it?\n\nThe main idea in FileDB is that you can use files as both data and code for a long lived process that you build over time.\n\nThis process is yours, and you can do cool stuff with it.\n\nHere's what the terms you see in this README mean:\n- mount: the name a watched folder on disk has in the db\n- repl/query file: file that evaluates code or db queries on save, outputs result to a sibling file\n- reader: a fn that takes a file and returns data from it as edn, which is loaded into the db\n- metadata: extra data about a file you add in a sibling .meta.edn file\n- trigger: fn in metadata called reactively as the db changes\n- call spec/arg: how fns are specified for readers and triggers, and the argument they take\n\n```\n            ------------\u003e repl/query -------\u003e clojure process\n            |\nmount --\u003e file change --\u003e readers+metadata -\u003e db --\u003e triggers\n            ^                                           |\n            |                                           |\n            ---------------------------------------------\n```\n\n\n## More Demos\n\nBelow is cool stuff that you can do with FileDB.\nIf you want to follow these demos, add their dir as a mount.\n\n- [~/demos/clojuredocs](./demos/clojuredocs/README.md): query and scrape [clojuredocs](https://clojuredocs.org) results whenever you write to a file\n- [~/demos/temp](./demos/temp/README.md): keep track of max/min temperature and query for the hottest day in the week\n- WIP [`~/demos/nutrition`](./demos/nutrition/README.md): make your own nutrition tracking system\n- TODO `~/demos/email`: sync all of your emails locally, connect them with your notes\n- TODO `~/demos/code-analysis`: read AST for clj files, query it to find what fns are affected when a given fn changes\n\nI'm working on more demos around my own usecases.\nI'll add them here when they are done.\nIf you have cool demos you'd like to list here, make a PR!\n\n\n## But why?\n\nBecause I 
think it's silly that I own a powerful laptop and a powerful phone, and yet my data is sequestered away on cloud servers, where I pay for the privilege of accessing it and using it in a silo.\n\nI want my data, and I want to fuck around with it on my terms.\nI want to connect it together and try to do cool stuff with it!\nAnd I want to sync it between my laptop and my phone, and wherever else I want to have it.\n\nLast year I was travelling somewhere with bad connectivity and wanted to look up food nutrition data on my phone.\nThis is known data.\nSurely there's an app for that.\nI tried 10 apps free and paid, and found they were mostly garbage.\nI don't think I found even one that worked offline and had the 5 foods I tested.\n\nWhy is this hard to get?!\nThe USDA gives you a [24MB CSV](https://fdc.nal.usda.gov/download-datasets.html) of all foundation foods.\n2.8GB if you want all foods, the bulk of it branded.\nI know how to program.\nMy laptop has a 1TB disk, and my phone has 512GB.\nWhy do I go mucking around with garbage apps instead of using this available data?\n\nIt's not terribly hard to load this into a database.\nBut it is hard to sync databases, and to open them in different devices.\nYou know what's really easy to sync and open though?\n\nFiles.\n\nYou have iCloud, Google Drive, Dropbox, Syncthing, Git, and a ton of other apps to sync.\nYou have apps that open your on-disk files.\nLots of these cloud services give you a way to download all of your data.\n\nSo that got me thinking about doing a database that was mostly a queryable layer over disk files.\nThen I added more stuff to it that I thought was cool, like reactive triggers, a live system, and an http server.\n\n\n## CLI\n\nYou'll need to have [Clojure](https://clojure.org/guides/install_clojure) and [Babashka](https://github.com/babashka/babashka#installation) installed first.\nInstalling the FileDB CLI is just cloning this repo and symlinking the CLI script.\n\n``` sh\n# go into a folder 
where you can clone fdb into\ngit clone https://github.com/filipesilva/fdb\ncd fdb\n./symlink-fdb.sh\n```\n\nNow you should be able to run `fdb help` from anywhere.\nIf you don't want to symlink the CLI script, you can call `./src/fdb/bb/cli.clj help` from this dir. That's what `./symlink-fdb.sh` is linking.\n\nStart using FileDB by running `fdb init`.\nThis will create `~/fdb/` with `fdbconfig.edn`, `user/`, and `demos/` inside.\nIf you want to create the `fdb` folder in the current (or other) dir, add it at the end of init like `fdb init .`.\n\nThen run `fdb watch`.\nYou can edit the config anytime, for instance to add new mounts, and the watcher will restart automatically.\nIt starts a [nREPL server](https://nrepl.org/nrepl/index.html) on port 2525 that you can connect to from a Clojure editor like [Emacs with Cider](https://github.com/clojure-emacs/cider) or [VSCode with Calva](https://calva.io).\n\nIt will watch folders under the `:mount` key in `fdbconfig.edn`.\nModified date, parent, and path for each file on mounts will be added to the db.\nIf there's a reader for that file type, the extracted data will be added.\nIf you have `doc.md`, and add a `doc.md.meta.edn` next to it, that edn data will be added to the db's `doc.md` id.\nYou can put triggers and whatever else you want in this edn file.\n\nDeleted files are removed from the database.\nBut since XTDB is a [bitemporal](https://v1-docs.xtdb.com/concepts/bitemporality/) database, you can query for past versions of the database.\n`fdb watch` will pick up files that changed since it was last running, including deletions.\n\n`~/fdb/user/` has a [repl and query file](#repl-and-query-files) to play with.\nThe [Reference](#reference) is in `~/fdb/demos/reference` and contains examples of how things work.\n\nYou can also run `fdb sync` to do a one-shot sync.\nThis is useful when you have automation you want to run manually.\nIt doesn't run `fdb.on/schedule` though.\n\n`fdb read glob-pattern` forces a read of the 
(real, not mount) paths.\nThis is useful when you add or update readers and want to re-read those files.\n\n\n## Reference\n\nReference files are in `~/fdb/demos/reference` folder but are not mounted.\nYou can mount them if you want.\nI've gathered them here to give a nice overview of what you can do, and so it's easy to search over them.\n\n### Readers\n\nFileDB comes with these default readers to make it easy to interact with common data:\n- `edn`: reads all data in edn files, loaded directly into the db\n- `json`: reads all data, keywordizing keys\n- `md`: reads links into `:fdb/refs`, and all [yml properties](https://help.obsidian.md/Editing+and+formatting/Properties#Property+format). Property keys that start with `fdb` are read as edn.\n- `eml`: reads common email keys from the email message headers, and tries to read body as text\n\nTriggers in the returned map work the same as those in metadata files.\n\nYou can make your own readers too.\nCheck out how the default ones are implemented in `src/fdb/readers/`.\n\n\n### Repl and Query files\n\nYou can run code over the db process with a file called `repl.fdb.clj`, with any prefix e.g. 
`foo.repl.fdb.clj`.\n`repl.fdb.md` also works if the clojure code is in a solo `clojure` codeblock.\nYou can connect your editor to the nREPL server that starts with `fdb watch`, it's on port 2525 by default.\n\nIt starts in the `user` namespace but you can add whatever namespace form you want, and that's the ns it'll be eval'd in.\nYou can find a call-arg like the one triggers receive in `(fdb.call/arg)` (more on `call-arg` later in the reference).\n\n``` clojure\n;; We'll use this fn later in triggers.\n;; Add to the load vector if adding reference as a mount \n;; so it's accessible on first load.\n(defn print-call-arg\n  \"Simple fn to see which triggers are called.\"\n  [{:keys [on-path]}]\n  (println \"=== called\" (first on-path) \"===\"))\n```\n\nWill write the evaluated code with output in a comment to `repl-out.fdb.clj`:\n\n``` clojure\n;; We'll use this fn later in triggers.\n;; Add to the load vector if adding reference as a mount \n;; so it's accessible on first load.\n(defn print-call-arg\n  \"Simple fn to see which triggers are called.\"\n  [{:keys [on-path]}]\n  (println \"=== called\" (first on-path) \"===\"))\n\n;; =\u003e #'user/print-call-arg\n```\n\nYou can add a repl file (or any clj file) to `fdbconfig.edn` under `:load` to be loaded at startup, and the functions you define in it will be available for triggers and readers before sync.\n\nYou can query the db with a file called `query.fdb.edn`, with any prefix.\nAlso works for `query.fdb.md` if query is in a solo `edn` codeblock.\nSee [XTDB docs](https://v1-docs.xtdb.com/language-reference/datalog-queries/) for query syntax, and [Learn Datalog Today!](https://www.learndatalogtoday.org) if you want to learn about Datalog from scratch.\n\n``` edn\n{:find [?e] \n :where [[?e :tags \"important\"]]}\n```\n\nWill output to `query-out.fdb.edn`:\n\n``` edn\n#{[\"/demos/reference/todo.md\"] [\"/demos/reference/doc.md\"]}\n```\n\n\n### `fdbconfig.edn`\n\n``` edn\n;; This is the format of the fdb 
config file. Your real one is probably on your ~/fdb/fdbconfig.edn.\n{;; Where the xtdb db files will be saved.\n ;; You can delete this at any time, and the latest state will be recreated from the mount files.\n ;; You'll lose time-travel data if you delete it though.\n ;; See more about xtdb time travel in https://v1-docs.xtdb.com/concepts/bitemporality/.\n :db-path \"./xtdb\"\n\n ;; These paths will be mounted on the db.\n ;; If you have ~/fdb/user mounted as :user, and you have ~/fdb/user/repl.fdb.clj,\n ;; its id in the db will be /user/repl.fdb.clj.\n :mounts {;; \"~/fdb/user\" is the same as {:path \"~/fdb/user\"}\n          :user \"~/fdb/user\"}\n\n ;; Readers are fns that will read some data from a file as edn when it changes\n ;; and add it to the db together with the metadata.\n ;; The key is the file extension.\n ;; They are called with the call-arg (see below) just like triggers.\n ;; Call `fdb read glob-pattern` if you change readers and want to force a re-read.\n ;; Defaults to :edn, :md, and :eml readers in fdb/src/readers but can be overwritten\n ;; with :readers instead of :extra-readers.\n ;; You can also add :extra-readers to a single mount in the map notation.\n :extra-readers {:txt user/read-txt}\n\n ;; Mount or real paths of clj files to be loaded at the start.\n ;; Usually repl files where you added fns to use in triggers, or that load namespaces\n ;; you want to use without requiring them, or server handlers.\n :load [\"/user/load-repl.fdb.clj\"\n        \"/user/server-repl.fdb.clj\"]\n\n ;; These are Clojure deps loaded dynamically at the start, and reloaded when config changes.\n ;; You can add your local deps here too, and use them in triggers.\n ;; See https://clojure.org/guides/deps_and_cli for more about deps.\n :extra-deps {org.clojure/data.csv  {:mvn/version \"1.1.0\"}\n              org.clojure/data.json {:git/url \"https://github.com/clojure/data.json\"\n                                     :git/sha 
\"e9e57296e12750512788b723e49ba7f9abb323f9\"}\n              my-local-lib          {:local/root \"/path/to/lib\"}}\n\n ;; Serve call-specs from fdb.\n ;; Use with https://ngrok.com or https://github.com/localtunnel/localtunnel to make a public server.\n :serve {;; Map from route to call-spec, req will be within call-arg as :req.\n         ;; Route format is from https://github.com/tonsky/clj-simple-router.\n         :routes {\"GET /\"        user/get-root\n                  \"GET /stuff/*\" user/get-stuff}\n         ;; Server options for https://github.com/http-kit/http-kit.\n         ;; Defaults to {:port 80}.\n         :opts   {:port 8081}}\n\n ;; Files and folders to ignore when watching for changes.\n ;; default is [\".DS_Store\" \".git\" \".gitignore\" \".obsidian\" \".vscode\" \"node_modules\" \"target\" \".cpcache\"]\n ;; You can add to the defaults with :extra-ignore, or overwrite it with :ignore.\n ;; You can also use :ignore and :extra-ignore on the mount map definition.\n :extra-ignore [\".noisy-folder\"]\n\n ;; nRepl options, port defaults to 2525.\n ;; Started automatically on watch, lets you connect directly from your editor to the fdb process.\n ;; Also used by the fdb cli to connnect to the background clojure process.\n ;; See https://nrepl.org/nrepl/usage/server.html#server-options for more.\n :repl {}\n\n ;; You can add your own stuff here, and since the call-arg gets the config you will\n ;; be able to look up your config items on triggers and readers.\n :my-stuff \"personal config data I want to use in fns\"}\n```\n\n\n### `call-spec` and `call-arg`\n\nReaders and triggers take a `call-spec` that can be a few different things, and will receive `call-arg`:\n\n``` edn\n;; These are the different formats supported for call-spec, used in readers and triggers.\n\n;; A function name will be required and resolved under the user ns, then called with call-arg.\nprintln\nclojure.core/println\n\n;; A sexp containing a function, evaluated then called with 
call-arg.\n(fn [{:keys [self-path]}]\n  (println self-path))\n\n;; A vector uses the first kw element to decide what to do.\n;; The only built-in resolution is :sh, that calls a shell command and can use a few bindings from call-arg.\n;; You can add your own with (defmethod fdb.call/to-fn :my-thing ...)\n;; See ./src/fdb/call for existing ones.\n[:sh \"echo\" config-path target-path self-path]\n\n;; A map containing :call, which is any of the above.\n;; You can put more data in this map, and since call-arg has the trigger itself in :on, you can use\n;; this data to parametrize the call.\n{:call    (fn [{:keys [self-path on]}]\n            (println self-path (:my-data on)))\n :my-data 42}\n\n;; Call-specs can always be one or many, and are called in sequence.\n[println \n {:call println}\n [:sh \"echo\" self-path]]\n```\n\n\nThe `call-arg` is the single map arg that `call-spec` is called with.\nIt looks like this:\n\n``` edn\n;; This is the format for call arg, which the function resolved for call-spec is called with.\n;; It's also accessible in (fdb.call/arg).\n{:config      {,,,}                         ;; fdb config value\n :config-path \"~/fdb/fdbconfig.edn\"         ;; on-disk path to config\n :node        {,,,}                         ;; xtdb database node\n :db          {,,,}                         ;; xtdb db value at the time of the tx\n :tx          {:xtdb.api/id 1 ,,,}          ;; the tx\n :on          println                       ;; the trigger being called\n :on-path     [:fdb.on/modify]              ;; get-in path inside self for trigger\n :self        {:xt/id \"/mount/foo.md\" ,,,}  ;; the doc that has the trigger being called\n :self-path   \"/path/foo.md\"                ;; on-disk path for self\n :target      {:xt/id \"/mount/bar.md\" ,,,}  ;; the doc the trigger is being called over, if any\n :target-path \"/path/bar.md\"                ;; on-disk path for doc, if any\n :results     {,,,}                         ;; query results, if any\n :timestamp 
  \"2024-03-22T16:52:20.995717Z\" ;; schedule timestamp, if any\n :req         {:path-params [\"42\"] ,,,}     ;; http request, if any\n }\n```\n\n\n### Metadata\n\nMetadata is any data you want to put into the DB for your files.\nThen you can query it.\n\nThere's two sources of metadata:\n- readers: for the file extension, edn/md/json/eml are built-in\n- metadata files: `doc.md.meta.edn` is a metadata file for `doc.md`\nBoth are loaded into the database whenever the file changes.\n\nKeys on the `fdb` namespace have special meaning for FileDB.\nReactive triggers are on the `fdb.on` namespace.\n\n``` edn\n;; This both the format of db and on-disk metadata files.\n{;; ID is /mount/ followed by relative path on mount.\n ;; It's the unique id for XTDB.\n ;; Added automatically.\n :xt/id           \"/demos/reference/doc.md\"\n\n ;; Modified is the most recent between doc.md and doc.md.meta.edn.\n ;; Added automatically.\n :fdb/modified    \"2021-03-21T20:00:00.000-00:00\"\n\n ;; The ID of the parent of this ID, useful for recursive queries\n ;; Added automatically.\n :fdb/parent      \"/demos/reference\"\n\n ;; ID references are useful enough in relating docs that they're first class.\n :fdb/refs        #{\"/demos/reference/todo.md\"\n                    \"/demos/reference/ref-one.md\"}\n\n ;; Called when this file, or its metadata, is modified.\n ;; The fn will be called with the call-arg.\n ;; print-call-arg is a function that we added in repl.fdb.edn, so we can use it here.\n :fdb.on/modify   print-call-arg\n\n ;; Called when any file that matches the glob changes.\n ;; It should match ./pattern-glob-match.md.\n :fdb.on/pattern  {:glob \"/demos/reference/*glob*.md\"\n                   :call print-call-arg}\n\n ;; Called when the files referenced in :fdb/refs change.\n ;; Refs will be resolved recursively and you can have cycles, so this triggers\n ;; when ./ref-two.md or ./ref-three are modified too.\n :fdb.on/refs     print-call-arg\n\n ;; Called when the query 
results change.\n ;; The latest results will be in important-files.edn, specified in the :path key.\n ;; You can add triggers to path metadata, or use it as a ref to other triggers.\n :fdb.on/query    {:q    [:find ?e\n                          :where [?e :tags \"important\"]]\n                   :path \"./important-files.edn\"\n                   :call print-call-arg}\n\n ;; Called every hour.\n ;; The :every syntax supports :seconds :hours :days and more, see (keys tick.core/unit-map).\n ;; You can also use a cron schedule, use https://crontab.guru/ to make your cron schedules.\n :fdb.on/schedule {:every [1 :hours]\n                   ;; or :cron \"0 * * * *\"\n                   :call print-call-arg}\n\n ;; Called once on watch startup/shutdown, including restarts.\n :fdb.on/startup  print-call-arg\n :fdb.on/shutdown print-call-arg\n\n ;; Called on every db transaction via https://v1-docs.xtdb.com/clients/clojure/#_listen\n ;; This is how every other trigger is made, so you can make your own triggers.\n :fdb.on/tx       print-call-arg\n\n ;; Prevents all reactive triggers from running when this file changes.\n ;; Use when you have triggers that change their own file, like with email sync.\n ;fdb.on/ignore   true\n }\n```\n\n\n## Hacking on FileDB\n\n[ARCHITECTURE.md](ARCHITECTURE.md) (TODO) has an overview of the main namespaces in FileDB and how they interact.\n\n`fdb watch --debug` starts fdb with extra debug logging.\nConnect to the [nREPL server](https://nrepl.org/nrepl/1.1/index.html), on port 2525 by default, and change stuff.\nCall `(clj-reload.core/reload)` to reload code as you change it; if you have a config watcher running, it will restart as well.\n\nI have TODOs at the end of each file that you can take a look at.\nI find this easier than making issues.\n\nThe `master` branch contains the latest stable code; `dev` is where I work on new 
changes.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffilipesilva%2Ffdb","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ffilipesilva%2Ffdb","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffilipesilva%2Ffdb/lists"}