{"id":15725863,"url":"https://github.com/ndrean/caching_an_api","last_synced_at":"2025-03-31T01:27:01.521Z","repository":{"id":124424263,"uuid":"488025146","full_name":"ndrean/caching_an_api","owner":"ndrean","description":"Elixir cluster on Kubernetes featuring Mnesia, Ets and Redix","archived":false,"fork":false,"pushed_at":"2022-05-25T23:10:27.000Z","size":434,"stargazers_count":3,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"master","last_synced_at":"2025-02-06T06:46:12.247Z","etag":null,"topics":["cluster","elixir-lang","ets","kubernetes","libcluster","mnesia-cluster"],"latest_commit_sha":null,"homepage":"","language":"Elixir","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ndrean.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2022-05-02T23:53:53.000Z","updated_at":"2024-08-30T19:13:13.000Z","dependencies_parsed_at":"2023-08-07T16:37:54.980Z","dependency_job_id":null,"html_url":"https://github.com/ndrean/caching_an_api","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ndrean%2Fcaching_an_api","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ndrean%2Fcaching_an_api/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ndrean%2Fcaching_an_api/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ndrean%2Fcaching_an_api/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/ndre
an","download_url":"https://codeload.github.com/ndrean/caching_an_api/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":246402583,"owners_count":20771341,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["cluster","elixir-lang","ets","kubernetes","libcluster","mnesia-cluster"],"created_at":"2024-10-03T22:24:39.624Z","updated_at":"2025-03-31T01:27:01.496Z","avatar_url":"https://github.com/ndrean.png","language":"Elixir","readme":"# CachingAnApi\n\nTo illustrate the usage of different built-in stores, we cache responses to HTTP calls with different solutions: a GenServer, an Ets data store, a Mnesia database in the case of a distributed cluster, and a CRDT solution.\n\n\u003e Other options, unused here, would rely on external databases, such as Redis with PubSub or Postgres with Listen/Notify.\n\nThere is a module Api for performing dummy HTTP requests. It calls a Cache module.\nThere are two options, one per branch:\n\n- [master]: Cache is just a module that dispatches the reads and writes to the requested data store. Mnesia is a GenServer: it subscribes to Mnesia system events, which trigger the Mnesia cluster startup and update.\n\n- [mnesia-no-gs]: Cache is a GenServer that uses Erlang's node monitoring to trigger the Mnesia start and the cluster update. The Mnesia module is then just a wrapper.\n\nYou can configure which store is used: the state of the Cache GenServer, Ets, Mnesia with or without disc persistence, or CRDT. Set `store:` to `:mn`, `:ets` or `:crdt`, or to `nil` (for the process state). 
Also set `disc_copy` to `:disc_copy` or `nil`, depending on whether you want persistence on each node.\n\nEtsDb is just a module that wraps Ets, and Mnesia may or may not be a supervised GenServer (depending on the branch) since we want to handle network partitions.\n\n## The stores\n\n- [ETS](https://www.erlang.org/doc/man/ets.html)\nIt is a built-in, in-memory key-value data store local to a node, and its lifetime depends on the process that created it. In this case, the app: when we kill all the nodes, this data is lost, which is a wanted feature here.\nThis data store is **not distributed**: other nodes within a cluster can't access it.\nData is saved as tuples and there is no need to serialize values.\nSince we launch Ets in its own process, we used the flag `:public`. Any process can thus read from and write to the Ets table. The operations have to be made atomic to avoid race conditions (for example, no write and then read within the same function, as this could lead to inconsistencies). Ets then offers shared, concurrent read access to the data (meaning that reads scale with the number of CPUs used).\n\nA word about [performance between GenServer and Ets](\u003chttps://prying.io/technical/2019/09/01/caching-options-in-an-elixir-application.html\u003e).\n\n\u003e Check for the improved [ConCache](https://github.com/sasa1977/con_cache) with TTL support.\n\n- [Mnesia](http://erlang.org/documentation/doc-5.2/pdf/mnesia-4.1.pdf)\nMnesia is a built-in, distributed, in-memory and optionally disc-persisted database, built (node-based) for concurrency. It works both in memory (**with Ets**) and on disc. Like Ets, it stores tuples.\nYou can define tables whose structure is defined by a record type.\nIn Mnesia, actions are wrapped within a **transaction**: if something goes wrong while executing a transaction, it is rolled back and nothing is saved in the database. This means the operations are `:atomic`: either all the operations occur, or none of them occurs in case of an error. 
The disc persistence is optional in Mnesia. Set `disc_copy:` to `:disc_copy` or to `nil` in \"config.exs\".\n\n  - storage capacity: the [doc](https://www.erlang.org/faq/mnesia.html) indicates that:\n    - for ram_copies and disc_copies, the entire table is kept in memory, so data size is limited by available RAM.\n    - for disc_copies tables, the entire table needs to be read from disk to memory on node startup, which can take a long time for large tables.\n\n\u003e What's the point of using Mnesia? If you need to keep a database that will be used by multiple processes and/or nodes, using Mnesia means you don't have to write your own access controls.\n\u003e Furthermore, a [word about scalability performance of Mnesia](http://www.dcs.gla.ac.uk/~amirg/publications/DE-Bench.pdf) and [here](https://stackoverflow.com/questions/5044574/how-scalable-is-distributed-erlang).\n\n- [CRDT](https://github.com/derekkraan/delta_crdt_ex)\nDeltaCrdt implements a key/value store using concepts from Delta CRDTs. A CRDT can be used as a distributed temporary caching mechanism that is synced across our cluster. A good introduction to [CRDT](https://moosecode.nl/blog/how_deltacrdt_can_help_write_distributed_elixir_applications).\n\n## The Erlang cluster\n\nIn an Erlang cluster, all nodes are fully connected, with N(N-1)/2, i.e. O(N^2), TCP/IP connections.\nA [word](http://dcs.gla.ac.uk/~natalia/sd-erlang-improving-jpdc-16.pdf) on full P2P Erlang clusters: the performance plateaus at 40 nodes and does not scale beyond 60 nodes.\n\nTo create a cluster from **IEx** sessions, you need to give each node a name and pass the same cookie to each node.\n\n### Launch the nodes\n\n- [name] Use the flag `--sname` (for short name, within the *same* network) and it will assign **:\"a@your-local-system-name\"**. 
If you are not running in the same network, use instead the flag `--name` with a fully qualified domain name or an IP address, such as **:\"a@127.0.0.1\"** or **:\"a@example.com\"**.\n\n```elixir\n# term 1\n\u003e iex --sname a --cookie :my_secret -S mix\niex(a@MacBookND)\u003e\n# or\n\u003e iex --name A@127.0.0.1  --cookie :my_secret -S  mix\niex(A@127.0.0.1)\u003e\n```\n\nSo to launch 3 nodes, run in 3 separate terminals:\n\n```elixir\n#t1\n\u003e iex --name A@127.0.0.1  --cookie :my_secret -S  mix\n#t2\n\u003e iex --name B@127.0.0.1  --cookie :my_secret -S  mix\n#t3\n\u003e iex --name C@127.0.0.1  --cookie :my_secret -S  mix\n```\n\n#### Automatic launch of IEx sessions in new terminals\n\nOn macOS, `chmod +x` the following:\n\n```bash\n#!/bin/bash\n# launcher.sh\nfor i in a b c d\ndo\n    osascript -e \"tell application \\\"Terminal\\\" to do script \\\"iex --sname \"$i\" -S mix\\\"\"\ndone\n```\n\nAlternatively, use [ttab](https://www.npmjs.com/package/ttab):\n\n```bash\nhost=\"@127.0.0.1\"\nfor i in a1 b1 c1\ndo\n  ttab iex --name \"$i$host\" -S mix\ndone\n```\n\n### Connect the nodes\n\n- [connect] Thanks to the **transitivity** of the BEAM connections, you just need to connect one node to the N-1 others to get the full P2P network of N(N-1)/2 TCP connections.\n\n#### Manual connection\n\nWithin one terminal, say t1, run:\n\n```elixir\niex(A@127.0.0.1)\u003e for l \u003c- [\"A\",\"B\",\"C\"], do: String.to_atom(l \u003c\u003e \"@127.0.0.1\") |\u003e Node.connect()\n[true, true, true]\n\n# check with:\niex(A@127.0.0.1)\u003e :net_adm.ping(:\"C@127.0.0.1\")\n:pong\niex(B@127.0.0.1)\u003e :net_adm.ping(:\"C@127.0.0.1\")\n:pong\n```\n\nWith **code**, if you want to connect two machines \"a@node\" and \"b@node\" with respective IPs 192.168.0.1 and 192.168.0.2, then you would do:\n\n```elixir\n# on the \"a@node\" machine\nNode.start :\"a@192.168.0.1\"\nNode.set_cookie :my_secret\n\n# on the \"b@node\" machine\nNode.start :\"b@192.168.0.2\"\nNode.set_cookie 
:my_secret\n\n# from a@node:\nNode.connect(:\"b@192.168.0.2\")\nNode.list()\n[:\"b@192.168.0.2\"]\n# from b@node:\nNode.list()\n[:\"a@192.168.0.1\"]\n```\n\nTo disconnect a node from another, run:\n\n```elixir\niex(:a@127.0.0.1)\u003e Node.disconnect(:\"b@127.0.0.1\")\n```\n\n[TODO] With the compiled code:\n\n```bash\n# file vm.args contains name and cookie\n./bin/...\n```\n\n#### Automatic Cluster connection: `libcluster`\n\nFor automatic clustering, you can use `libcluster` in `epmd` mode (a static list of hosts) or `gossip` mode (discovery over UDP multicast).\n\n\u003e With the `epmd` mode, you need to pass a first host as a config. This remains to be tested between different domains, not only localhost.\n\nWith the `epmd` mode, if you run `b@node\u003e Node.disconnect(A)` from node B, you need to reconnect manually with `a@node\u003e Node.connect(B)`. Both A and B then restart, since Mnesia detects a partition, and this event is captured here to restart the node with a fresh table. With the `gossip` mode, an attempt to disconnect triggers a fresh restart on both the calling and the called nodes.\n\n- set the `gossip` topology for automatic discovery. The settings are in \"config/config.exs\".\n- call `:net_kernel.monitor_nodes(true)` in the GenServer's `init/1` callback to be notified when nodes join or leave\n\n### Remote-Procedure-Call between nodes\n\n\u003e see OTP 25 with module `peer`\n\nYou can execute a function on a remote node. 
You can use `:rpc.call`, or `GenServer.call` with a remote node (if you call a function within a GenServer).\n\nIf you have:\n\n```elixir\niex(:a@127.0.0.1)\u003e EtsDb.get(1)\n\"a\"\niex(:b@127.0.0.1)\u003e EtsDb.get(1)\n\"b\"\n```\n\nthen you see:\n\n```elixir\niex(:a@127.0.0.1)\u003e :rpc.call(:\"b@127.0.0.1\", EtsDb, :get, [1] )\n\"b\"\niex(:c@127.0.0.1)\u003e for node \u003c- Node.list(), do: {node, :rpc.call(node, EtsDb, :get, [1])}\n[\"a@127.0.0.1\": \"a\", \"b@127.0.0.1\": \"b\"]\n```\n\nSuppose we have a client function `Module.nodes` implemented with a callback `:nodes` within a GenServer, then you can use `GenServer.call` to run a remote function on another node (be careful with the braces: the first argument is a `{name, node}` tuple and the message here is the tuple `{:nodes}`).\n\n```elixir\niex(c@127.0.0.1)\u003e GenServer.call({MnDb, :\"b@127.0.0.1\"}, {:nodes})\n[:\"a@127.0.0.1\", :\"c@127.0.0.1\"]\n\niex(c@127.0.0.1)\u003e for node \u003c- Node.list(), do: {node, GenServer.call({MnDb, node}, {:nodes}) }\n[\n  \"a@127.0.0.1\": [:\"b@127.0.0.1\", :\"c@127.0.0.1\"],\n  \"b@127.0.0.1\": [:\"a@127.0.0.1\", :\"c@127.0.0.1\"]\n]\n\n# or use `multicall` -\u003e {results, bad_nodes}\niex(:c@127.0.0.1)\u003e :rpc.multicall(Node.list(), EtsDb, :get, [1])\n{[\"a\", \"b\"], []}\n```\n\n## Ets\n\nSome documents about the data store: [Elixir-lang-org: Ets](https://elixir-lang.org/getting-started/mix-otp/ets.html), [Elixir school: Ets](https://elixirschool.com/en/lessons/storage/ets) and an excellent article about [Ets in production](https://sayan.xyz/posts/elixir-erlang-and-ets-alchemy).\n\nSome useful commands:\n\n- creation: just use `:ets.new`\n- read/write: `:ets.lookup` and `:ets.insert` to respectively \"get\" and \"put\"\n- read all data of the table \":ecache\": `:ets.tab2list(:ecache)`\n\nTo check that the EtsDb GenServer module is supervised:\n\n```elixir\n[Info] Ets cache up: ecache\niex\u003e Process.whereis(EtsDb)\n#PID\u003c0.339.0\u003e\niex\u003e |\u003e 
Process.exit(:shutdown)\ntrue\n[Info] Ets cache up: ecache\niex\u003e Process.whereis(EtsDb)\n#PID\u003c0.344.0\u003e\n```\n\n## Mnesia\n\n### Documentation / Sources\n\nThe [Mnesia](http://erlang.org/documentation/doc-5.2/pdf/mnesia-4.1.pdf) documentation and the [Elixir school lesson](https://elixirschool.com/en/lessons/storage/mnesia). Also [LearnYouSomeErlang](https://learnyousomeerlang.com/mnesia#whats-mnesia).\n\nUseful libraries:\n\n- [Library Mnesiac](https://github.com/beardedeagle/mnesiac/blob/master/lib/mnesiac/store_manager.ex)\n- [Library Amnesia](https://github.com/meh/amnesia)\n\nOther [nice source](https://mbuffa.github.io/tips/20201111-elixir-troubleshooting-mnesia/) or [here](https://www.welcometothejungle.com/fr/articles/redis-mnesia-distributed-database) and a bit about [Amnesia](https://code.tutsplus.com/articles/store-everything-with-elixir-and-mnesia--cms-29821).\n\n### Configuration\n\nAll you need is to give **names** to the tables and a **folder location** to each node for the disc copies.\n\n\u003e The documentation says that the directory must be unique for each node: \"Two nodes must never share the same directory\".\n\nYou can add a node-specific name in the \"config/config.exs\" file. For example: `config :mnesia, dir: 'mndb_#{Node.self()}'`. The \"config/config.exs\" file is used at **build time**, before compilation and dependency loading.\nIf the folder doesn't exist, it will be created.\n\nMnesia can be started in code with `:mnesia.start()`. We can add `:mnesia` to `included_applications` in the MixProject application to remove the VSCode warnings. In single node mode, it is mandatory not to add it to `extra_applications`, since we need to create the schema before starting Mnesia.\n\n### Mnesia system event handler\n\nWe use the Mnesia system event handler by declaring `:mnesia.subscribe(:system)`. 
We have a `handle_info` callback in the Cache module to log the message.\n\n### Single node mode startup\n\n\u003e Do not add `:mnesia` to `:extra_applications` in the MixProject application, since you need to start it manually. Instead, add `included_applications: [:mnesia]`. This also removes the warnings in VSCode. The reason is that you first need to create the schema (meaning you create the database), and only then start Mnesia.\n\nThe sequence is: `:mnesia.create_schema` to create a new database, then `:mnesia.start()`, then `:mnesia.create_table`, where you specify the attributes and also that you want a disc copy on your node. The parameter `disc_copies: [node()]` means that the data is stored both on disc and in memory. Finally, the disc copy directory can be specified in the `config.exs` file.\n\n### Distributed Mnesia startup\n\nThe sequence is:\n\n- start Mnesia. Two options: declare `extra_applications: [:mnesia]` in MixProject or use `:mnesia.start()`.\n- connect the nodes and inform Mnesia that other nodes belong to the cluster,\n- ensure that the data (schema and table) are stored on disc. Two copy functions are used, depending on whether it is the schema or a table.\n\n## Debug\n\nUse `:mnesia.system_info(:all)` to inspect Mnesia in an IEx session. You can also request a single item by passing its key, and you can use it in code.\n\n```bash\niex\u003e :mnesia.system_info(:all)\n[...]\niex\u003e :mnesia.system_info(:running_db_nodes)\n[:\"a@127.0.0.1\", :\"b@127.0.0.1\"]\niex\u003e :mnesia.system_info(:directory)\n'.../mndb_test@mycomputer'\niex\u003e :mnesia.table_info(:mcache, :attributes)\n[:post_id, :data]\n```\n\nYou can run `:ets.tab2list(:mcache)` in a node and this displays the whole Mnesia table which is in RAM.\n\nTo inspect a **GenServer** state, you can use Erlang's `:sys.get_state(genserver_pid)`. We can get the pid with `Process.whereis(Cache)` since we named it.\n\nIn the code, you can add `IO.inspect(value, label: \"check 1\")` or `IO.inspect(binding())` (for the arguments of a function). 
Also `Logger.info(\"#{inspect(state)}\")`.\n\n## RESULTS\n\nUsed `benchee` to run `mix run lib/caching_an_api/benchmark.exs`.\n\n- Cached. The cache is populated with the first pass of the slowest, `yield_many_asynced_stream`.\n\nComparison:\n\n| Name | ips | Comparison | Difference |\n| --- | --- | --- | --- |\n| stream_synced | 2.88 K | | |\n| enum_yield_many | 1.63 K | 1.76x slower | +264.55 μs |\n| asynced_stream | 1.02 K | 2.82x slower | +633.34 μs |\n| yield_many_asynced_stream | 1.00 K | 2.87x slower | +651.90 μs |\n\n## Misc notes\n\n### Actor model vs Object-Oriented\n\n**Objects** encapsulate state and interact with **functions**. **Encapsulation** dictates that the internal data of an object is not accessible directly from the outside; it can only be modified by invoking a set of curated methods. The object is responsible for exposing safe operations that protect the invariant nature of its encapsulated data. Since functions are executed by threads, and since encapsulation only guarantees these invariants for single-threaded access, you need to add mechanisms such as **locks**.\n\n**Actors** interact through **message** passing. They have their own state and a **behavior**, a function that defines how to react to messages.\nInstead of calling methods like objects do, actors *receive* and *send* messages to each other. Sending a message does not transfer the thread of execution from the sender to the destination. An actor can send a message and continue without blocking. Message-passing in actor systems is fundamentally **asynchronous**, i.e. message transmission and reception do not have to happen at the same time, and senders may transmit messages before receivers are ready to accept them. Messages go into actor **mailboxes**. Actors execute independently from the senders of a message, and they react to incoming messages sequentially, one at a time. 
While each actor processes messages sent to it sequentially, different actors work concurrently with each other so that an actor system can process as many messages simultaneously as the hardware will support.\n\nAn important difference between passing messages and calling methods is that messages have no return value. By sending a message, an actor delegates work to another actor.\n**Actors** react to messages just like **objects** react to methods invoked on them.\n\n### Elixir notes\n\n[handle_continue](https://elixirschool.com/blog/til-genserver-handle-continue/)\n\n[GenServer stop](https://alexcastano.com/how-to-stop-a-genserver-in-elixir/)\n\n[Handling events](https://mkaszubowski.com/2021/01/09/elixir-event-handling.html)\n\n### Production release\n\nTake a look at [Render](https://render.com/docs/deploy-elixir-cluster), [Gigalixir](https://gigalixir.com/#/about) and [fly.io](https://fly.io/docs/getting-started/elixir/).\n\n```bash\nmix phx.gen.secret\nxxxx\nexport SECRET_KEY_BASE=\"xxxx\"\nMIX_ENV=prod mix setup\nMIX_ENV=prod mix release\n```\n\n### Enum vs Stream\n\n`Stream` runs each element through the whole chain of functions before moving to the next element, whereas `Enum` runs each function over the whole enumerable before moving to the next function of the chain.\n\n```elixir\niex\u003e [1,2,3]|\u003e Stream.map(\u0026IO.inspect/1) |\u003e Stream.map(\u0026IO.inspect/1) |\u003e Enum.to_list\n1,1,2,2,3,3\niex\u003e  [1,2,3]|\u003e Enum.map(\u0026IO.inspect/1) |\u003e Enum.map(\u0026IO.inspect/1)\n1,2,3,1,2,3\n```\n\n### Bakeware\n\n\u003chttps://www.youtube.com/watch?v=ML5hQjPQL7A\u003e\n\n\u003chttps://github.com/bake-bake-bake/bakeware\u003e\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fndrean%2Fcaching_an_api","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fndrean%2Fcaching_an_api","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fndrean%2Fcaching_an_api/lists"}