{"id":16701442,"url":"https://github.com/ferd/pobox","last_synced_at":"2025-04-09T18:18:59.906Z","repository":{"id":8877206,"uuid":"10592818","full_name":"ferd/pobox","owner":"ferd","description":"External buffer processes to protect against mailbox overflow in Erlang","archived":false,"fork":false,"pushed_at":"2018-12-05T15:24:10.000Z","size":216,"stargazers_count":319,"open_issues_count":2,"forks_count":35,"subscribers_count":20,"default_branch":"master","last_synced_at":"2025-04-09T18:18:54.022Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Erlang","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ferd.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE.txt","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2013-06-10T03:06:33.000Z","updated_at":"2025-03-23T02:07:33.000Z","dependencies_parsed_at":"2022-08-24T13:00:15.660Z","dependency_job_id":null,"html_url":"https://github.com/ferd/pobox","commit_stats":null,"previous_names":[],"tags_count":9,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ferd%2Fpobox","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ferd%2Fpobox/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ferd%2Fpobox/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ferd%2Fpobox/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/ferd","download_url":"https://codeload.github.com/ferd/pobox/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248085325,"owners_count":21045139,"icon_url":"htt
ps://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-10-12T18:44:05.686Z","updated_at":"2025-04-09T18:18:59.888Z","avatar_url":"https://github.com/ferd.png","language":"Erlang","funding_links":[],"categories":[],"sub_categories":[],"readme":"[![Build Status](https://travis-ci.com/ferd/pobox.svg?branch=master)](https://travis-ci.com/ferd/pobox)\n\n# PO Box\n\nHigh throughput Erlang applications often get bitten by the fact that\nErlang mailboxes are unbounded and will keep accepting messages until the\nnode runs out of memory.\n\nIn most cases, this problem can be solved by imposing a rate limit on\nthe producers, and it is recommended to explore this idea before looking\nat this library.\n\nWhen it is impossible to rate-limit the messages coming to a process and\nthat your optimization efforts remain fruitless, you need to start\nshedding load by dropping messages.\n\nPO Box can help by shedding the load for you, and making sure you won't\nrun out of memory.\n\n## The Principles\n\nPO Box is a library that implements a buffer process.  Erlang processes\nwill receive their messages locally (at home), and may become overloaded\nbecause they have to both deal with their mailbox and day-to-day tasks:\n\n             messages\n                |\n                V\n    +-----[Pid or Name]-----+\n    |      |         |      |\n    |      | mailbox |      |\n    |      +---------+      |\n    |       |               |\n    |    receive            |\n    +-----------------------+\n\nA PO Box process will be where you will ask for your messages to go\nthrough. 
The PO Box process will implement a buffer (see *Types of\nBuffer* for details) that will do nothing but churn through messages and\ndrop them when the buffer is full for you.\n\nDepending on how you use the API, the PO Box can tell you it received new data,\nso you can then ask for the data, or you can tell it to  send the data to you\ndirectly, without notification:\n\n                                                      messages\n                                                         |\n                                                         V\n    +---------[Pid]---------+                +--------[POBox]--------+\n    |                       |\u003c-- got mail ---|      |         |      |\n    |                       |                |      | mailbox |      |\n    |   \u003cimportant stuff\u003e   |--- send it! --\u003e|      +---------+      |\n    |                       |                |       |               |\n    |                       |\u003c---\u003cmessages\u003e--|\u003c---buffer             |\n    +-----------------------+                +-----------------------+\n\nTo be more detailed, a PO Box is a state machine with an owner process\n(which it receives messages for), and it has 3 states:\n\n- Active\n- Notify\n- Passive\n\nThe passive state basically does nothing but accumulate messages in the\nbuffer and drop them when necessary.\n\nThe notify state is enabled by the user by calling the PO Box. Its sole\ntask is to verify if there is any message in the buffer. If there is, it\nwill respond to the PO Box's owner with a `{mail, BoxPid, new_data}` message\nsent directly to the pid. If there is no message in the buffer, the\nprocess will wait in the notify state until it gets one. As soon as the\nnotification is sent, it reverts back to the passive state.\n\nThe active state is the only one that can send actual messages to the\nowner process. 
The user can call the PO Box to set it active, and if\nthere are any messages in the buffer, all the messages it contains get\nsent as a list to the owner. If there are no messages, the PO Box waits\nuntil there is one to send it. After forwarding the messages, the PO Box\nreverts to the passive state.\n\nThe FSM can be illustrated as crappy ASCII as:\n\n             ,----\u003e[passive]------(user makes active)-----\u003e[active]\n             |         | ^                                  |  ^  |\n             |         | '---(sends message to user)--\u003c-----'  |  |\n             |  (user makes notify)                            |  |\n             |         |                                       |  |\n    (user is notified) |                                       |  |\n             |         V                                       |  |\n             '-----[notify]---------(user makes active)--------'  |\n                         ^----------(user makes notify)\u003c----------'\n\n## Types of buffer\n\nCurrently, there are three types of built-in buffers supported: queues\nand stacks, and `keep_old` queues. You can also provide your own\nbuffer implementation using the `pobox_buf` behaviour. See\n`samples/pobox_queue_buf.erl` for an example implementation.\n\nQueues will keep messages in order, and drop oldest messages to make\nplace for new ones. If you have a buffer of size 3 and receive messages\na, b, c, d, e in that order, the buffer will contain messages `[c,d,e]`.\n\n`keep_old` queues will keep messages in order, but block newer messages\nfrom entering, favoring keeping old messages instead. If you have a\nbuffer of size 3 and receive messages a, b, c, d, e in that order, the\nbuffer will contain messages `[a,b,c]`.\n\nStacks will not guarantee any message ordering, and will drop the top of\nthe stack to make place for the new messages first. 
For the same\nmessages, the stack buffer should contain the messages `[e,b,a]`.\n\nTo choose between a queue and a stack buffer, you should consider the\nfollowing criteria:\n\n- Do you need messages in order? Choose one of the queues.\n- Do you need the latest messages coming in to be kept, or the oldest\n  ones? Pick `queue` or `keep_old`, respectively.\n- Do you need low latency? Then choose a stack. Stacks will give you\n  many messages with low latency and a few with high latency. Queues\n  will give you a higher overall latency, but less variance over time.\n\nMore buffer types could be supported in the future, if people require\nthem.\n\n## How to build it\n\n    ./rebar compile\n\n## How to run tests\n\n    ./rebar compile ct\n\n## How to use it\n\nStart a buffer with any of the following:\n\n    start_link(OwnerPid, MaxSize, BufferType)\n    start_link(OwnerPid, MaxSize, BufferType, InitialState)\n    start_link(Name, OwnerPid, MaxSize, BufferType)\n    start_link(Name, OwnerPid, MaxSize, BufferType, InitialState)\n    start_link(#{\n        name =\u003e Name,\n        owner =\u003e OwnerPid,\n        max =\u003e MaxSize, %% mandatory\n        type =\u003e BufferType,\n        initial_state =\u003e InitialState,\n        heir =\u003e HeirPid,\n        heir_data =\u003e HeirData\n    })\n    start_link(Name, #{\n        owner =\u003e OwnerPid,\n        max =\u003e MaxSize, %% mandatory\n        type =\u003e BufferType,\n        initial_state =\u003e InitialState,\n        heir =\u003e Heir,\n        heir_data =\u003e HeirData\n    })\n\nWhere:\n\n- `Name` is any name a regular `gen_fsm` process can accept (including\n  `{via,...}` tuples)\n- `OwnerPid` is the pid of the PO Box owner. It's the only one that can\n  communicate with it in terms of setting state and reading messages.\n  The `OwnerPid` can be either a pid or an atom. 
The PO Box will set up\n  a link directly between itself and `OwnerPid`, and won't trap exits.\n  If you're using named processes (atoms) and want to have the PO Box\n  survive them individually, you should unlink the processes manually.\n  This also means that processes that terminate normally won't kill the\n  POBox.\n- `MaxSize` is the maximum number of messages in a buffer. Note that this\n  is a mandatory property.\n- `BufferType` can be either `queue`, `stack` or `keep_old` and specifies\n  which type is going to be used. You can also provide your buffer module\n  using `{mod, Module}`.\n- `InitialState` can be either `passive` or `notify`. The default value\n  is set to `notify`. Having the buffer passive is desirable when you\n  start it during an asynchronous `init` and do not want to receive\n  notifications right away.\n- `Heir` The name or pid of a process that will take over the PO Box if\n  the owner dies. You can use a local registered name such as an atom or\n  you can also use `{global, Name}` and `{via, Module, Name}` if you wish.\n  If the Heir is a name the name is resolved as soon as the Owner dies.\n  A message will be sent to the heir to notify it of the transfer and the\n  PO Box will be put into passive state. 
The format of this message should\n  look like `{pobox_transfer, BoxPid, PreviousOwnerPid, HeirData, Reason}`.\n- `HeirData` is data that should be sent as part of the `pobox_transfer`\n   message to the heir when the owner dies.\n\nThe buffer can be made active by calling:\n\n    pobox:active(BoxPid, FilterFun, FilterState)\n\nThe `FilterFun` is a function that will take messages one by one along\nwith custom state and can return:\n\n- `{{ok, Message}, NewState}`: the message will be sent.\n- `{drop, NewState}`: the message will be dropped.\n- `skip`: the message is left in the buffer and whatever was filtered so\n  far gets sent.\n\nA function that would blindly forward all messages could be written as:\n\n    fun(Msg, _) -\u003e {{ok,Msg},nostate} end\n\nA function that would limit binary messages by size could be written as:\n\n    fun(Msg, Allowed) -\u003e\n        case Allowed - byte_size(Msg) of\n            N when N \u003c 0 -\u003e skip;\n            N -\u003e {{ok, Msg}, N}\n        end\n    end\n\nOr you could drop messages that are empty binaries by doing:\n\n    fun(\u003c\u003c\u003e\u003e, State) -\u003e {drop, State};\n       (Msg, State) -\u003e {{ok,Msg}, State}\n    end.\n\nThe resulting message sent will be:\n\n    {mail, BoxPid, Messages, MessageCount, MessageDropCount}\n\nFinally, the PO Box can be forced to notify by calling:\n\n    pobox:notify(BoxPid)\n\nWhich is objectively much simpler.\n\nMessages can be sent to a PO Box by calling `pobox:post(BoxPid, Msg)` or\nsending a message directly to the process as `BoxPid ! {post, Msg}`.\n\nThe ownership of the PO Box can be transferred to another process by calling:\n\n    pobox:give_away(BoxPid, DestPid, DestData, Timeout)\n\nor\n\n    pobox:give_away(BoxPid, DestPid, Timeout)\n\nwhich is equivalent to:\n\n    pobox:give_away(BoxPid, DestPid, undefined, Timeout)\n\nThe call should return `true` on success and `false` on failure. 
Note that\nyou can only call this from within the owner process; otherwise the call always fails.\nIf `DestData` is not provided, it will be sent as `undefined` in the `pobox_transfer`\nmessage.\n\nThe destination process should receive a message of the following form:\n\n    {pobox_transfer, BoxPid, PreviousOwnerPid, DestData | undefined, give_away}\n\n## Example Session\n\nFirst start a PO Box for the current process:\n\n    1\u003e {ok, Box} = pobox:start_link(self(), 10, queue).\n    {ok,\u003c0.39.0\u003e}\n\nWe'll also define a spammer function that will just keep mailing a bunch\nof messages:\n\n    2\u003e Spam = fun(F,N) -\u003e pobox:post(Box,N), F(F,N+1) end.\n    #Fun\u003cerl_eval.12.17052888\u003e\n\nBecause we're in the shell, the function takes itself as an argument so\nit can both remain anonymous and loop. Each message is an increasing\ninteger.\n\nWe can start the process and wait for a while:\n\n    3\u003e Spammer = spawn(fun() -\u003e Spam(Spam,0) end).\n    \u003c0.42.0\u003e\n\nLet's see if we have anything in our PO Box:\n\n    4\u003e flush().\n    Shell got {mail, \u003c0.39.0\u003e, new_data}\n    ok\n\nYes! Let's get that content:\n\n    5\u003e pobox:active(Box, fun(X,ok) -\u003e {{ok,X},ok} end, ok).\n    ok\n    6\u003e flush().\n    Shell got {mail,\u003c0.39.0\u003e,\n                    [778918,778919,778920,778921,778922,778923,778924,778925,\n                     778926,778927],\n                    10,778918}\n    ok\n\nSo we have 10 messages with sequential IDs (we used a queue buffer), and\nthe process kindly dropped over 700,000 messages for us, keeping our\nnode's memory safe.\n\nThe spammer is still going and our PO Box is in passive mode. 
Let's cut\nto the chase and go directly to the active state:\n\n    7\u003e pobox:active(Box, fun(X,ok) -\u003e {{ok,X},ok} end, ok).\n    ok\n    8\u003e flush().\n    Shell got {mail,\u003c0.39.0\u003e,\n                    [1026883,1026884,1026885,1026886,1026887,1026888,1026889,\n                     1026890,1026891,1026892],\n                    10,247955}\n    ok\n\nNice. We can go back to notification mode too:\n\n    9\u003e pobox:notify(Box).\n    ok\n    10\u003e flush().\n    Shell got {mail, \u003c0.39.0\u003e, new_data}\n    ok\n\nAnd keep going on and on and on.\n\n## Notes\n\n- Be careful to have a lightweight filter function if you expect constant\n  overload from messages that keep coming very very fast. While the\n  buffer filters out whatever messages you have, the new ones keep\n  accumulating in the PO Box's own mailbox!\n- It is possible for a process to have multiple PO Boxes, although\n  coordinating the multiple state machines together may get tricky.\n- The library is a generalization of ideas designed and implemented in\n  logplex by Geoff Cant (@archaelus). Props to him.\n- Using a `keep_old` buffer with a filter function that selects one message\n  at a time would be equivalent to a naive bounded mailbox similar to what\n  plenty of users asked for before. Tricking the filter function into\n  forwarding the message (`self() ! 
Msg`) while dropping it will allow\n  to do selective receives on bounded mailboxes.\n- When using `post_sync/3` keep in mind that full doesn't mean your message\n  will be dropped unless you are using the `keep_old` buffer type or\n  a custom buffer that behaves the same way as `keep_old`.\n\n## Contributing\n\nAccepted contributions need to be non-aesthetic, and provide some new\nfunctionality, fix abstractions, improve performance or semantics, and\nso on.\n\nAll changes received must be tested and not break existing tests.\n\nChanges to currently untested functionality should ideally first provide\na separate commit that shows the current behaviour working with the new\ntests (or some of the new tests, if you expand on the functionality),\nand then your own feature (and additional tests if required) in its own\ncommit so we can verify nothing breaks in unpredictable ways.\n\nTests are written using Common Test. PropEr tests will be accepted,\nbecause they objectively rule. Ideally, you will wrap your PropEr tests\nin a Common Test suite so we can run everything with one command.\n\nIf you need help, feel free to ask for it in issues or pull requests.\nThese rules are strict, but we're nice people!\n\n## Roadmap\n\nThis is more a wishlist than a roadmap, in no particular order:\n\n- Provide default filter functions in a new module\n\n## Changelog\n- 1.2.0: added heir and `give_away` functionality / fixed `keep_old` buffer size tracking\n- 1.1.0: added `pobox_buf` behaviour to add custom buffer implementations\n- 1.0.4: move to gen\\_statem implementation to avoid OTP 21 compile errors and OTP 20 warnings\n- 1.0.3: fix typespecs to generate fewer errors\n- 1.0.2: explicitly specify `registered` to be `[]` for\n         relx compatibility, switch to rebar3\n- 1.0.1: fixing bug where manually dropped messages (with the active filter)\n         would result in wrong size values and crashes for queues.\n- 1.0.0: A PO Box links itself to the process that it receives 
data for.\n- 0.2.0: Added PO Box's pid in the `newdata` message so a process can own more\n         than a PO Box. Changed internal queue and stack size monitoring to be\n         O(1) in all cases.\n- 0.1.1: adding `keep_old` queue, which blocks new messages from entering\n         a filled queue.\n- 0.1.0: initial commit\n\n## Authors / Thanks\n\n- Fred Hebert / @ferd: library generalization and current implementation\n- Geoff Cant / @archaelus: design, original implementation\n- Jean-Samuel Bédard / @jsbed: adaptation to gen\\_statem behaviour\n- Eric des Courtis / @edescourtis: added `pobox_buf` behaviour \u0026 heir/give\\_away functionality\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fferd%2Fpobox","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fferd%2Fpobox","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fferd%2Fpobox/lists"}