{"id":13417845,"url":"https://github.com/cameron314/concurrentqueue","last_synced_at":"2025-05-15T01:00:43.549Z","repository":{"id":22906693,"uuid":"26255337","full_name":"cameron314/concurrentqueue","owner":"cameron314","description":"A fast multi-producer, multi-consumer lock-free concurrent queue for C++11","archived":false,"fork":false,"pushed_at":"2025-03-02T01:15:10.000Z","size":5048,"stargazers_count":10881,"open_issues_count":65,"forks_count":1770,"subscribers_count":340,"default_branch":"master","last_synced_at":"2025-04-22T20:08:09.916Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"C++","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/cameron314.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE.md","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2014-11-06T05:29:05.000Z","updated_at":"2025-04-22T13:27:41.000Z","dependencies_parsed_at":"2025-02-19T03:00:33.297Z","dependency_job_id":"db0a66db-3bc2-47f6-9696-7b4395c3291e","html_url":"https://github.com/cameron314/concurrentqueue","commit_stats":null,"previous_names":[],"tags_count":6,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cameron314%2Fconcurrentqueue","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cameron314%2Fconcurrentqueue/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cameron314%2Fconcurrentqueue/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cameron314%2Fconcurrentqueue/manifests","owner_url":"https://repos.ecosyst
e.ms/api/v1/hosts/GitHub/owners/cameron314","download_url":"https://codeload.github.com/cameron314/concurrentqueue/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":252881964,"owners_count":21819148,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-07-30T22:00:53.859Z","updated_at":"2025-05-07T12:46:46.267Z","avatar_url":"https://github.com/cameron314.png","language":"C++","readme":"# moodycamel::ConcurrentQueue\u003cT\u003e\r\n\r\nAn industrial-strength lock-free queue for C++.\r\n\r\nNote: If all you need is a single-producer, single-consumer queue, I have [one of those too][spsc].\r\n\r\n## Features\r\n\r\n- Knock-your-socks-off [blazing fast performance][benchmarks].\r\n- Single-header implementation. Just drop it in your project.\r\n- Fully thread-safe lock-free queue. Use concurrently from any number of threads.\r\n- C++11 implementation -- elements are moved (instead of copied) where possible.\r\n- Templated, obviating the need to deal exclusively with pointers -- memory is managed for you.\r\n- No artificial limitations on element types or maximum count.\r\n- Memory can be allocated once up-front, or dynamically as needed.\r\n- Fully portable (no assembly; all is done through standard C++11 primitives).\r\n- Supports super-fast bulk operations.\r\n- Includes a low-overhead blocking version (BlockingConcurrentQueue).\r\n- Exception safe.\r\n\r\n## Reasons to use\r\n\r\nThere are not that many full-fledged lock-free queues for C++. 
Boost has one, but it's limited to objects with trivial\r\nassignment operators and trivial destructors, for example.\r\nIntel's TBB queue isn't lock-free, and requires trivial constructors too.\r\nThere are many academic papers that implement lock-free queues in C++, but usable source code is\r\nhard to find, and tests even more so.\r\n\r\nThis queue not only has fewer limitations than others (for the most part), but [it's also faster][benchmarks].\r\nIt's been fairly well-tested, and offers advanced features like **bulk enqueueing/dequeueing**\r\n(which, with my new design, is much faster than one element at a time, approaching and even surpassing\r\nthe speed of a non-concurrent queue even under heavy contention).\r\n\r\nIn short, there was a lock-free queue shaped hole in the C++ open-source universe, and I set out\r\nto fill it with the fastest, most complete, and well-tested design and implementation I could.\r\nThe result is `moodycamel::ConcurrentQueue` :-)\r\n\r\n## Reasons *not* to use\r\n\r\nThe fastest synchronization of all is the kind that never takes place. Fundamentally,\r\nconcurrent data structures require some synchronization, and that takes time. Every effort\r\nwas made, of course, to minimize the overhead, but if you can avoid sharing data between\r\nthreads, do so!\r\n\r\nWhy use concurrent data structures at all, then? Because they're gosh darn convenient! (And, indeed,\r\nsometimes sharing data concurrently is unavoidable.)\r\n\r\nMy queue is **not linearizable** (see the next section on high-level design). The foundations of\r\nits design assume that producers are independent; if this is not the case, and your producers\r\nco-ordinate amongst themselves in some fashion, be aware that the elements won't necessarily\r\ncome out of the queue in the same order they were put in *relative to the ordering formed by that co-ordination*\r\n(but they will still come out in the order they were put in by any *individual* producer). 
If this affects\r\nyour use case, you may be better off with another implementation; either way, it's an important limitation\r\nto be aware of.\r\n\r\nMy queue is also **not NUMA aware**, and does a lot of memory re-use internally, meaning it probably doesn't\r\nscale particularly well on NUMA architectures; however, I don't know of any other lock-free queue that *is*\r\nNUMA aware (except for [SALSA][salsa], which is very cool, but has no publicly available implementation that I know of).\r\n\r\nFinally, the queue is **not sequentially consistent**; there *is* a happens-before relationship between when an element is put\r\nin the queue and when it comes out, but other things (such as pumping the queue until it's empty) require more thought\r\nto get right in all eventualities, because explicit memory ordering may have to be done to get the desired effect. In other words,\r\nit can sometimes be difficult to use the queue correctly. This is why it's a good idea to follow the [samples][samples.md] where possible.\r\nOn the other hand, the upside of this lack of sequential consistency is better performance.\r\n\r\n## High-level design\r\n\r\nElements are stored internally using contiguous blocks instead of linked lists for better performance.\r\nThe queue is made up of a collection of sub-queues, one for each producer. When a consumer\r\nwants to dequeue an element, it checks all the sub-queues until it finds one that's not empty.\r\nAll of this is largely transparent to the user of the queue, however -- it mostly just works\u003csup\u003eTM\u003c/sup\u003e.\r\n\r\nOne particular consequence of this design, however, (which seems to be non-intuitive) is that if two producers\r\nenqueue at the same time, there is no defined ordering between the elements when they're later dequeued.\r\nNormally this is fine, because even with a fully linearizable queue there'd be a race between the producer\r\nthreads and so you couldn't rely on the ordering anyway. 
However, if for some reason you do extra explicit synchronization\r\nbetween the two producer threads yourself, thus defining a total order between enqueue operations, you might expect\r\nthat the elements would come out in the same total order, which is a guarantee my queue does not offer. At that\r\npoint, though, there semantically aren't really two separate producers, but rather one that happens to be spread\r\nacross multiple threads. In this case, you can still establish a total ordering with my queue by creating\r\na single producer token, and using that from both threads to enqueue (taking care to synchronize access to the token,\r\nof course, but there was already extra synchronization involved anyway).\r\n\r\nI've written a more detailed [overview of the internal design][blog], as well as [the full\r\nnitty-gritty details of the design][design], on my blog. Finally, the\r\n[source][source] itself is available for perusal for those interested in its implementation.\r\n\r\n## Basic use\r\n\r\nThe entire queue's implementation is contained in **one header**, [`concurrentqueue.h`][concurrentqueue.h].\r\nSimply download and include that to use the queue. The blocking version is in a separate header,\r\n[`blockingconcurrentqueue.h`][blockingconcurrentqueue.h], that depends on [`concurrentqueue.h`][concurrentqueue.h] and\r\n[`lightweightsemaphore.h`][lightweightsemaphore.h]. The implementation makes use of certain key C++11 features,\r\nso it requires a relatively recent compiler (e.g. VS2012+ or g++ 4.8; note that g++ 4.6 has a known bug with `std::atomic`\r\nand is thus not supported). 
The algorithm implementations themselves are platform independent.\r\n\r\nUse it like you would any other templated queue, with the exception that you can use\r\nit from many threads at once :-)\r\n\r\nSimple example:\r\n\r\n```C++\r\n#include \"concurrentqueue.h\"\r\n\r\nmoodycamel::ConcurrentQueue\u003cint\u003e q;\r\nq.enqueue(25);\r\n\r\nint item;\r\nbool found = q.try_dequeue(item);\r\nassert(found \u0026\u0026 item == 25);\r\n```\r\n\r\nDescription of basic methods:\r\n- `ConcurrentQueue(size_t initialSizeEstimate)`\r\n      Constructor which optionally accepts an estimate of the number of elements the queue will hold\r\n- `enqueue(T\u0026\u0026 item)`\r\n      Enqueues one item, allocating extra space if necessary\r\n- `try_enqueue(T\u0026\u0026 item)`\r\n      Enqueues one item, but only if enough memory is already allocated\r\n- `try_dequeue(T\u0026 item)`\r\n      Dequeues one item, returning true if an item was found or false if the queue appeared empty\r\n\r\nNote that it is up to the user to ensure that the queue object is completely constructed before\r\nbeing used by any other threads (this includes making the memory effects of construction\r\nvisible, possibly via a memory barrier). Similarly, it's important that all threads have\r\nfinished using the queue (and the memory effects have fully propagated) before it is\r\ndestructed.\r\n\r\nThere are usually two versions of each method, one \"explicit\" version that takes a user-allocated per-producer or\r\nper-consumer token, and one \"implicit\" version that works without tokens. Using the explicit methods is almost\r\nalways faster (though not necessarily by a huge factor). 
Apart from performance, the primary distinction between them\r\nis their sub-queue allocation behaviour for enqueue operations: Using the implicit enqueue methods causes a\r\nthread-local producer sub-queue to be allocated automatically.\r\nExplicit producers, on the other hand, are tied directly to their tokens' lifetimes (but are recycled internally).\r\n\r\nIn order to avoid the number of sub-queues growing without bound, implicit producers are marked for reuse once\r\ntheir thread exits. However, this is not supported on all platforms. If using the queue from short-lived threads,\r\nit is recommended to use explicit producer tokens instead.\r\n\r\nFull API (pseudocode):\r\n\r\n\t# Allocates more memory if necessary\r\n\tenqueue(item) : bool\r\n\tenqueue(prod_token, item) : bool\r\n\tenqueue_bulk(item_first, count) : bool\r\n\tenqueue_bulk(prod_token, item_first, count) : bool\r\n\t\r\n\t# Fails if not enough memory to enqueue\r\n\ttry_enqueue(item) : bool\r\n\ttry_enqueue(prod_token, item) : bool\r\n\ttry_enqueue_bulk(item_first, count) : bool\r\n\ttry_enqueue_bulk(prod_token, item_first, count) : bool\r\n\t\r\n\t# Attempts to dequeue from the queue (never allocates)\r\n\ttry_dequeue(item\u0026) : bool\r\n\ttry_dequeue(cons_token, item\u0026) : bool\r\n\ttry_dequeue_bulk(item_first, max) : size_t\r\n\ttry_dequeue_bulk(cons_token, item_first, max) : size_t\r\n\t\r\n\t# If you happen to know which producer you want to dequeue from\r\n\ttry_dequeue_from_producer(prod_token, item\u0026) : bool\r\n\ttry_dequeue_bulk_from_producer(prod_token, item_first, max) : size_t\r\n\t\r\n\t# A not-necessarily-accurate count of the total number of elements\r\n\tsize_approx() : size_t\r\n\r\n## Blocking version\r\n\r\nAs mentioned above, a full blocking wrapper of the queue is provided that adds\r\n`wait_dequeue` and `wait_dequeue_bulk` methods in addition to the regular interface.\r\nThis wrapper is extremely low-overhead, but slightly less fast than the 
non-blocking\r\nqueue (due to the necessary bookkeeping involving a lightweight semaphore).\r\n\r\nThere are also timed versions that allow a timeout to be specified (either in microseconds\r\nor with a `std::chrono` object).\r\n\r\nThe only major caveat with the blocking version is that you must be careful not to\r\ndestroy the queue while somebody is waiting on it. This generally means you need to\r\nknow for certain that another element is going to come along before you call one of\r\nthe blocking methods. (To be fair, the non-blocking version cannot be destroyed while\r\nin use either, but it can be easier to coordinate the cleanup.)\r\n\r\nBlocking example:\r\n\r\n```C++\r\n#include \"blockingconcurrentqueue.h\"\r\n\r\nmoodycamel::BlockingConcurrentQueue\u003cint\u003e q;\r\nstd::thread producer([\u0026]() {\r\n    for (int i = 0; i != 100; ++i) {\r\n        std::this_thread::sleep_for(std::chrono::milliseconds(i % 10));\r\n        q.enqueue(i);\r\n    }\r\n});\r\nstd::thread consumer([\u0026]() {\r\n    for (int i = 0; i != 100; ++i) {\r\n        int item;\r\n        q.wait_dequeue(item);\r\n        assert(item == i);\r\n        \r\n        if (q.wait_dequeue_timed(item, std::chrono::milliseconds(5))) {\r\n            ++i;\r\n            assert(item == i);\r\n        }\r\n    }\r\n});\r\nproducer.join();\r\nconsumer.join();\r\n\r\nassert(q.size_approx() == 0);\r\n```\r\n\r\n## Advanced features\r\n\r\n#### Tokens\r\n\r\nThe queue can take advantage of extra per-producer and per-consumer storage if\r\nit's available to speed up its operations. 
This takes the form of \"tokens\":\r\nYou can create a consumer token and/or a producer token for each thread or task\r\n(tokens themselves are not thread-safe), and use the methods that accept a token\r\nas their first parameter:\r\n\r\n```C++\r\nmoodycamel::ConcurrentQueue\u003cint\u003e q;\r\n\r\nmoodycamel::ProducerToken ptok(q);\r\nq.enqueue(ptok, 17);\r\n\r\nmoodycamel::ConsumerToken ctok(q);\r\nint item;\r\nq.try_dequeue(ctok, item);\r\nassert(item == 17);\r\n```\r\n\r\nIf you happen to know which producer you want to consume from (e.g. in\r\na single-producer, multi-consumer scenario), you can use the `try_dequeue_from_producer`\r\nmethods, which accept a producer token instead of a consumer token, and cut some overhead.\r\n\r\nNote that tokens work with the blocking version of the queue too.\r\n\r\nWhen producing or consuming many elements, the most efficient way is to:\r\n\r\n1. Use the bulk methods of the queue with tokens\r\n2. Failing that, use the bulk methods without tokens\r\n3. Failing that, use the single-item methods with tokens\r\n4. Failing that, use the single-item methods without tokens\r\n\r\nHaving said that, don't create tokens willy-nilly -- ideally there would be\r\none token (of each kind) per thread. The queue will work with what it is\r\ngiven, but it performs best when used with tokens.\r\n\r\nNote that tokens aren't actually tied to any given thread; it's not technically\r\nrequired that they be local to the thread, only that they be used by a single\r\nproducer/consumer at a time.\r\n\r\n#### Bulk operations\r\n\r\nThanks to the [novel design][blog] of the queue, it's just as easy to enqueue/dequeue multiple\r\nitems as it is to do one at a time. This means that overhead can be cut drastically for\r\nbulk operations. 
Example syntax:\r\n\r\n```C++\r\nmoodycamel::ConcurrentQueue\u003cint\u003e q;\r\n\r\nint items[] = { 1, 2, 3, 4, 5 };\r\nq.enqueue_bulk(items, 5);\r\n\r\nint results[5];     // Could also be any iterator\r\nsize_t count = q.try_dequeue_bulk(results, 5);\r\nfor (size_t i = 0; i != count; ++i) {\r\n    assert(results[i] == items[i]);\r\n}\r\n```\r\n\r\n#### Preallocation (correctly using `try_enqueue`)\r\n\r\n`try_enqueue`, unlike just plain `enqueue`, will never allocate memory. If there's not enough room in the\r\nqueue, it simply returns false. The key to using this method properly, then, is to ensure enough space is\r\npre-allocated for your desired maximum element count.\r\n\r\nThe constructor accepts a count of the number of elements that it should reserve space for. Because the\r\nqueue works with blocks of elements, however, and not individual elements themselves, the value to pass\r\nin order to obtain an effective number of pre-allocated element slots is non-obvious.\r\n\r\nFirst, be aware that the count passed is rounded up to the next multiple of the block size. Note that the\r\ndefault block size is 32 (this can be changed via the traits). Second, once a slot in a block has been\r\nenqueued to, that slot cannot be re-used until the rest of the block has been completely filled\r\nup and then completely emptied. This affects the number of blocks you need in order to account for the\r\noverhead of partially-filled blocks. Third, each producer (whether implicit or explicit) claims and recycles\r\nblocks in a different manner, which again affects the number of blocks you need to account for a desired number of\r\nusable slots.\r\n\r\nSuppose you want the queue to be able to hold at least `N` elements at any given time. Without delving too\r\ndeep into the rather arcane implementation details, here are some simple formulas for the number of elements\r\nto request for pre-allocation in such a case. 
Note the division is intended to be arithmetic division and not\r\ninteger division (in order for `ceil()` to work).\r\n\r\nFor explicit producers (using tokens to enqueue):\r\n\r\n```C++\r\n(ceil(N / BLOCK_SIZE) + 1) * MAX_NUM_PRODUCERS * BLOCK_SIZE\r\n```\r\n\r\nFor implicit producers (no tokens):\r\n\r\n```C++\r\n(ceil(N / BLOCK_SIZE) - 1 + 2 * MAX_NUM_PRODUCERS) * BLOCK_SIZE\r\n```\r\n\r\nWhen using mixed producer types:\r\n\r\n```C++\r\n((ceil(N / BLOCK_SIZE) - 1) * (MAX_EXPLICIT_PRODUCERS + 1) + 2 * (MAX_IMPLICIT_PRODUCERS + MAX_EXPLICIT_PRODUCERS)) * BLOCK_SIZE\r\n```\r\n\r\nIf these formulas seem rather inconvenient, you can use the constructor overload that accepts the minimum\r\nnumber of elements (`N`) and the maximum number of explicit and implicit producers directly, and let it do the\r\ncomputation for you.\r\n\r\nIn addition to blocks, there are other internal data structures that require allocating memory if they need to resize (grow).\r\nIf using `try_enqueue` exclusively, the initial sizes may be exceeded, causing subsequent `try_enqueue` operations to fail.\r\nSpecifically, the `INITIAL_IMPLICIT_PRODUCER_HASH_SIZE` trait limits the number of implicit producers that can be active at once\r\nbefore the internal hash needs resizing. Along the same lines, the `IMPLICIT_INITIAL_INDEX_SIZE` trait limits the number of\r\nunconsumed elements that an implicit producer can insert before its internal hash needs resizing. Similarly, the\r\n`EXPLICIT_INITIAL_INDEX_SIZE` trait limits the number of unconsumed elements that an explicit producer can insert before its\r\ninternal hash needs resizing. 
In order to avoid hitting these limits when using `try_enqueue`, it is crucial to adjust the\r\ninitial sizes in the traits appropriately, in addition to sizing the number of blocks properly as outlined above.\r\n\r\nFinally, it's important to note that because the queue is only eventually consistent and takes advantage of\r\nweak memory ordering for speed, there's always a possibility that under contention `try_enqueue` will fail\r\neven if the queue is correctly pre-sized for the desired number of elements. (e.g. A given thread may think that\r\nthe queue's full even when that's no longer the case.) So no matter what, you still need to handle the failure\r\ncase (perhaps looping until it succeeds), unless you don't mind dropping elements.\r\n\r\n#### Exception safety\r\n\r\nThe queue is exception safe, and will never become corrupted if used with a type that may throw exceptions.\r\nThe queue itself never throws any exceptions (operations fail gracefully (return false) if memory allocation\r\nfails instead of throwing `std::bad_alloc`).\r\n\r\nIt is important to note that the guarantees of exception safety only hold if the element type never throws\r\nfrom its destructor, and that any iterators passed into the queue (for bulk operations) never throw either.\r\nNote that in particular this means `std::back_inserter` iterators must be used with care, since the vector\r\nbeing inserted into may need to allocate and throw a `std::bad_alloc` exception from inside the iterator;\r\nso be sure to reserve enough capacity in the target container first if you do this.\r\n\r\nThe guarantees are presently as follows:\r\n- Enqueue operations are rolled back completely if an exception is thrown from an element's constructor.\r\n  For bulk enqueue operations, this means that elements are copied instead of moved (in order to avoid\r\n  having only some objects moved in the event of an exception). 
Non-bulk enqueues always use\r\n  the move constructor if one is available.\r\n- If the assignment operator throws during a dequeue operation (both single and bulk), the element(s) are\r\n  considered dequeued regardless. In such a case, the dequeued elements are all properly destructed before\r\n  the exception is propagated, but there's no way to get the elements themselves back.\r\n- Any exception that is thrown is propagated up the call stack, at which point the queue is in a consistent\r\n  state.\r\n\r\nNote: If any of your type's copy constructors/move constructors/assignment operators don't throw, be sure\r\nto annotate them with `noexcept`; this will avoid the exception-checking overhead in the queue where possible\r\n(even with zero-cost exceptions, there's still a code size impact that has to be taken into account).\r\n\r\n#### Traits\r\n\r\nThe queue also supports a traits template argument which defines various types, constants,\r\nand the memory allocation and deallocation functions that are to be used by the queue. The typical pattern\r\nto providing your own traits is to create a class that inherits from the default traits\r\nand override only the values you wish to change. Example:\r\n\r\n```C++\r\nstruct MyTraits : public moodycamel::ConcurrentQueueDefaultTraits\r\n{\r\n\tstatic const size_t BLOCK_SIZE = 256;\t\t// Use bigger blocks\r\n};\r\n\r\nmoodycamel::ConcurrentQueue\u003cint, MyTraits\u003e q;\r\n```\r\n\r\n#### How to dequeue types without calling the constructor\r\n\r\nThe normal way to dequeue an item is to pass in an existing object by reference, which\r\nis then assigned to internally by the queue (using the move-assignment operator if possible).\r\nThis can pose a problem for types that are\r\nexpensive to construct or don't have a default constructor; fortunately, there is a simple\r\nworkaround: Create a wrapper class that copies the memory contents of the object when it\r\nis assigned by the queue (a poor man's move, essentially). 
Note that this only works if\r\nthe object contains no internal pointers. Example:\r\n\r\n```C++\r\nstruct MyObjectMover {\r\n    inline void operator=(MyObject\u0026\u0026 obj) {\r\n        std::memcpy(data, \u0026obj, sizeof(MyObject));\r\n        \r\n        // TODO: Cleanup obj so that when it's destructed by the queue\r\n        // it doesn't corrupt the data of the object we just moved it into\r\n    }\r\n\r\n    inline MyObject\u0026 obj() { return *reinterpret_cast\u003cMyObject*\u003e(data); }\r\n\r\nprivate:\r\n    alignas(MyObject) char data[sizeof(MyObject)];\r\n};\r\n```\r\n\r\nA less dodgy alternative, if moves are cheap but default construction is not, is to use a\r\nwrapper that defers construction until the object is assigned, enabling use of the move\r\nconstructor:\r\n\r\n```C++\r\nstruct MyObjectMover {\r\n    inline void operator=(MyObject\u0026\u0026 x) {\r\n        new (data) MyObject(std::move(x));\r\n        created = true;\r\n    }\r\n\r\n    inline MyObject\u0026 obj() {\r\n        assert(created);\r\n        return *reinterpret_cast\u003cMyObject*\u003e(data);\r\n    }\r\n\r\n    ~MyObjectMover() {\r\n        if (created)\r\n            obj().~MyObject();\r\n    }\r\n\r\nprivate:\r\n    alignas(MyObject) char data[sizeof(MyObject)];\r\n    bool created = false;\r\n};\r\n```\r\n\r\n## Samples\r\n\r\nThere are some more detailed samples [here][samples.md]. 
The source of\r\nthe [unit tests][unittest-src] and [benchmarks][benchmark-src] are available for reference as well.\r\n\r\n## Benchmarks\r\n\r\nSee my blog post for some [benchmark results][benchmarks] (including versus `boost::lockfree::queue` and `tbb::concurrent_queue`),\r\nor run the benchmarks yourself (requires MinGW and certain GnuWin32 utilities to build on Windows, or a recent\r\ng++ on Linux):\r\n\r\n```Shell\r\ncd build\r\nmake benchmarks\r\nbin/benchmarks\r\n```\r\n\r\nThe short version of the benchmarks is that it's so fast (especially the bulk methods), that if you're actually\r\nusing the queue to *do* anything, the queue won't be your bottleneck.\r\n\r\n## Tests (and bugs)\r\n\r\nI've written quite a few unit tests as well as a randomized long-running fuzz tester. I also ran the\r\ncore queue algorithm through the [CDSChecker][cdschecker] C++11 memory model model checker. Some of the\r\ninner algorithms were tested separately using the [Relacy][relacy] model checker, and full integration\r\ntests were also performed with Relacy.\r\nI've tested\r\non Linux (Fedora 19) and Windows (7), but only on x86 processors so far (Intel and AMD). The code was\r\nwritten to be platform-independent, however, and should work across all processors and OSes.\r\n\r\nDue to the complexity of the implementation and the difficult-to-test nature of lock-free code in general,\r\nthere may still be bugs. If anyone is seeing buggy behaviour, I'd like to hear about it! (Especially if\r\na unit test for it can be cooked up.) 
Just open an issue on GitHub.\r\n\t\r\n## Using vcpkg\r\nYou can download and install `moodycamel::ConcurrentQueue` using the [vcpkg](https://github.com/Microsoft/vcpkg) dependency manager:\r\n\r\n```Shell\r\ngit clone https://github.com/Microsoft/vcpkg.git\r\ncd vcpkg\r\n./bootstrap-vcpkg.sh\r\n./vcpkg integrate install\r\n./vcpkg install concurrentqueue\r\n```\r\n\t\r\nThe `moodycamel::ConcurrentQueue` port in vcpkg is kept up to date by Microsoft team members and community contributors. If the version is out of date, please [create an issue or pull request](https://github.com/Microsoft/vcpkg) on the vcpkg repository.\r\n\r\n## License\r\n\r\nI'm releasing the source of this repository (with the exception of third-party code, i.e. the Boost queue\r\n(used in the benchmarks for comparison), Intel's TBB library (ditto), CDSChecker, Relacy, and Jeff Preshing's\r\ncross-platform semaphore, which all have their own licenses)\r\nunder a simplified BSD license. I'm also dual-licensing under the Boost Software License.\r\nSee the [LICENSE.md][license] file for more details.\r\n\r\nNote that lock-free programming is a patent minefield, and this code may very\r\nwell violate a pending patent (I haven't looked), though it does not to my present knowledge.\r\nI did design and implement this queue from scratch.\r\n\r\n## Diving into the code\r\n\r\nIf you're interested in the source code itself, it helps to have a rough idea of how it's laid out. This\r\nsection attempts to describe that.\r\n\r\nThe queue is formed of several basic parts (listed here in roughly the order they appear in the source). There's the\r\nhelper functions (e.g. for rounding to a power of 2). There's the default traits of the queue, which contain the\r\nconstants and malloc/free functions used by the queue. There's the producer and consumer tokens. Then there's the queue's\r\npublic API itself, starting with the constructor, destructor, and swap/assignment methods. 
There's the public enqueue methods,\r\nwhich are all wrappers around a small set of private enqueue methods found later on. There's the dequeue methods, which are\r\ndefined inline and are relatively straightforward.\r\n\r\nThen there's all the main internal data structures. First, there's a lock-free free list, used for recycling spent blocks (elements\r\nare enqueued to blocks internally). Then there's the block structure itself, which has two different ways of tracking whether\r\nit's fully emptied or not (remember, given two parallel consumers, there's no way to know which one will finish first) depending on where it's used.\r\nThen there's a small base class for the two types of internal SPMC producer queues (one for explicit producers that holds onto memory\r\nbut attempts to be faster, and one for implicit ones which attempt to recycle more memory back into the parent but is a little slower).\r\nThe explicit producer is defined first, then the implicit one. They both contain the same general four methods: One to enqueue, one to\r\ndequeue, one to enqueue in bulk, and one to dequeue in bulk. (Obviously they have constructors and destructors too, and helper methods.)\r\nThe main difference between them is how the block handling is done (they both use the same blocks, but in different ways, and map indices\r\nto them in different ways).\r\n\r\nFinally, there's the miscellaneous internal methods: There's the ones that handle the initial block pool (populated when the queue is constructed),\r\nand an abstract block pool that comprises the initial pool and any blocks on the free list. There's ones that handle the producer list\r\n(a lock-free add-only linked list of all the producers in the system). There's ones that handle the implicit producer lookup table (which\r\nis really a sort of specialized TLS lookup). 
And then there's some helper methods for allocating and freeing objects, and the data members\r\nof the queue itself, followed lastly by the free-standing swap functions.\r\n\r\n\r\n[blog]: http://moodycamel.com/blog/2014/a-fast-general-purpose-lock-free-queue-for-c++\r\n[design]: http://moodycamel.com/blog/2014/detailed-design-of-a-lock-free-queue\r\n[samples.md]: https://github.com/cameron314/concurrentqueue/blob/master/samples.md\r\n[source]: https://github.com/cameron314/concurrentqueue\r\n[concurrentqueue.h]: https://github.com/cameron314/concurrentqueue/blob/master/concurrentqueue.h\r\n[blockingconcurrentqueue.h]: https://github.com/cameron314/concurrentqueue/blob/master/blockingconcurrentqueue.h\r\n[lightweightsemaphore.h]: https://github.com/cameron314/concurrentqueue/blob/master/lightweightsemaphore.h\r\n[unittest-src]: https://github.com/cameron314/concurrentqueue/tree/master/tests/unittests\r\n[benchmarks]: http://moodycamel.com/blog/2014/a-fast-general-purpose-lock-free-queue-for-c++#benchmarks\r\n[benchmark-src]: https://github.com/cameron314/concurrentqueue/tree/master/benchmarks\r\n[license]: https://github.com/cameron314/concurrentqueue/blob/master/LICENSE.md\r\n[cdschecker]: http://demsky.eecs.uci.edu/c11modelchecker.html\r\n[relacy]: http://www.1024cores.net/home/relacy-race-detector\r\n[spsc]: https://github.com/cameron314/readerwriterqueue\r\n[salsa]: http://webee.technion.ac.il/~idish/ftp/spaa049-gidron.pdf\r\n","funding_links":[],"categories":["TODO scan for Android support in followings","Concurrency","C++","C/C++ 程序设计","Coding","Libraries","Projects","Software"],"sub_categories":["网络服务_其他","C++ Data Structures and Algorithms","Threading","Communication 
(RPC/threading/serialization)"],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcameron314%2Fconcurrentqueue","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fcameron314%2Fconcurrentqueue","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcameron314%2Fconcurrentqueue/lists"}