![License MIT](https://img.shields.io/github/license/endurodave/xallocator?color=blue)
[![conan Ubuntu](https://github.com/endurodave/xallocator/actions/workflows/cmake_ubuntu.yml/badge.svg)](https://github.com/endurodave/xallocator/actions/workflows/cmake_ubuntu.yml)
[![conan Clang](https://github.com/endurodave/xallocator/actions/workflows/cmake_clang.yml/badge.svg)](https://github.com/endurodave/xallocator/actions/workflows/cmake_clang.yml)
[![conan Windows](https://github.com/endurodave/xallocator/actions/workflows/cmake_windows.yml/badge.svg)](https://github.com/endurodave/xallocator/actions/workflows/cmake_windows.yml)

# Replace malloc/free with a Fast Fixed Block Memory Allocator

Replacing `malloc`/`free` with `xmalloc`/`xfree` is faster than the global heap and prevents heap fragmentation faults.

# Table of Contents

- [Replace malloc/free with a Fast Fixed Block Memory Allocator](#replace-mallocfree-with-a-fast-fixed-block-memory-allocator)
- [Table of Contents](#table-of-contents)
- [Preface](#preface)
- [Introduction](#introduction)
- [Storage Recycling](#storage-recycling)
- [Heap vs. Pool](#heap-vs-pool)
- [xallocator](#xallocator)
- [Overload new and delete](#overload-new-and-delete)
- [Code Implementation](#code-implementation)
- [Hiding the Allocator Pointer](#hiding-the-allocator-pointer)
- [Porting Issues](#porting-issues)
- [Reducing Slack](#reducing-slack)
- [Allocator vs. xallocator](#allocator-vs-xallocator)
- [Benchmarking](#benchmarking)
- [Reference articles](#reference-articles)
- [Conclusion](#conclusion)

# Preface

Originally published on CodeProject at: [**Replace malloc/free with a Fast Fixed Block Memory Allocator**](https://www.codeproject.com/Articles/1084801/Replace-malloc-free-with-a-Fast-Fixed-Block-Memory)

[CMake](https://www.cmake.org/) is used to create the build files. CMake is free and open-source software. Windows, Linux, and other toolchains are supported. See the **CMakeLists.txt** file for more information.

# Introduction

Custom fixed block allocators are specialized memory managers used to solve performance problems with the global heap. In the article "[**An Efficient C++ Fixed Block Memory Allocator**](http://www.codeproject.com/Articles/1083210/An-efficient-Cplusplus-fixed-block-memory-allocato)", I implemented an allocator class to improve speed and eliminate the possibility of a fragmented heap memory fault.
In this latest article, the `Allocator` class is used as the basis for the `xallocator` implementation, which replaces `malloc()` and `free()`.

Unlike most fixed block allocators, the `xallocator` implementation can run in a completely dynamic fashion without advance knowledge of block sizes or block quantity. The allocator takes care of all the fixed block management for you. It is completely portable to any PC-based or embedded system. In addition, it offers insight into your dynamic usage with memory statistics.

In this article, I replace the C library `malloc`/`free` with alternative fixed memory block versions `xmalloc()` and `xfree()`. First, I'll briefly explain the underlying `Allocator` storage recycling method, then present how `xallocator` works.

# Storage Recycling

The basic philosophy of the memory management scheme is to recycle memory obtained during object allocations. Once storage for an object has been created, it's never returned to the heap. Instead, the memory is recycled, allowing another object of the same type to reuse the space. I've implemented a class called `Allocator` that expresses the technique.

When the application deletes using `Allocator`, the memory block for a single object is freed for use again but is not actually released back to the memory manager. Freed blocks are retained in a linked list, called the free-list, to be doled out again for another object of the same type.
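The free-list recycling idea can be sketched in a few lines. This is a minimal, hypothetical illustration (the class and member names below are mine, not the library's): freed blocks are threaded into an intrusive singly linked list and reused before any new storage is obtained from the heap.

```cpp
#include <cstddef>
#include <cstdlib>

// Simplified sketch of free-list storage recycling: a freed block is never
// returned to the heap; its own memory holds the "next" link of the list.
class FreeListAllocator
{
public:
    explicit FreeListAllocator(std::size_t blockSize)
        : m_blockSize(blockSize < sizeof(Block) ? sizeof(Block) : blockSize),
          m_head(nullptr), m_recycled(0) {}

    void* Allocate()
    {
        if (m_head != nullptr)            // reuse a recycled block first
        {
            Block* block = m_head;
            m_head = m_head->next;
            ++m_recycled;
            return block;
        }
        return std::malloc(m_blockSize);  // only then create a new one
    }

    void Deallocate(void* memory)
    {
        // Recycle: push the block onto the free-list instead of freeing it
        Block* block = static_cast<Block*>(memory);
        block->next = m_head;
        m_head = block;
    }

    std::size_t RecycledCount() const { return m_recycled; }

private:
    struct Block { Block* next; };        // link stored inside the free block
    std::size_t m_blockSize;
    Block* m_head;
    std::size_t m_recycled;
};
```

Note how a subsequent allocation of the same size hands back the very block that was just freed, which is what makes recycling both fast and fragmentation-free.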
On every allocation request, `Allocator` first checks the free-list for an existing memory block. Only if none is available is a new one created. Depending on the desired behavior of `Allocator`, storage comes from either the global heap or a static memory pool, with one of three operating modes:

1. Heap blocks
2. Heap pool
3. Static pool

# Heap vs. Pool

The `Allocator` class is capable of creating new blocks from the heap or a memory pool whenever the free-list cannot provide an existing one. If a pool is used, you must specify the number of objects up front. Using the total object count, a pool large enough to handle the maximum number of instances is created. Obtaining block memory from the heap, on the other hand, has no such quantity limitation: construct as many new objects as storage permits.

The *heap blocks* mode allocates a new memory block from the global heap for a single object, as necessary to fulfill memory requests. A deallocation puts the block into a free-list for later reuse. Creating fresh new blocks off the heap when the free-list is empty frees you from having to set an object limit. This approach offers dynamic-like operation since the number of blocks can expand at run-time. The disadvantage is a loss of deterministic execution during block creation.

The *heap pool* mode creates a single pool from the global heap to hold all blocks. The pool is created using operator new when the `Allocator` object is constructed. `Allocator` then provides blocks of memory from the pool during allocations.

The *static pool* mode uses a single memory pool, typically located in static memory, to hold all blocks. The static memory pool is not created by `Allocator` but instead is provided by the user of the class.

The heap pool and static pool modes offer consistent allocation execution times because the memory manager is never involved with obtaining individual blocks. This makes a `new` operation very fast and deterministic.

The `Allocator` constructor controls the mode of operation.

```cpp
class Allocator
{
public:
    Allocator(size_t size, UINT objects=0, CHAR* memory=NULL, const CHAR* name=NULL);
...
```

Refer to "[**An Efficient C++ Fixed Block Memory Allocator**](http://www.codeproject.com/Articles/1083210/An-efficient-Cplusplus-fixed-block-memory-allocato)" for more information on `Allocator`.

# xallocator

The `xallocator` module has six main APIs:

- `xmalloc`
- `xfree`
- `xrealloc`
- `xalloc_stats`
- `xalloc_init`
- `xalloc_destroy`

`xmalloc()` is equivalent to
`malloc()` and is used in exactly the same way. Given a number of bytes, the function returns a pointer to a block of memory of the size requested.

```cpp
void* memory1 = xmalloc(100);
```

The memory block is at least as large as the user request, but may actually be larger due to the fixed block allocator implementation. The additional over-allocated memory is called slack, but by fine-tuning block sizes the waste is minimized, as I'll explain later in the article.

`xfree()` is the equivalent of the CRT `free()`. Just pass `xfree()` a pointer to a block previously allocated with `xmalloc()` to free the memory for reuse.

```cpp
xfree(memory1);
```

`xrealloc()` behaves the same as `realloc()` in that it expands or contracts the memory block while preserving the memory block contents.

```cpp
char* memory2 = (char*)xmalloc(24);
strcpy(memory2, "TEST STRING");
memory2 = (char*)xrealloc(memory2, 124);
xfree(memory2);
```

`xalloc_stats()` outputs allocator usage statistics to the standard output stream. The output provides insight into how many `Allocator` instances are being used, blocks in use, block sizes, and more.

`xalloc_init()` must be called once before any worker threads start, or in the case of an embedded system, before the OS starts. In a C++ application, this function is called automatically for you.
However, it is desirable to call `xalloc_init()` manually in some cases, typically on an embedded system, to avoid the small memory overhead involved with the automatic `xalloc_init()`/`xalloc_destroy()` call mechanism.

`xalloc_destroy()` is called when the application exits to clean up any dynamically allocated resources. In a C++ application, this function is called automatically when the application terminates. You must never call `xalloc_destroy()` manually, except in programs that use `xallocator` only within C files.

Now, deciding when to call `xalloc_init()` and `xalloc_destroy()` within a C++ application is not so easy. The problem arises with `static` objects. If `xalloc_destroy()` is called too early, `xallocator` may still be needed when a `static` object destructor gets called at program exit. Take for instance this class:

```cpp
class MyClassStatic
{
public:
    MyClassStatic()
    {
        memory = xmalloc(100);
    }
    ~MyClassStatic()
    {
        xfree(memory);
    }
private:
    void* memory;
};
```

Now create a `static` instance of this class at file scope.

```cpp
static MyClassStatic myClassStatic;
```

Since the object is `static`, the `MyClassStatic` constructor will be called before `main()`, which is okay, as I'll explain in the "Porting Issues" section below. However, the destructor is called after `main()` exits, which is not okay if not handled correctly. The problem becomes how to determine when to destroy the dynamically allocated `xallocator` resources. If `xalloc_destroy()` is called before `main()` exits, `xallocator` will already be destroyed when `~MyClassStatic()` tries to call `xfree()`, causing a bug.

The key to the solution comes from a guarantee in the C++ Standard:

> "Objects with static storage duration defined in namespace scope in the same translation unit and dynamically initialized shall be initialized in the order in which their definition appears in the translation unit."

In other words, `static` object constructors are called in the same order as defined within the file (translation unit). Destruction reverses that order.
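This ordering guarantee can be demonstrated with a small stand-alone example (illustrative only, not code from the article): two namespace-scope statics defined in the same translation unit log their construction order, which is top-to-bottom, and their destructors will run in the reverse order at exit.

```cpp
#include <string>
#include <vector>

// Returns the construction log; a function-local static sidesteps any
// ordering question for the log itself ("construct on first use").
std::vector<std::string>& constructionLog()
{
    static std::vector<std::string> log;
    return log;
}

struct Tracer
{
    explicit Tracer(const char* n) { constructionLog().push_back(n); }
};

// Defined first, so constructed first (and destroyed last) -- this is the
// slot xallocator.h claims for its own static instance.
static Tracer first("xallocInitDestroy");
// Defined second, so constructed second (and destroyed first).
static Tracer second("userStaticObject");
```

Because `first` is destroyed last, anything it tears down in its destructor remains available to `second`'s destructor, which is exactly the property `xallocator` exploits.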
Therefore, *xallocator.h* defines a `XallocInitDestroy` class and creates a `static` instance of it.

```cpp
class XallocInitDestroy
{
public:
    XallocInitDestroy();
    ~XallocInitDestroy();
private:
    static INT refCount;
};
static XallocInitDestroy xallocInitDestroy;
```

The constructor keeps track of the total number of `static` instances created and calls `xalloc_init()` on the first construction.

```cpp
INT XallocInitDestroy::refCount = 0;
XallocInitDestroy::XallocInitDestroy()
{
    // Track how many static instances of XallocInitDestroy are created
    if (refCount++ == 0)
        xalloc_init();
}
```

The destructor calls `xalloc_destroy()` automatically when the last instance is destroyed.

```cpp
XallocInitDestroy::~XallocInitDestroy()
{
    // Last static instance to have destructor called?
    if (--refCount == 0)
        xalloc_destroy();
}
```

When *xallocator.h* is included in a translation unit, `xallocInitDestroy` is declared first, since the `#include` comes before user code. This means any other `static` user objects relying on `xallocator` are declared after `#include "xallocator.h"`, which guarantees that `~XallocInitDestroy()` is called after all user `static` class destructors have executed. Using this technique, `xalloc_destroy()` is safely called when the program exits without danger of having `xallocator` destroyed prematurely.

`XallocInitDestroy` is an empty class and is therefore 1 byte in size. The cost of this feature is then 1 byte for every translation unit that includes *xallocator.h*, with the following exceptions:

1. On an embedded system where the application never exits, the technique is not required *except* if `STATIC_POOLS` mode is used. All references to `XallocInitDestroy` can be safely removed and `xalloc_destroy()` need never be called. However, you must now call `xalloc_init()` manually in `main()` before the `xallocator` API is used.
2. When `xallocator` is included within a C translation unit, a `static` instance of `XallocInitDestroy` is not created. In this case, you must call `xalloc_init()` in `main()` and `xalloc_destroy()` before `main()` exits.

To enable or disable automatic `xallocator` initialization and destruction, use the `#define` below:

```cpp
#define AUTOMATIC_XALLOCATOR_INIT_DESTROY
```

On a PC or similarly RAM-equipped system, this 1 byte is insignificant and in return ensures safe `xallocator` operation in `static` class instances during program exit. It also frees you from having to call `xalloc_init()` and `xalloc_destroy()`, as this is handled automatically.

# Overload new and delete

To make `xallocator` really easy to use, I've created a macro that overloads `new`/`delete` within a class and routes the memory requests to `xmalloc()`/`xfree()`.
Just add the macro `XALLOCATOR` anywhere in your class definition.

```cpp
class MyClass
{
    XALLOCATOR
    // remaining class definition
};
```

Using the macro, a `new`/`delete` of your class routes the request to `xallocator` by way of the overloaded `new`/`delete`.

```cpp
// Allocate MyClass using fixed block allocator
MyClass* myClass = new MyClass();
delete myClass;
```

A neat trick is to place `XALLOCATOR` within the base class of an inheritance hierarchy so that all derived classes allocate/deallocate using `xallocator`. For instance, say you had a GUI library with a base class.

```cpp
class GuiBase
{
    XALLOCATOR
    // remaining class definition
};
```

Any `GuiBase` derived class (buttons, widgets, etc.) now uses `xallocator` when `new`/`delete` is called, without having to add `XALLOCATOR` to every derived class. This is a powerful means of enabling fixed block allocations for an entire hierarchy with a single macro statement.

# Code Implementation

`xallocator` relies upon multiple `Allocator` instances to manage the fixed blocks; each `Allocator` instance handles one block size.
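As a rough sketch of what a macro like `XALLOCATOR` might expand to (hypothetical; see *xallocator.h* for the actual macro), class-scoped `operator new`/`operator delete` simply forward requests to the fixed block API. The `xmalloc_stub`/`xfree_stub` counters below are stand-ins for the real `xmalloc()`/`xfree()` so the routing is observable:

```cpp
#include <cstddef>
#include <cstdlib>

// Stand-ins for xmalloc()/xfree(); they count calls to show the routing.
static int g_xmallocCalls = 0;
static int g_xfreeCalls = 0;
static void* xmalloc_stub(std::size_t size) { ++g_xmallocCalls; return std::malloc(size); }
static void xfree_stub(void* ptr) { ++g_xfreeCalls; std::free(ptr); }

// Hypothetical expansion of an XALLOCATOR-style macro: class-scoped
// operator new/delete that forward to the fixed block allocator API.
#define XALLOCATOR_SKETCH \
    public: \
        void* operator new(std::size_t size) { return xmalloc_stub(size); } \
        void operator delete(void* ptr) { xfree_stub(ptr); }

class MyClass
{
    XALLOCATOR_SKETCH
    // remaining class definition
    int value = 0;
};
```

With this in place, `new MyClass()` and `delete` transparently route through the stubs; derived classes inherit the class-scoped operators, which is why placing the macro in a base class covers the whole hierarchy.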
Like `Allocator`, `xallocator` is designed to operate in heap blocks or static pools mode. The mode is controlled by the `STATIC_POOLS` define within *xallocator.cpp*.

```cpp
#define STATIC_POOLS    // Static pools mode enabled
```

In heap blocks mode, `xallocator` creates both `Allocator` instances and new blocks dynamically at runtime based upon the requested block sizes. By default, `xallocator` uses power-of-two block sizes: 8, 16, 32, 64, 128, and so on. This way, `xallocator` doesn't need to know the block sizes in advance and offers the utmost flexibility.

The maximum number of `Allocator` instances dynamically created by `xallocator` is controlled by `MAX_ALLOCATORS`. Increase or decrease this number as necessary for your target application.

```cpp
#define MAX_ALLOCATORS  15
```

In static pools mode, `xallocator` relies upon `Allocator` instances created during dynamic initialization (before entering `main()`) and static memory pools to satisfy memory requests. This eliminates all heap access, with the tradeoff that the block sizes and pools are of fixed size and cannot expand at runtime.

`Allocator` initialization in `STATIC_POOLS` mode is tricky. The problem again lies with user class static constructors, which might call into the `xallocator` API during construction/destruction. The C++ standard does not guarantee the order of static constructor calls between translation units during dynamic initialization, yet `xallocator` must be initialized before any of its APIs are executed. Therefore, the first part of the solution is to preallocate enough static memory for each `Allocator` instance. Of course, each allocator can use a different `MAX_BLOCKS` value as required.
Using this mode, the global heap is never called.

```cpp
// Update this section as necessary if you want to use static memory pools.
// See also xalloc_init() and xalloc_destroy() for additional updates required.
#define MAX_ALLOCATORS    12
#define MAX_BLOCKS        32

// Create static storage for each static allocator instance
CHAR* _allocator8 [sizeof(AllocatorPool<CHAR[8], MAX_BLOCKS>)];
CHAR* _allocator16 [sizeof(AllocatorPool<CHAR[16], MAX_BLOCKS>)];
CHAR* _allocator32 [sizeof(AllocatorPool<CHAR[32], MAX_BLOCKS>)];
CHAR* _allocator64 [sizeof(AllocatorPool<CHAR[64], MAX_BLOCKS>)];
CHAR* _allocator128 [sizeof(AllocatorPool<CHAR[128], MAX_BLOCKS>)];
CHAR* _allocator256 [sizeof(AllocatorPool<CHAR[256], MAX_BLOCKS>)];
CHAR* _allocator396 [sizeof(AllocatorPool<CHAR[396], MAX_BLOCKS>)];
CHAR* _allocator512 [sizeof(AllocatorPool<CHAR[512], MAX_BLOCKS>)];
CHAR* _allocator768 [sizeof(AllocatorPool<CHAR[768], MAX_BLOCKS>)];
CHAR* _allocator1024 [sizeof(AllocatorPool<CHAR[1024], MAX_BLOCKS>)];
CHAR* _allocator2048 [sizeof(AllocatorPool<CHAR[2048], MAX_BLOCKS>)];
CHAR* _allocator4096 [sizeof(AllocatorPool<CHAR[4096], MAX_BLOCKS>)];

// Array of pointers to all allocator instances
static Allocator* _allocators[MAX_ALLOCATORS];
```

Then, when `xalloc_init()` is called during dynamic initialization (via `XallocInitDestroy()`), placement `new` is used to construct each `Allocator` instance in the static memory previously reserved.

```cpp
extern "C" void xalloc_init()
{
    lock_init();

#ifdef STATIC_POOLS
    // For STATIC_POOLS mode, the allocators must be initialized before any other
    // static user class constructor is run. Therefore, use placement new to initialize
    // each allocator into the previously reserved static memory locations.
    new (&_allocator8) AllocatorPool<CHAR[8], MAX_BLOCKS>();
    new (&_allocator16) AllocatorPool<CHAR[16], MAX_BLOCKS>();
    new (&_allocator32) AllocatorPool<CHAR[32], MAX_BLOCKS>();
    new (&_allocator64) AllocatorPool<CHAR[64], MAX_BLOCKS>();
    new (&_allocator128) AllocatorPool<CHAR[128], MAX_BLOCKS>();
    new (&_allocator256) AllocatorPool<CHAR[256], MAX_BLOCKS>();
    new (&_allocator396) AllocatorPool<CHAR[396], MAX_BLOCKS>();
    new (&_allocator512) AllocatorPool<CHAR[512], MAX_BLOCKS>();
    new (&_allocator768) AllocatorPool<CHAR[768], MAX_BLOCKS>();
    new (&_allocator1024) AllocatorPool<CHAR[1024], MAX_BLOCKS>();
    new (&_allocator2048) AllocatorPool<CHAR[2048], MAX_BLOCKS>();
    new (&_allocator4096) AllocatorPool<CHAR[4096], MAX_BLOCKS>();

    // Populate allocator array with all instances
    _allocators[0] = (Allocator*)&_allocator8;
    _allocators[1] = (Allocator*)&_allocator16;
    _allocators[2] = (Allocator*)&_allocator32;
    _allocators[3] = (Allocator*)&_allocator64;
    _allocators[4] = (Allocator*)&_allocator128;
    _allocators[5] = (Allocator*)&_allocator256;
    _allocators[6] = (Allocator*)&_allocator396;
    _allocators[7] = (Allocator*)&_allocator512;
    _allocators[8] = (Allocator*)&_allocator768;
    _allocators[9] = (Allocator*)&_allocator1024;
    _allocators[10] = (Allocator*)&_allocator2048;
    _allocators[11] = (Allocator*)&_allocator4096;
#endif
}
```

At application exit, the destructor for each `Allocator` is called manually.

```cpp
extern "C" void xalloc_destroy()
{
    lock_get();

#ifdef STATIC_POOLS
    for (INT i=0; i<MAX_ALLOCATORS; i++)
    {
        _allocators[i]->~Allocator();
        _allocators[i] = 0;
    }
#else
    for (INT i=0; i<MAX_ALLOCATORS; i++)
    {
        if (_allocators[i] == 0)
            break;
        delete _allocators[i];
        _allocators[i] = 0;
    }
#endif

    lock_release();
    lock_destroy();
}
```

# Hiding the Allocator Pointer

When deleting memory, `xallocator` needs the original `Allocator` instance so the deallocation request can be routed to the correct `Allocator` instance for processing. Unlike `xmalloc()`, `xfree()` does not take a size and only uses a `void*` argument. Therefore, `xmalloc()` actually hides a pointer to the allocator within an unused portion of the memory block by adding an additional 4 bytes (the typical `sizeof(Allocator*)`) to the request.
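A minimal sketch of this pointer-hiding scheme, assuming a header of exactly `sizeof(Allocator*)` at the front of each raw block (the helper names mirror those in the article's listings, but these bodies are my illustration, not the library's code):

```cpp
#include <cstdlib>

struct Allocator;  // opaque for this sketch

// Store the owning allocator at the front of the raw block and return the
// address just past it: the client region the caller is allowed to use.
inline void* set_block_allocator(void* blockMemoryPtr, Allocator* allocator)
{
    Allocator** header = static_cast<Allocator**>(blockMemoryPtr);
    *header = allocator;           // hide the allocator in the header
    return header + 1;             // client region starts after the header
}

// Step back over the header to recover the hidden allocator pointer.
inline Allocator* get_block_allocator(void* clientPtr)
{
    return *(static_cast<Allocator**>(clientPtr) - 1);
}

// Convert a client pointer back into the original raw block pointer.
inline void* get_block_ptr(void* clientPtr)
{
    return static_cast<Allocator**>(clientPtr) - 1;
}
```

The round trip is exact: whatever raw pointer and allocator go in via `set_block_allocator()` come back out via `get_block_ptr()` and `get_block_allocator()`.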
The caller gets a pointer to the block\u0026rsquo;s client region where the hidden allocator pointer\u0026nbsp;is not overwritten.\u003c/p\u003e\n\n\u003cpre lang=\"C++\"\u003e\nextern \u0026quot;C\u0026quot; void *xmalloc(size_t size)\n{\n\u0026nbsp;\u0026nbsp; \u0026nbsp;lock_get();\n\n\u0026nbsp;\u0026nbsp; \u0026nbsp;// Allocate a raw memory block\u0026nbsp;\n\u0026nbsp;\u0026nbsp; \u0026nbsp;Allocator* allocator = xallocator_get_allocator(size);\n\u0026nbsp;\u0026nbsp; \u0026nbsp;void* blockMemoryPtr = allocator-\u0026gt;Allocate(size);\n\n\u0026nbsp;\u0026nbsp; \u0026nbsp;lock_release();\n\n\u0026nbsp;\u0026nbsp; \u0026nbsp;// Set the block Allocator* within the raw memory block region\n\u0026nbsp;\u0026nbsp; \u0026nbsp;void* clientsMemoryPtr = set_block_allocator(blockMemoryPtr, allocator);\n\u0026nbsp;\u0026nbsp; \u0026nbsp;return clientsMemoryPtr;\n}\u003c/pre\u003e\n\n\u003cp\u003eWhen \u003ccode\u003exfree()\u003c/code\u003e is called, the allocator pointer is extracted from the memory block so the correct \u003ccode\u003eAllocator\u003c/code\u003e instance can be called to deallocate the block.\u003c/p\u003e\n\n\u003cpre lang=\"C++\"\u003e\nextern \u0026quot;C\u0026quot; void xfree(void* ptr)\n{\n\u0026nbsp;\u0026nbsp; \u0026nbsp;if (ptr == 0)\n\u0026nbsp;\u0026nbsp; \u0026nbsp;\u0026nbsp;\u0026nbsp; \u0026nbsp;return;\n\n\u0026nbsp;\u0026nbsp; \u0026nbsp;// Extract the original allocator instance from the caller\u0026#39;s block pointer\n\u0026nbsp;\u0026nbsp; \u0026nbsp;Allocator* allocator = get_block_allocator(ptr);\n\n\u0026nbsp;\u0026nbsp; \u0026nbsp;// Convert the client pointer into the original raw block pointer\n\u0026nbsp;\u0026nbsp; \u0026nbsp;void* blockPtr = get_block_ptr(ptr);\n\n\u0026nbsp;\u0026nbsp; \u0026nbsp;lock_get();\n\n\u0026nbsp;\u0026nbsp; \u0026nbsp;// Deallocate the block\u0026nbsp;\n\u0026nbsp;\u0026nbsp; \u0026nbsp;allocator-\u0026gt;Deallocate(blockPtr);\n\n\u0026nbsp;\u0026nbsp; 
    lock_release();\n}\u003c/pre\u003e\n\n# Porting Issues\n\n\u003cp\u003eThe \u003ccode\u003exallocator\u003c/code\u003e is thread-safe once the locks are implemented for your target platform. The code provided implements Windows locks. For other platforms, you\u0026#39;ll need to provide lock implementations for the four functions within \u003cem\u003exallocator.cpp\u003c/em\u003e:\u003c/p\u003e\n\n\u003cul\u003e\n\t\u003cli\u003e\u003ccode\u003elock_init()\u003c/code\u003e\u003c/li\u003e\n\t\u003cli\u003e\u003ccode\u003elock_get()\u003c/code\u003e\u003c/li\u003e\n\t\u003cli\u003e\u003ccode\u003elock_release()\u003c/code\u003e\u003c/li\u003e\n\t\u003cli\u003e\u003ccode\u003elock_destroy()\u003c/code\u003e\u003c/li\u003e\n\u003c/ul\u003e\n\n\u003cp\u003eWhen selecting a lock, use the fastest OS lock available to ensure \u003ccode\u003exallocator\u003c/code\u003e operates as efficiently as possible in a multi-threaded environment. If your system is single-threaded, leave the implementation of each of the above functions empty.\u003c/p\u003e\n\n\u003cp\u003eDepending on how \u003ccode\u003exallocator\u003c/code\u003e is used, it may be called before \u003ccode\u003emain()\u003c/code\u003e. This means \u003ccode\u003elock_get()\u003c/code\u003e/\u003ccode\u003elock_release()\u003c/code\u003e can be called before \u003ccode\u003elock_init()\u003c/code\u003e. Since the system is single-threaded at this point, the locks aren\u0026rsquo;t necessary until the OS kicks off. However, make sure \u003ccode\u003elock_get()\u003c/code\u003e/\u003ccode\u003elock_release()\u003c/code\u003e behave correctly if \u003ccode\u003elock_init()\u003c/code\u003e isn\u0026rsquo;t called first. 
For instance, the check for \u003ccode\u003e_xallocInitialized\u003c/code\u003e below ensures the correct behavior by skipping the lock until \u003ccode\u003elock_init()\u003c/code\u003e is called.\u003c/p\u003e\n\n\u003cpre lang=\"C++\"\u003e\nstatic void lock_get()\n{\n    if (_xallocInitialized == FALSE)\n        return;\n\n    EnterCriticalSection(\u0026amp;_criticalSection);\n}\u003c/pre\u003e\n\n# Reducing Slack\n\n\u003cp\u003e\u003ccode\u003exallocator\u003c/code\u003e may return block sizes larger than the requested amount; the additional unused memory is called slack. For instance, for a request of 33 bytes, \u003ccode\u003exallocator\u003c/code\u003e returns a block of 64 bytes. The additional memory (64 \u0026ndash; (33 + 4) = 27 bytes) is slack and goes unused. Remember, if 33 bytes are requested, an additional 4 bytes are required to hold the hidden \u003ccode\u003eAllocator\u003c/code\u003e pointer. So if a client requests 64 bytes, the 128-byte allocator is actually used because 68 bytes are needed.\u003c/p\u003e\n\n\u003cp\u003eAdding allocators that handle block sizes other than powers of two offers more block sizes to minimize waste. Run your application and profile your \u003ccode\u003exmalloc()\u003c/code\u003e requested sizes with a bit of temporary debug code. Then add allocator block sizes specifically handling those cases where a large number of blocks are being used.\u003c/p\u003e\n\n\u003cp\u003eIn the code below, an \u003ccode\u003eAllocator\u003c/code\u003e instance with a block size of 396 is used when a block between 257 and 396 bytes is requested. Similarly, a block request of between 513 and 768 bytes is handled by an \u003ccode\u003eAllocator\u003c/code\u003e with 768-byte blocks.\u003c/p\u003e\n\n\u003cpre lang=\"C++\"\u003e\n// Based on the size, find the next higher power-of-two value.\n// Add sizeof(Allocator*) to the requested block size to hold the\n// allocator pointer within the block memory region. 
Most blocks are powers of two,\n// however some common allocator block sizes can be explicitly defined\n// to minimize wasted storage. This offers application-specific tuning.\nsize_t blockSize = size + sizeof(Allocator*);\nif (blockSize \u0026gt; 256 \u0026amp;\u0026amp; blockSize \u0026lt;= 396)\n    blockSize = 396;\nelse if (blockSize \u0026gt; 512 \u0026amp;\u0026amp; blockSize \u0026lt;= 768)\n    blockSize = 768;\nelse\n    blockSize = nexthigher\u0026lt;size_t\u0026gt;(blockSize);\u003c/pre\u003e\n\n\u003cp\u003eWith a minor amount of fine-tuning, you can reduce wasted storage due to slack based on your application\u0026#39;s memory usage patterns. If no tuning is required and using blocks solely based on powers of two is acceptable, the only lines of code required from the snippet above are:\u003c/p\u003e\n\n\u003cpre lang=\"C++\"\u003e\nsize_t blockSize;\nblockSize = nexthigher\u0026lt;size_t\u0026gt;(size + sizeof(Allocator*));\u003c/pre\u003e\n\n\u003cp\u003eUsing \u003ccode\u003exalloc_stats()\u003c/code\u003e, it\u0026rsquo;s easy to find which allocators are being used the most.\u003c/p\u003e\n\n\u003cpre lang=\"text\"\u003e\nxallocator Block Size: 128 Block Count: 10001 Blocks In Use: 1\nxallocator Block Size: 16 Block Count: 2 Blocks In Use: 2\nxallocator Block Size: 8 Block Count: 1 Blocks In Use: 0\nxallocator Block Size: 32 Block Count: 1 Blocks In Use: 0\u003c/pre\u003e\n\n# Allocator vs. xallocator\n\n\u003cp\u003eThe advantage of using \u003ccode\u003eAllocator\u003c/code\u003e is that the allocator block size is exactly the size of the object and the minimum block size is only 4 bytes. The disadvantage is that the \u003ccode\u003eAllocator\u003c/code\u003e instance is \u003ccode\u003eprivate\u003c/code\u003e and only usable by that class. 
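\u003c/p\u003e

For reference, the private-Allocator pattern typically routes a class's operator new and operator delete through a static Allocator member. The sketch below illustrates the idea only: the Allocator shown is a minimal stand-in, not the library's actual class (the real Allocator recycles freed blocks on a free-list, and the library supplies macros for this boilerplate):

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Minimal stand-in for a fixed-block allocator (illustrative only; the
// real xallocator Allocator recycles freed blocks on a free-list).
class Allocator
{
public:
    explicit Allocator(std::size_t blockSize) : m_blockSize(blockSize) {}
    void* Allocate(std::size_t) { return ::operator new(m_blockSize); }
    void Deallocate(void* p) { ::operator delete(p); }
private:
    std::size_t m_blockSize;
};

class MyClass
{
public:
    // Route this class's allocations through its private fixed-block pool.
    static void* operator new(std::size_t size) { return _allocator.Allocate(size); }
    static void operator delete(void* p) { _allocator.Deallocate(p); }
private:
    int m_data[4];
    static Allocator _allocator; // pool block size matches MyClass exactly
};

Allocator MyClass::_allocator(sizeof(MyClass));
```

Because _allocator is a private static member, only MyClass can allocate from this pool.

\u003cp\u003e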
This means that the fixed block memory pool can\u0026#39;t easily be shared with other instances using similarly sized blocks, which can waste storage due to the lack of sharing between memory pools.\u003c/p\u003e\n\n\u003cp\u003e\u003ccode\u003exallocator\u003c/code\u003e, on the other hand, uses a range of different block sizes to satisfy requests and is thread-safe. The advantage is that the various-sized memory pools are shared via the \u003ccode\u003exmalloc\u003c/code\u003e/\u003ccode\u003exfree\u003c/code\u003e interface, which can save storage, especially if you tune the block sizes for your specific application. The disadvantage is that even with block size tuning, there will always be some wasted storage due to slack. For small objects, the minimum block size is 8 bytes: 4 bytes for the free-list pointer and 4 bytes to hold the hidden \u003ccode\u003eAllocator\u003c/code\u003e pointer. This can become a problem with a large number of small objects.\u003c/p\u003e\n\n\u003cp\u003eAn application can mix \u003ccode\u003eAllocator\u003c/code\u003e and \u003ccode\u003exallocator\u003c/code\u003e usage in the same program to maximize efficient memory utilization as the designer sees fit.\u003c/p\u003e\n\n# Benchmarking\n\n\u003cp\u003eBenchmarking the \u003ccode\u003exallocator\u003c/code\u003e performance vs. the global heap on a Windows PC shows just how fast it is. A basic test of allocating and deallocating 20,000 blocks of 4096 and 2048 bytes in a somewhat interleaved fashion measures the speed improvement. All tests run with maximum compiler speed optimizations. 
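\u003c/p\u003e

The test is along these lines: a simplified, hypothetical sketch using std::chrono rather than the timing code in the attached source. Swapping malloc/free for xmalloc/xfree times xallocator instead of the global heap:

```cpp
#include <cassert>
#include <chrono>
#include <cstdlib>

// Simplified interleaved allocate/free benchmark. Replace malloc/free
// with xmalloc/xfree to time xallocator instead of the global heap.
long long run_benchmark()
{
    const int MAX_BLOCKS = 20000;
    static void* blocks[MAX_BLOCKS];

    auto start = std::chrono::steady_clock::now();

    // Allocate alternating 2048- and 4096-byte blocks...
    for (int i = 0; i < MAX_BLOCKS; i++)
        blocks[i] = malloc((i % 2) ? 4096 : 2048);

    // ...then free them in a somewhat interleaved order.
    for (int i = 0; i < MAX_BLOCKS; i += 2)
        free(blocks[i]);
    for (int i = 1; i < MAX_BLOCKS; i += 2)
        free(blocks[i]);

    auto elapsed = std::chrono::duration_cast<std::chrono::milliseconds>(
        std::chrono::steady_clock::now() - start);
    return static_cast<long long>(elapsed.count());
}
```

\u003cp\u003e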
See the attached source code for the exact algorithm.\u003c/p\u003e\n\n\u003ch4\u003eAllocation Times in Milliseconds\u003c/h4\u003e\n\n\u003ctable class=\"ArticleTable\"\u003e\n\t\u003cthead\u003e\n\t\t\u003ctr\u003e\n\t\t\t\u003ctd\u003eAllocator\u003c/td\u003e\n\t\t\t\u003ctd\u003eMode\u003c/td\u003e\n\t\t\t\u003ctd\u003eRun\u003c/td\u003e\n\t\t\t\u003ctd\u003eBenchmark Time (mS)\u003c/td\u003e\n\t\t\u003c/tr\u003e\n\t\u003c/thead\u003e\n\t\u003ctbody\u003e\n\t\t\u003ctr\u003e\n\t\t\t\u003ctd\u003eGlobal Heap\u003c/td\u003e\n\t\t\t\u003ctd\u003eDebug Heap\u003c/td\u003e\n\t\t\t\u003ctd\u003e1\u003c/td\u003e\n\t\t\t\u003ctd\u003e1247\u003c/td\u003e\n\t\t\u003c/tr\u003e\n\t\t\u003ctr\u003e\n\t\t\t\u003ctd\u003eGlobal Heap\u003c/td\u003e\n\t\t\t\u003ctd\u003eDebug Heap\u003c/td\u003e\n\t\t\t\u003ctd\u003e2\u003c/td\u003e\n\t\t\t\u003ctd\u003e1640\u003c/td\u003e\n\t\t\u003c/tr\u003e\n\t\t\u003ctr\u003e\n\t\t\t\u003ctd\u003eGlobal Heap\u003c/td\u003e\n\t\t\t\u003ctd\u003eDebug Heap\u003c/td\u003e\n\t\t\t\u003ctd\u003e3\u003c/td\u003e\n\t\t\t\u003ctd\u003e1650\u003c/td\u003e\n\t\t\u003c/tr\u003e\n\t\t\u003ctr\u003e\n\t\t\t\u003ctd\u003eGlobal Heap\u003c/td\u003e\n\t\t\t\u003ctd\u003eRelease Heap\u003c/td\u003e\n\t\t\t\u003ctd\u003e1\u003c/td\u003e\n\t\t\t\u003ctd\u003e32.9\u003c/td\u003e\n\t\t\u003c/tr\u003e\n\t\t\u003ctr\u003e\n\t\t\t\u003ctd\u003eGlobal Heap\u003c/td\u003e\n\t\t\t\u003ctd\u003eRelease Heap\u003c/td\u003e\n\t\t\t\u003ctd\u003e2\u003c/td\u003e\n\t\t\t\u003ctd\u003e33.0\u003c/td\u003e\n\t\t\u003c/tr\u003e\n\t\t\u003ctr\u003e\n\t\t\t\u003ctd\u003eGlobal Heap\u003c/td\u003e\n\t\t\t\u003ctd\u003eRelease Heap\u003c/td\u003e\n\t\t\t\u003ctd\u003e3\u003c/td\u003e\n\t\t\t\u003ctd\u003e27.8\u003c/td\u003e\n\t\t\u003c/tr\u003e\n\t\t\u003ctr\u003e\n\t\t\t\u003ctd\u003exallocator\u003c/td\u003e\n\t\t\t\u003ctd\u003eHeap 
Blocks\u003c/td\u003e\n\t\t\t\u003ctd\u003e1\u003c/td\u003e\n\t\t\t\u003ctd\u003e17.5\u003c/td\u003e\n\t\t\u003c/tr\u003e\n\t\t\u003ctr\u003e\n\t\t\t\u003ctd\u003exallocator\u003c/td\u003e\n\t\t\t\u003ctd\u003eHeap Blocks\u003c/td\u003e\n\t\t\t\u003ctd\u003e2\u003c/td\u003e\n\t\t\t\u003ctd\u003e5.0\u003c/td\u003e\n\t\t\u003c/tr\u003e\n\t\t\u003ctr\u003e\n\t\t\t\u003ctd\u003exallocator\u003c/td\u003e\n\t\t\t\u003ctd\u003eHeap Blocks\u003c/td\u003e\n\t\t\t\u003ctd\u003e3\u003c/td\u003e\n\t\t\t\u003ctd\u003e5.9\u003c/td\u003e\n\t\t\u003c/tr\u003e\n\t\u003c/tbody\u003e\n\u003c/table\u003e\n\n\u003cp\u003eWindows uses a debug heap when executing within the debugger. The debug heap adds extra safety checks that slow its performance. The release heap is much faster because the checks are disabled. The debug heap can be disabled within Visual Studio by setting \u003ccode\u003e\u003cstrong\u003e_NO_DEBUG_HEAP=1\u003c/strong\u003e\u003c/code\u003e in the \u003cstrong\u003eDebugging \u0026gt; Environment\u003c/strong\u003e project option.\u003c/p\u003e\n\n\u003cp\u003eThe debug global heap is predictably the slowest at about 1.6 seconds. The release heap is much faster at ~30mS. This benchmark is very simplistic, and a more realistic scenario with varying block sizes and random new/delete intervals might produce different results. However, the basic point is illustrated nicely: the global memory manager is slower than a fixed-block allocator and highly dependent on the platform\u0026#39;s implementation.\u003c/p\u003e\n\n\u003cp\u003eThe \u003ccode\u003exallocator\u003c/code\u003e running in heap blocks mode is very fast once the free-list is populated with blocks obtained from the heap. Recall that heap blocks mode relies upon the global heap to get new blocks, but then recycles them into the free-list for later use. Run 1 shows the allocation hit of creating the memory blocks at 17.5mS. 
Subsequent benchmarks clock in at a very fast ~5mS since the free-list is fully populated.\u003c/p\u003e\n\n\u003cp\u003eAs the benchmarking shows, the \u003ccode\u003exallocator\u003c/code\u003e is highly efficient and about five times faster than the global heap on a Windows PC. On an ARM STM32F4 CPU built using a Keil compiler, I\u0026#39;ve seen well over a 10x speed increase.\u003c/p\u003e\n\n# Reference articles\n\n\u003cul\u003e\n\t\u003cli\u003e\u003ca href=\"http://www.codeproject.com/Articles/1083210/An-Efficient-Cplusplus-Fixed-Block-Memory-Allocato\"\u003e\u003cstrong\u003eAn Efficient C++ Fixed Block Memory Allocator\u003c/strong\u003e\u003c/a\u003e by David Lafreniere\u003c/li\u003e\n\t\u003cli\u003e\u003ca href=\"http://www.codeproject.com/Articles/1089905/A-Custom-STL-std-allocator-Replacement-Improves-Pe\"\u003e\u003cstrong\u003eA Custom STL std::allocator Replacement Improves Performance\u003c/strong\u003e\u003c/a\u003e by David Lafreniere\u003c/li\u003e\n\u003c/ul\u003e\n\n# Conclusion\n\n\u003cp\u003eA medical device I worked on had a commercial GUI library that utilized the heap extensively. The size and frequency of the memory requests couldn\u0026rsquo;t be predicted or controlled. Using the heap in such an uncontrolled fashion is a no-no on a medical device, so a solution was needed. Luckily, the GUI library had a means to replace \u003ccode\u003emalloc()\u003c/code\u003e and \u003ccode\u003efree()\u003c/code\u003e with our own custom implementation. 
\u003ccode\u003exallocator\u003c/code\u003e solved the heap speed and fragmentation problems, making the GUI framework a viable solution on that product.\u003c/p\u003e\n\n\u003cp\u003eIf you have an application that really hammers the heap and is causing slow performance, or if you\u0026rsquo;re worried about a fragmented heap fault, integrating \u003ccode\u003eAllocator\u003c/code\u003e/\u003ccode\u003exallocator\u003c/code\u003e may help solve those problems.\u003c/p\u003e\n\n\n\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fendurodave%2Fxallocator","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fendurodave%2Fxallocator","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fendurodave%2Fxallocator/lists"}