{"id":21688432,"url":"https://github.com/giovanni-iannaccone/memory-allocator","last_synced_at":"2025-10-28T19:17:12.622Z","repository":{"id":264360025,"uuid":"884948220","full_name":"giovanni-iannaccone/memory-allocator","owner":"giovanni-iannaccone","description":"Cross platform memory allocator 💿","archived":false,"fork":false,"pushed_at":"2025-01-06T20:00:22.000Z","size":35,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-01-25T12:41:27.811Z","etag":null,"topics":["allocator","c-plus-plus","calloc","cross-platform-development","free","low-level","malloc","memory-allocation","realloc"],"latest_commit_sha":null,"homepage":"","language":"C++","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/giovanni-iannaccone.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-11-07T17:11:54.000Z","updated_at":"2025-01-06T20:00:26.000Z","dependencies_parsed_at":"2024-11-23T18:18:11.953Z","dependency_job_id":"8a18a057-149b-4774-b874-1dedea710c04","html_url":"https://github.com/giovanni-iannaccone/memory-allocator","commit_stats":null,"previous_names":["giovanni-iannaccone/memory-allocator"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/giovanni-iannaccone%2Fmemory-allocator","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/giovanni-iannaccone%2Fmemory-allocator/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/giovanni-iannacc
one%2Fmemory-allocator/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/giovanni-iannaccone%2Fmemory-allocator/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/giovanni-iannaccone","download_url":"https://codeload.github.com/giovanni-iannaccone/memory-allocator/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":244610616,"owners_count":20481025,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["allocator","c-plus-plus","calloc","cross-platform-development","free","low-level","malloc","memory-allocation","realloc"],"created_at":"2024-11-25T17:15:00.732Z","updated_at":"2025-10-28T19:17:12.533Z","avatar_url":"https://github.com/giovanni-iannaccone.png","language":"C++","funding_links":[],"categories":[],"sub_categories":[],"readme":"# 💿 Memory allocator \n\n## 📦 Prerequisites\n- Familiarity with C/C++ programming\n  \n## ⚡ How it works\nWhen a process is executed, the operating system allocates a Virtual Address Space (VAS) for it. This abstraction makes each process believe it has access to the entire memory of the computer, while in reality, it is working with a virtualized view of the physical memory.\n\nThe memory of a process is organized into distinct segments, listed below in order of increasing addresses: \u003cbr/\u003e\n( shown from lower address to higher address )\n1. *Text segment* contains executable code\n2. *Data segment* stores non-zero initialized static data\n3. 
*Bss segment* holds uninitialized static variables or variables explicitly initialized to zero\n4. ***Heap* is used for dynamically allocated memory (our focus in this explanation)**\n5. *Unmapped area* \n6. *Stack* is used to store function activation records, local variables and parameters\n7. Command-line arguments and environment variables\n\n### 🔊 Stack and Heap Growth\n**Stack**: Grows downward (toward lower addresses). The current stack position is tracked by the stack pointer (`sp`), and increasing the stack size reduces the value of `sp`.\n\n**Heap**: Grows upward (toward higher addresses). Its top is tracked by the program break (`brk`). Increasing the heap size moves `brk` to a higher address.\n\nIn high-level programming, we deal with objects, but from the perspective of memory, these are just blocks of raw bytes. The system views these blocks as a series of bits, which the program can interpret as any data type at runtime.\n\nA fundamental way to increase available memory is to move the brk. On Linux, this can be done with the `sbrk` function, a thin wrapper around the `brk` system call:\n- `sbrk(0)` returns the current brk address\n- `sbrk(n)` increases brk by n bytes and returns the previous brk\n- if sbrk fails, it returns `(void*)-1`\n  \nUsing sbrk, a basic memory allocation function might look like this:\n```c++\n#include \u003cunistd.h\u003e  // sbrk\n\nvoid *malloc(size_t size) {\n  void *p = sbrk(0);            // current break: start of the new block\n  void *request = sbrk(size);   // move the break up by size bytes\n  if (request == (void*) -1)\n    return nullptr; // sbrk failed.\n\n  return p;\n}\n```\nThis naive approach raises several issues:\n1. How do we free that memory?\n2. How can we recycle a block?\n3. Is it efficient to make a system call for every memory request?\n\nClearly, a more sophisticated strategy is needed.\nTo build a better allocator, we need to store metadata about each block of memory alongside the block itself. 
This metadata is stored in a header and can include:\n```c++\nstruct Block {\n  size_t size;              // payload size in bytes (header excluded)\n  bool free;                // the block is currently free\n  Block *prev;              // the block before this\n  Block *next;              // the block after this\n  void *data;               // a pointer to the first word of user data, aka payload pointer\n};\n```\n### ✔ Memory Alignment\nFor faster access, a memory block should be **aligned** to the machine's word size. What does this mean? Each block size should be a multiple\nof 8 bytes on 64-bit machines or 4 bytes on 32-bit machines. To round a size up to a multiple of the word size we can use this function: \n\n```c++\n// Rounds n up to the next multiple of the word size.\n// The bit trick relies on sizeof(void*) being a power of two.\nstatic inline size_t align(size_t n) {\n  return (n + sizeof(void*) - 1) \u0026 ~(sizeof(void*) - 1);\n}\n```\n### 👍 Improved Allocation Workflow\nNow, we can request new memory from the OS, fill in the fields of the block structure, and return the payload pointer. What we have built so far is just \na sequential allocator; it asks for more and more memory, bumping the brk, and is likely to run out of space eventually. This is not a good\nimplementation (but still a valid one), so we are going to improve it.\n\nWe will often need to get from a payload pointer back to its header, so it is useful to have a function that returns a block's header:\n```c++\n// The payload starts right after the header, so the header\n// sits sizeof(Block) bytes before the payload pointer.\nstatic Block* getBlock(void *data) {\n    return (Block *)((char *)data - sizeof(Block));\n}\n```\n\nTo improve our allocator we need to be able to free blocks. To achieve this we set the block's `free` flag to `true`:\n```c++\nvoid _free(void *data) {\n  auto block = getBlock(data);\n  block-\u003efree = true;\n}\n```\nA better allocator:\n- Maintains a linked list of blocks.\n- Searches for free blocks to reuse memory.\n- Expands the heap only when necessary using sbrk.\n\n### 🔍 Finding Free Blocks\nWe need a function to search for a free block that meets the requested size. 
Common algorithms include:\n- *First-fit* returns the first free block that is at least as large as the requested size\n- *Next-fit* is a variant of first-fit: it returns the first free block that is large enough, searching from the last successful position on the heap\n- *Best-fit* returns the free block whose size is closest to (but not smaller than) the requested one\n\nEach algorithm iterates over the linked list of blocks to find a suitable one. For the algorithms to function, we maintain these pointers:\n```c++\nstatic Block *heapStart = nullptr;        // Address of the first block in the heap\nstatic Block *searchStart = nullptr;      // Starting point for searches\nstatic Block *top = nullptr;              // Current top of the heap\n```\nIf no suitable free block is found, the allocator requests more memory from the OS:\n```c++\n// size is the payload size; the header is added here.\nstatic Block *requestFromOS(size_t size) {\n    size = align(size + sizeof(Block));\n    Block* block = (Block*)sbrk(size);\n\n    if (block == (void*)-1)\n        return NO_FREE_BLOCK;\n\n    block-\u003esize = size - sizeof(Block);\n    block-\u003efree = FREE;\n    block-\u003edata = (char *)block + sizeof(Block);   // payload starts right after the header\n    block-\u003enext = nullptr;\n    block-\u003eprev = top;\n\n    if (top)\n        top-\u003enext = block;\n\n    top = block;\n\n    if (!heapStart)\n        heapStart = block;\n\n    return block;\n}\n```\nThis function requests memory from the operating system, treats the start of it as a `Block` header and fills in the fields.\nNow our malloc will look like this:\n```c++\nvoid *_malloc(size_t size) {\n    if (size == 0)\n        return nullptr;\n\n    size = align(size);\n\n    // Reuse a free block if one is large enough.\n    Block* block = findBlock(size);\n\n    if (block != NO_FREE_BLOCK) {\n        block-\u003efree = NOT_FREE;\n        return block-\u003edata;\n    }\n\n    // Otherwise grow the heap; requestFromOS adds the header size itself.\n    block = requestFromOS(size);\n\n    if (block != NO_FREE_BLOCK) {\n        block-\u003efree = NOT_FREE;\n        return block-\u003edata;\n    }\n\n    return nullptr;\n}\n```\n### 🧩 Merging \nIt can really be useful to merge two 
free blocks, in order to **iterate faster**, **avoid fragmentation** and **reduce the number of system calls**.\n```c++\n// A block can be merged with its successor if that successor exists and is free.\nstatic bool canMerge(Block *block) {\n    Block* next = block-\u003enext;\n    return next != nullptr \u0026\u0026 next \u003c= top \u0026\u0026 next-\u003efree;\n}\n\n// Absorbs the next block (header and payload) into this one.\nstatic void merge(Block *block) {\n    Block* next = block-\u003enext;\n\n    if (next == nullptr || next-\u003efree == NOT_FREE) \n        return;\n\n    block-\u003esize += next-\u003esize + sizeof(Block);\n    block-\u003enext = next-\u003enext;\n\n    if (block-\u003enext)\n        block-\u003enext-\u003eprev = block;   \n}\n```\nAnd when do we merge two blocks? When we free them, so we have to update our free function:\n```c++\nvoid _free(void* data) {\n    if (data == nullptr)\n        return;\n\n    Block* block = getBlock(data);\n    block-\u003efree = FREE;\n\n    // Merge forward with any free successors.\n    while (block \u0026\u0026 canMerge(block)) {\n        merge(block);\n    }\n\n    // Merge backward into any free predecessors.\n    Block *prev = block-\u003eprev;\n    while (prev \u0026\u0026 prev-\u003efree \u0026\u0026 canMerge(prev)) {\n        merge(prev);\n        prev = prev-\u003eprev;\n    }\n}\n```\n### 🍴 Splitting\nWe are almost done. What if we have a block of size 64 and we only need 32? 
Our malloc is going to hand out the whole 64-byte block.\nIt is therefore important to split big blocks:\n```c++\n// The block must be able to hold the requested payload plus the header of the new block.\nstatic bool canSplit(Block *block, size_t size) {\n    return block-\u003esize \u003e= size + sizeof(Block);\n}\n\nstatic void split(Block *block, size_t size) {\n    size_t originalSize = block-\u003esize;\n    block-\u003esize = size;\n\n    // The new block starts right after the shrunken payload.\n    Block* newBlock = (Block *)((char *)block + sizeof(Block) + size);\n    newBlock-\u003esize = originalSize - size - sizeof(Block);   // the new header uses part of the old payload\n    newBlock-\u003efree = FREE;\n    newBlock-\u003edata = (char *)newBlock + sizeof(Block);\n\n    newBlock-\u003enext = block-\u003enext;\n    if (newBlock-\u003enext)\n        newBlock-\u003enext-\u003eprev = newBlock;\n\n    block-\u003enext = newBlock;\n    newBlock-\u003eprev = block;\n\n    if (block == top)\n        top = newBlock;   // keep top pointing at the last block\n}\n```\n### 🐧 Cross-Platform Compatibility\nNow that we've completed our memory allocator, I wanted to make it cross-platform, so I mapped ```sbrk``` to ```VirtualAlloc``` using this macro:\n```c++\n// _WIN32 is the macro MSVC defines (for both 32-bit and 64-bit targets).\n#if defined(_WIN32) || defined(__WIN32__) || defined(__WIN64__)\n    #include \u003cwindows.h\u003e\n\n    #define sbrk(X) fake_sbrk(X)\n\n    void* fake_sbrk(size_t increment) {\n        constexpr size_t MAX_HEAP_SIZE = 1024 * 1024;   // 1 MiB, reserved once on first use\n\n        static char *heapStart = nullptr;\n        static char *currentBreak = nullptr;\n\n        if (heapStart == nullptr) {\n            heapStart = (char*)VirtualAlloc(nullptr, MAX_HEAP_SIZE, MEM_RESERVE | MEM_COMMIT, PAGE_READWRITE);\n            if (heapStart == nullptr)\n                return (void *)-1;\n\n            currentBreak = heapStart;\n        }\n\n        // Fail like sbrk does when the request leaves the reserved region.\n        char *newBreak = currentBreak + increment;\n        if (newBreak \u003c heapStart || newBreak \u003e heapStart + MAX_HEAP_SIZE)\n            return (void *)-1;\n\n        void *oldBreak = currentBreak;\n        currentBreak = newBreak;\n\n        return oldBreak;\n    }\n#else \n    #include \u003csys/mman.h\u003e\n    #include \u003cunistd.h\u003e\n#endif\n```\n\n## 🌎 Resources \n- Glibc malloc implementation: https://sourceware.org/glibc/wiki/MallocInternals\n- Memory allocator: 
http://dmitrysoshnikov.com/compilers/writing-a-memory-allocator/\n- Virtual memory: https://www.youtube.com/watch?v=k0OOmaMwcV4\n\n🐞 Happy low level programming...\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgiovanni-iannaccone%2Fmemory-allocator","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fgiovanni-iannaccone%2Fmemory-allocator","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgiovanni-iannaccone%2Fmemory-allocator/lists"}