{"id":20007126,"url":"https://github.com/valkmjolnir/brainfuck-jit","last_synced_at":"2026-06-06T18:32:12.152Z","repository":{"id":127293803,"uuid":"421050674","full_name":"ValKmjolnir/brainfuck-jit","owner":"ValKmjolnir","description":"Brainfuck Just-In-Time compiler written in C++","archived":false,"fork":false,"pushed_at":"2025-08-08T16:35:32.000Z","size":44,"stargazers_count":5,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-08-08T17:29:56.801Z","etag":null,"topics":["brainfuck","compiler","cpp","esolang","esoteric-interpreter","esoteric-language","esoteric-programming-language","interpreter","jit","just-in-time"],"latest_commit_sha":null,"homepage":"","language":"Brainfuck","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ValKmjolnir.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2021-10-25T14:08:48.000Z","updated_at":"2025-08-08T16:35:36.000Z","dependencies_parsed_at":null,"dependency_job_id":"32ea5777-daca-42e9-8884-1d6b2b884122","html_url":"https://github.com/ValKmjolnir/brainfuck-jit","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/ValKmjolnir/brainfuck-jit","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ValKmjolnir%2Fbrainfuck-jit","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ValKmjolnir%2Fbrainfuck-jit/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ValKmjolnir%2Fbrainfuck-jit/releases","manifests_url":"http
s://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ValKmjolnir%2Fbrainfuck-jit/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/ValKmjolnir","download_url":"https://codeload.github.com/ValKmjolnir/brainfuck-jit/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ValKmjolnir%2Fbrainfuck-jit/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":33995625,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-06-06T02:00:07.033Z","response_time":107,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["brainfuck","compiler","cpp","esolang","esoteric-interpreter","esoteric-language","esoteric-programming-language","interpreter","jit","just-in-time"],"created_at":"2024-11-13T06:14:47.640Z","updated_at":"2026-06-06T18:32:12.144Z","avatar_url":"https://github.com/ValKmjolnir.png","language":"Brainfuck","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Brainfuck Just-In-Time Compiler\n\n## __Introduction__\n\nBrainfuck is a very interesting programming language with only 8 operators:\n\n|Operator|Code in C/C++|\n|:----|:----|\n|`+`|`buff[ptr]++`|\n|`-`|`buff[ptr]--`|\n|`\u003e`|`ptr++`|\n|`\u003c`|`ptr--`|\n|`[`|`if (!buff[ptr]) goto ']'`|\n|`]`|`if (buff[ptr]) goto '['`|\n|`,`|`buff[ptr] = getchar()`|\n|`.`|`putchar(buff[ptr])`|\n\nThis simple 
syntax makes brainfuck a great language for learning how to build an interpreter and a __JIT (just-in-time)__ compiler.\n\n## __Brainfuck Interpreter__\n\nThis project has a simple interpreter for brainfuck,\nusing a `switch`-based dispatch loop:\n\n```C++\nfor (size_t i = 0; i \u003c code.size(); ++i) {\n    switch (code[i].op) {\n        case op_add: buff[p] += code[i].num; break;\n        case op_sub: buff[p] -= code[i].num; break;\n        case op_addp: p += code[i].num; break;\n        case op_subp: p -= code[i].num; break;\n        case op_jt: if (buff[p]) i = code[i].num; break;\n        case op_jf: if (!buff[p]) i = code[i].num; break;\n        case op_in: buff[p] = getchar(); break;\n        case op_out: putchar(buff[p]); break;\n    }\n}\n```\n\n### __Basic Optimization__\n\nTo make the interpreter more efficient,\nI merge each run of consecutive identical operators into a single opcode with a count, instead of translating every operator into its own opcode.\nYou can see the opcode structure in `jit.cpp`.\n\nFor example:\n\n|bf code|opcode|\n|:----|:----|\n|`+++`|`buff[p] += 3`|\n|`----`|`buff[p] -= 4`|\n|`\u003e\u003e\u003e\u003e\u003e`|`p += 5`|\n|`\u003c\u003c`|`p -= 2`|\n\n## __Just-In-Time Compiler__\n\n### __mmap__\n\nAfter generating opcodes,\nit is fairly easy to emit machine code into a memory region allocated by `mmap`.\n\nThis region must be mapped `read/write/exec` so that the machine code stored in it can be executed.\nYou can see the `mmap` call in `amd64jit::amd64jit(const size_t)` in file `amd64jit.h`.\n\nI use a global u8 array `buff[0x20000]` as the paper (tape) of the brainfuck machine (`rbx` stores the pointer into it),\nand remember to use `memset` to zero-fill this space.\n\n```C++\n/* set bf machine's paper pointer */\nmem.push({0x48, 0xbb}).push64((uint64_t)buff); // movabs $buff, %rbx\n```\n\n### __Add \u0026 Sub Operations__\n\nThese four opcodes are straightforward to translate into machine code:\n\n```C++\ncase op_add:  mem.push({0x80, 0x03, (uint8_t)(op.num \u0026 0xff)}); break; 
// addb $op.num, (%rbx)\ncase op_sub:  mem.push({0x80, 0x2b, (uint8_t)(op.num \u0026 0xff)}); break; // subb $op.num, (%rbx)\ncase op_addp: mem.push({0x48, 0x81, 0xc3}).push32(op.num);    break; // add $op.num, %rbx\ncase op_subp: mem.push({0x48, 0x81, 0xeb}).push32(op.num);    break; // sub $op.num, %rbx\n```\n\n### __Library Functions `putchar` \u0026 `getchar`__\n\n#### __putchar__\n\n```C++\nint putchar(int);\n```\n\n`op_out` uses `putchar`.\nWrite a small demo and use objdump to see how gcc and clang generate the machine code that calls the function,\nthen just copy it :)\n\n```C++\nmem.push({0x48, 0xb8}).push64((uint64_t)putchar); // movabs $putchar, %rax\n#ifndef _WIN32\nmem.push({0x0f, 0xbe, 0x3b}); // movsbl (%rbx), %edi\n#else\nmem.push({0x0f, 0xbe, 0x0b}); // movsbl (%rbx), %ecx\n#endif\nmem.push({0xff, 0xd0}); // callq *%rax\n```\n\nYou may notice that the generated machine code differs slightly on Windows.\nThis is because the parameter-passing rules of the Windows x64 __calling convention__ differ from those of Linux/macOS/Unix (System V AMD64):\nLinux/macOS/Unix pass the first integer parameter in `rdi`, while Windows passes it in `rcx`.\n\nJIT-compiler developers need to remember this rule,\nbut the x86_64/amd64 calling conventions are much easier to remember than the x86_32 ones...\n\n#### __getchar__\n\n```C++\nint getchar();\n```\n\n`op_in` uses `getchar`.\nAgain, we use objdump to see how gcc/clang generate the call,\nand just copy it :)\n\nLuckily, on Windows/Linux/macOS/Unix, the `int` return value is always stored in register `rax` (its low 32 bits, `eax`). 
And we just need to move the low 8 bits of `rax` to `rbx[0]` (aka `movb %al, (%rbx)`).\n\n```C++\nmem.push({0x48, 0xb8})\n   .push64((uint64_t)getchar); // movabs $getchar, %rax\nmem.push({0xff, 0xd0});        // callq *%rax\nmem.push({0x88, 0x03});        // movb %al, (%rbx)\n```\n\nSo we don't need to write `#ifndef _WIN32` and so on :)\n\n### __Jump Operation__\n\n`je` and `jne` are the two tricky parts of this project.\nYou must calculate the distance between the two jump targets to make sure they work correctly.\n\n```C++\namd64jit\u0026 amd64jit::je() {\n    push({0x0f, 0x84}).push32(0x0); // je, rel32 is patched later in jne()\n    stk.push(ptr);                  // address right after this je\n    return *this;\n}\n\namd64jit\u0026 amd64jit::jne() {\n    push({0x0f, 0x85}).push32(0x0); // jne\n    uint8_t* je_next = stk.top();   // address right after the matching je\n    stk.pop();\n\n    uint8_t* jne_next = ptr;          // address right after this jne\n    uint64_t p0 = jne_next - je_next; // forward offset for je\n    uint64_t p1 = je_next - jne_next; // backward offset for jne (two's complement)\n    // patch the rel32 of jne, little-endian\n    jne_next[-4] = p1 \u0026 0xff;\n    jne_next[-3] = (p1 \u003e\u003e 8) \u0026 0xff;\n    jne_next[-2] = (p1 \u003e\u003e 16) \u0026 0xff;\n    jne_next[-1] = (p1 \u003e\u003e 24) \u0026 0xff;\n    // patch the rel32 of the matching je, little-endian\n    je_next[-4] = p0 \u0026 0xff;\n    je_next[-3] = (p0 \u003e\u003e 8) \u0026 0xff;\n    je_next[-2] = (p0 \u003e\u003e 16) \u0026 0xff;\n    je_next[-1] = (p0 \u003e\u003e 24) \u0026 0xff;\n    return *this;\n}\n```\n\n`op_jf` (`[`) uses `je` and `op_jt` (`]`) uses `jne`.\n\n### __Conclusion__\n\n|bf code|opcode|machine code|\n|:----|:----|:----|\n|`+`|op_add|`addb $op.num, (%rbx)`|\n|`-`|op_sub|`subb $op.num, (%rbx)`|\n|`\u003e`|op_addp|`add $op.num, %rbx`|\n|`\u003c`|op_subp|`sub $op.num, %rbx`|\n|`[`|op_jf|`je label`|\n|`]`|op_jt|`jne label`|\n|`,`|op_in|`callq *%rax` \u0026 `movb %al, (%rbx)`|\n|`.`|op_out|`callq *%rax`|\n\n## __Simple Optimization__\n\nHere's a simple pattern that can be optimized:\n\n```bf\n[-]\n```\n\nOr:\n\n```bf\n[+]\n```\n\nBoth loops leave `buff[p]` at zero (`[+]` gets there by wrapping the 8-bit cell around).\nSo we can add another opcode `op_setz`:\n\n|bf code|opcode|machine 
code|\n|:----|:----|:----|\n|`[-]`|op_setz|`movb $0, (%rbx)`|\n|`[+]`|op_setz|`movb $0, (%rbx)`|\n\n```C++\n// movb $0, (%rbx)\ncase op_setz: mem.push({0xc6, 0x03, 0x00}); break;\n```\n\n## __Advanced Optimization__\n\nNot implemented here, but worth mentioning.\n\n```bf\n-\u003c++\u003e-\n```\n\nCould be optimized to:\n\n```bf\n--\u003c++\u003e\n```\n\nThis is called `canonicalization`. It requires the compiler to analyze memory/variable access patterns in bf.\n\nMore patterns could be optimized as well:\n\n```bf\n[-\u003c+\u003e]\n[-\u003c\u003c-\u003e\u003e]\n[-\u003e\u003e+\u003c\u003c]\n[-\u003e++\u003e\u003e\u003e+++++\u003e++\u003e+\u003c\u003c\u003c\u003c\u003c\u003c]\n```\n\n## __More__\n\nWant to check the machine code generated for different CPU architectures?\n\nThis website may help: [__godbolt.org__](https://godbolt.org/)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fvalkmjolnir%2Fbrainfuck-jit","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fvalkmjolnir%2Fbrainfuck-jit","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fvalkmjolnir%2Fbrainfuck-jit/lists"}