{"id":23690770,"url":"https://github.com/dispatchcode/x64-instruction-decoder","last_synced_at":"2025-09-02T20:31:29.867Z","repository":{"id":41545150,"uuid":"304070879","full_name":"DispatchCode/x64-Instruction-Decoder","owner":"DispatchCode","description":"An x86/x64 instruction disassembler written in C","archived":false,"fork":false,"pushed_at":"2024-07-16T19:13:54.000Z","size":131,"stargazers_count":31,"open_issues_count":1,"forks_count":8,"subscribers_count":3,"default_branch":"master","last_synced_at":"2025-08-29T14:39:30.136Z","etag":null,"topics":["architectures","assembly","c","disassembler","disassembler-library","instruction-decoding","instruction-set","low-level","machine-code","reverse-engineering","x64","x86"],"latest_commit_sha":null,"homepage":"","language":"C","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/DispatchCode.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2020-10-14T16:19:35.000Z","updated_at":"2025-06-24T22:41:00.000Z","dependencies_parsed_at":"2025-02-26T03:06:13.485Z","dependency_job_id":"0afe2a64-fb18-4363-b747-70fbe9f1edb5","html_url":"https://github.com/DispatchCode/x64-Instruction-Decoder","commit_stats":null,"previous_names":["dispatchcode/x64-instruction-disassembler","dispatchcode/machine-code-analyzer"],"tags_count":1,"template":false,"template_full_name":null,"purl":"pkg:github/DispatchCode/x64-Instruction-Decoder","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/DispatchCode%2Fx64-Instruction-Decoder","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/Git
Hub/repositories/DispatchCode%2Fx64-Instruction-Decoder/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/DispatchCode%2Fx64-Instruction-Decoder/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/DispatchCode%2Fx64-Instruction-Decoder/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/DispatchCode","download_url":"https://codeload.github.com/DispatchCode/x64-Instruction-Decoder/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/DispatchCode%2Fx64-Instruction-Decoder/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":273344602,"owners_count":25089012,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-09-02T02:00:09.530Z","response_time":77,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["architectures","assembly","c","disassembler","disassembler-library","instruction-decoding","instruction-set","low-level","machine-code","reverse-engineering","x64","x86"],"created_at":"2024-12-30T02:51:58.403Z","updated_at":"2025-09-02T20:31:29.608Z","avatar_url":"https://github.com/DispatchCode.png","language":"C","readme":"[![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)\n\n# x64ID ~ x64 Instruction Decoder\n\nAn x86/x64 machine code decoder. 
It is useful for getting an instruction's length and identifying each of its fields.\n\nHere are some scenarios where x64ID can be used:\n\n- write your disassembler from scratch\n- as a base for VM protection [\u003csup\u003e[1]\u003c/sup\u003e](#user-content-res1)\n- reverse engineering scenarios\n- swapping instructions with others (e.g. replacing `MOV EAX, 0` with `XOR EAX, EAX`)\n- get the mnemonic representation (*currently not implemented*)\n- others (as ideas will come to mind...)\n\n___\n\n- [x64ID ~ x64 Instruction Decoder](#x64id--x64-instruction-decoder)\n  * [Supported architectures and features](#supported-architectures-and-features)\n    + [Features on development](#features-on-development)\n  * [API](#api)\n    + [Instruction struct](#instruction-struct)\n      - [`REX` union](#rex-union)\n      - [`ModRm` union](#modrm-union)\n      - [`SIB` union](#sib-union)\n      - [`vex_info` struct](#vex_info-struct)\n  * [Examples](#examples)\n      - [A practical example: sum of two vectors using SIMD instruction](#a-practical-example-sum-of-two-vectors-using-simd-instruction)\n      - [Another example: architecture x64, VEX prefix with YMM register](#another-example-architecture-x64-vex-prefix-with-ymm-register)\n  * [Enabling / Disabling features](#enabling--disabling-features)\n  * [Function Length detection](#find-length-of-a-function) 🌟*New❗*🌟\n  * [Tests](#tests)\n  * [Useful resources](#useful-resources)\n  * [Notes](#notes)\n\n## Supported architectures and features\n\n**Architectures**:\n\n✅ x86 \u003cbr\u003e\n✅ x64 \u003cbr\u003e\n\n**Opcodes**:\n\n✅ 1-byte OPs \u003cbr\u003e\n✅ 2-byte OPs \u003cbr\u003e\n✅ 3-byte OPs, 0x38 and 0x3A \u003cbr\u003e\n\n**Fields**:\n\n✅ prefixes \u003cbr\u003e\n✅ VEX prefix (0xC4, 0xC5) \u003cbr\u003e\n✅ ModRm \u003cbr\u003e\n✅ REX prefix \u003cbr\u003e\n✅ SIB \u003cbr\u003e\n✅ Imm \u003cbr\u003e\n✅ Disp \u003cbr\u003e\n❌ XOP prefix \u003cbr\u003e\n\n**Instruction Set**:\n\n✅ x86 \u0026 x64 \u003cbr\u003e\n✅ SIMD extension 
\u003cbr\u003e\n✅ AVX extension \u003cbr\u003e\n❌ AVX-512 (EVEX prefix) \u003cbr\u003e\n❌ 3DNow! \u003cbr/\u003e\n\n### Features on development\n\n🎯 XOP support \u003cbr\u003e\n🎯 AVX-512 (EVEX prefix) \u003cbr\u003e\n🎯 Machine code to assembly mnemonics \u003cbr\u003e\n🎯 Others (as ideas will come to mind...)\n\n## API\n\nx64ID exposes only one function and some structs to accomplish its goal:\n\n```C\nint x64id_decode(struct instruction *instr, enum supported_architecture arch, char *data_src, int offset);\n```\n\n| Parameter      | Type     | Explanation | Required |\n|----------------|:--------:|-------------|:--------:|\n| `instr`        | `struct instruction *` | A pointer to the struct that will contain the analysis result. See [below](#instruction-struct) for more information. | YES |\n| `arch`         | `enum supported_architecture` | The architecture type. Use `1` for `x86` and `2` for `x64` | YES |\n| `data_src`     | `char*`  | A data buffer with the data to be analyzed | YES |\n| `offset`       | `int`    | An offset to be added to the starting address of `data_src` | NO |\n\n**Return**:\n\n`x64id_decode` returns the length of the decoded instruction. Its value can also be accessed from `instr.length`.\n\n\u003e :information_source: **Notes**\n\u003e Internally, x64ID does not use dynamic allocation to avoid overhead.\n\n___\n\n### Instruction struct\n\nThe table below describes the fields of the struct. 
More information and the other structs can be found in the [header file](https://github.com/DispatchCode/Machine-Code-Analyzer/blob/master/src/x64id.h#L355).\n\n| Field Name        | Type              | Description |\n|-------------------|:-----------------:|-------------|\n| `prefixes`        | `uint8_t[4]`      | Stores the prefixes of the instruction, such as Segment Override, Address Size, and the 2- and 3-byte escape opcodes (0x0F, 0x38, 0x3A)  |\n| `rex`             | `union`           | Union of five fields: `value`, `rex_b`, `rex_r`, `rex_x`, `rex_w` (see [below](#rex-union)). |\n| `op`              | `uint8_t`         | The opcode of the instruction |\n| `modrm`           | `union`           | Union of four fields: `value`, `rm`, `reg` and `mod` (see [below](#modrm-union)). |\n| `disp`            | `uint64_t`        | Displacement field |\n| `imm`             | `uint64_t`        | Immediate value (a number) |\n| `label`           | `uint32_t`        | Address of Jcc/JMP, if present |\n| `_vex`            | `struct vex_info` | Available only if [`_ENABLE_VEX_INFO`](#enabling--disabling-features) is defined. Described [below](#vex_info-struct). |\n| `instr`           | `uint8_t[15]`     | The raw bytes of the instruction. Available only if [`_ENABLE_RAW_BYTES`](#enabling--disabling-features) is defined. |\n| `sib`             | `union`           | Union of four fields: `value`, `base`, `index` and `scaled` (see [below](#sib-union)). 
|\n| `vex`             | `uint8_t[3]`      | 0xC4 or 0xC5 followed by 1 or 2 bytes |\n| `length`          | `int`             | The instruction length (in bytes) |\n| `disp_len`        | `int`             | The displacement size (in bytes) |\n| `imm_len`         | `int`             | The immediate size |\n| `vex_cnt`         | `int8_t`          | Counts how many VEX prefixes are present |\n| `prefix_cnt`      | `int8_t`          | Counts how many prefixes are present |\n| `set_prefix`      | `uint16_t`        | A field that can be checked to determine whether a given prefix (from the `prefixes` enum) is present. |\n| `set_field`       | `uint16_t`        | A field that can be checked to determine whether a given feature (from the `instruction_feature` enum) is available (e.g. FPU, SIB, DISP, ...) |\n| `jcc_type`        | `uint8_t`         | The type of jump: Jcc or JMP, 1-byte or 2-byte (refer to the `jmp_type` enum) |\n\n___\n\n#### `REX` union\n\n| Field Name        | Type       | Description |\n|-------------------|:----------:|-------------|\n| `rex.value`       | `uint8_t`  | The `rex` prefix if present (x64 only) |\n| `rex.bits.rex_b`  | `uint8_t`  | `rex_b` field |\n| `rex.bits.rex_x`  | `uint8_t`  | `rex_x` field |\n| `rex.bits.rex_r`  | `uint8_t`  | `rex_r` field |\n| `rex.bits.rex_w`  | `uint8_t`  | `rex_w` field |\n\nFor more information on the REX prefix, refer to section *2.2.1 REX Prefixes* of the Intel Developer Manual Vol.2 [\u003csup\u003e[2]\u003c/sup\u003e](#user-content-res2).\n\n___\n\n#### `ModRm` union\n\n| Field Name        | Type       | Description |\n|-------------------|:----------:|-------------|\n| `modrm.value`     | `uint8_t`  | The ModRm value |\n| `modrm.bits.rm`   | `uint8_t`  | The `rm` part of ModRm  |\n| `modrm.bits.reg`  | `uint8_t`  | The `reg` part of ModRm |\n| `modrm.bits.mod`  | `uint8_t`  | The `mod` part of ModRm. 
When mod=11b, source and destination are registers; otherwise one of the operands involves memory access (displacement field) |\n\nMore information on the ModRm field can be found in section *2.1.3 ModR/M and SIB Bytes* of the Intel Developer Manual Vol.2 [\u003csup\u003e[2]\u003c/sup\u003e](#user-content-res2).\n\n___\n\n#### `SIB` union\n\n| Field Name        | Type       | Description |\n|-------------------|:----------:|-------------|\n| `sib.value`       | `uint8_t`  | The SIB (Scale-Index-Base) byte, if present |\n| `sib.bits.base`   | `uint8_t`  | `base` field |\n| `sib.bits.index`  | `uint8_t`  | `index` field |\n| `sib.bits.scaled` | `uint8_t`  | `scaled` field |\n\nFor more information, refer to section *2.1.5 Addressing-Mode Encoding of ModR/M and SIB Bytes* of the Intel Developer Manual Vol.2 [\u003csup\u003e[2]\u003c/sup\u003e](#user-content-res2).\n\n___\n\n#### `vex_info` struct\n\n| Field Name           | Type              | Description |\n|----------------------|:-----------------:|-------------|\n| `type`               | `uint8_t`         | `0xC4` when the 3-byte prefix is present, or `0xC5` when the 2-byte prefix is present |\n| `vexc5b`             | `struct`          |  |\n| `_vex.val5`          | `uint8_t`         | The byte after `0xC5`, with its fields described below |\n| `_vex.vexc5b.vex_pp` | `uint8_t`         | Equivalent to a SIMD prefix: `00`: none, `01`: 0x66, `02`: 0xF3, `03`: 0xF2 |\n| `_vex.vexc5b.vex_l`  | `uint8_t`         | 0 for 128-bit vector or 1 for 256-bit vector |\n| `_vex.vexc5b.vex_v`  | `uint8_t`         | An additional operand for the instruction |\n| `_vex.vexc5b.vex_r`  | `uint8_t`         |  |\n| `_vex.val4`          | `uint16_t`        |  |\n| `_vex.vexc4b.vex_pp` | `uint8_t`         |  |\n| `_vex.vexc4b.vex_l`  | `uint8_t`         |  |\n| `_vex.vexc4b.vex_v`  | `uint8_t`         |  |\n| `_vex.vexc4b.vex_r`  | `uint8_t`         |  |\n| `_vex.vexc4b.vex_m`  | `uint8_t`         | Values: 00001: implied 0F leading opcode byte, 
00010: implied 0F 38 leading opcode bytes, 00011: implied 0F 3A leading opcode bytes. Other values will #UD. |\n| `_vex.vexc4b.vex_b`  | `uint8_t`         |  |\n| `_vex.vexc4b.vex_x`  | `uint8_t`         |  |\n| `_vex.vexc4b.vex_r`  | `uint8_t`         |  |\n\nFor all the details about the VEX prefix, see section **2.3.5 The VEX Prefix** of the Intel Developer Manual Vol.2 [\u003csup\u003e[2]\u003c/sup\u003e](#user-content-res2).\n___\n\n## Examples\n\n#### A practical example: sum of two vectors using SIMD instruction\n\nLet's look at a practical example: the sum of two vectors (using inline assembly):\n\n```C\n  // ... omitted code ...\n  int vect1[LEN] = {1,2,3,4,5,6,7,8,9,10,11,12};\n  int vect2[LEN] = {1,2,3,4,5,6,7,8,9,10,11,12};\n  int res_vect1[LEN];\n\n  __asm\n  {\n    lea      eax, vect1\n    lea      ebx, vect2\n    xor      ecx, ecx\n\n    _while:\n    cmp      ecx, LEN * 4\n    jge      _end\n\n      movups   xmm0, [eax + ecx]\n      movups   xmm1, [ebx + ecx]\n      addps    xmm0, xmm1\n      movups   [res_vect1 + ecx], xmm0\n      add      ecx, 4\n      jmp      _while\n\n    _end:  \n  }\n  // ... omitted code ...\n```\n\nCompiling with the MS compiler (with the `/Ot` flag) produces the following:\n\n```Assembly\nCPU Disasm\nAddress   Hex dump                    Command                                  Comments\n008910BC  |.  C785 68FFFFFF 00000000  MOV DWORD PTR SS:[LOCAL.38],0\n008910C6  |.  8D45 CC                 LEA EAX,[LOCAL.13]\n008910C9  |.  8D5D 9C                 LEA EBX,[LOCAL.25]\n008910CC  |.  33C9                    XOR ECX,ECX\n008910CE  |\u003e  83F9 30                 /CMP ECX,30\n008910D1  |.  7D 18                   |JGE SHORT 008910EB\n008910D3  |.  0F100408                |MOVUPS XMM0,DQWORD PTR DS:[ECX+EAX]\n008910D7  |.  0F100C0B                |MOVUPS XMM1,DQWORD PTR DS:[ECX+EBX]\n008910DB  |.  0F58C1                  |ADDPS XMM0,XMM1\n008910DE  |.  
0F11840D 6CFFFFFF       |MOVUPS DQWORD PTR SS:[ECX+EBP-94],XMM0\n008910E6  |.  83C1 04                 |ADD ECX,4\n008910E9  |.^ EB E3                   \\JMP SHORT 008910CE\n\n```\n\nWe can write some sample code that uses x64ID to read and print the instructions.\n\n```C\nint offset = 0x4bc;\nint parse_bytes = 0x2a;\nint byte_reads = 0;\n\nwhile(byte_reads \u003c= parse_bytes) {\n    struct instruction instr;\n    x64id_decode(\u0026instr, arch, (char*)data_buffer, offset);\n    \n    // Print the raw bytes (instr.instr is available only with _ENABLE_RAW_BYTES)\n    for(int i=0; i\u003cinstr.length; i++)\n        printf(\"%02X \", instr.instr[i]);\n        \n    printf(\"\\n\");\n    offset += instr.length;\n    byte_reads += instr.length;\n}\n```\n\nThis is the output printed for the \"sum of two vectors\" code above (each line is an instruction):\n\n```\nC7 85 68 FF FF FF 00 00 00 00\n8D 45 CC\n8D 5D 9C\n33 C9\n83 F9 30\n7D 18\n0F 10 04 08\n0F 10 0C 0B\n0F 58 C1\n0F 11 84 0D 6C FF FF FF\n83 C1 04\n```\n\nOf course, you can gather more information about each instruction.\nBelow is a sample detailed report produced by x64ID for two of the instructions above, instructions 1 and 10:\n\n```\n/**\n * Ref only - Instruction Line 1:\n * MOV DWORD PTR SS:[LOCAL.38],0\n */\n\nRAW bytes (hex): C7 85 68 FF FF FF 00 00 00 00\nInstr. length: 10\nPrint instruction fields:\n        Located Prefixes 0:\n\n        OP: 0xC7\n        mod_reg_rm: 0x85\n        disp (4): 0xFFFFFF68\n        Iimm: 0x0\n\n\n/**\n * Ref only - Instruction Line 10:\n * MOVUPS DQWORD PTR SS:[ECX+EBP-94],XMM0\n */\n\nRAW bytes (hex): 0F 11 84 0D 6C FF FF FF\nInstr. 
length: 8\nPrint instruction fields:\n        Located Prefixes 1:\n                0xF\n        OP: 0x11\n        mod_reg_rm: 0x84\n        SIB byte: 0xD\n        disp (4): 0xFFFFFF6C\n```\n\n___\n\n#### Another example: architecture x64, VEX prefix with YMM register\n\n`vmovsldup  ymm1, [rbp*4 + var]`\n\nThe assembled output is:\n\n`C5 FE 12 0C AD 00 10 00 00`\n\nOutput after x64ID parsing:\n\n```\nRAW bytes (hex): C5 FE 12 0C AD 00 10 00 00\nInstr. length: 9\nPrint instruction fields:\n        Located Prefixes 0:\n\n        VEX prefix value:\n                0xC5 0xFE\n        Field 0xFE:\n                r: 1\n                v: F\n                L: 1\n                pp: 2\n\n        OP: 0x12\n        mod_reg_rm: 0xC\n        SIB byte: 0xAD\n        disp (4): 0x1000\n```\n\n## Enabling / Disabling features\n\nSome features can be toggled by commenting or uncommenting these lines:\n\n```C\n#define  _ENABLE_RAW_BYTES\n#define  _ENABLE_VEX_INFO\n```\n\n| Define              | Description |\n|:-------------------:|:------------|\n| `_ENABLE_RAW_BYTES` | When enabled, the raw bytes of each instruction are stored in the `instr` struct passed to `x64id_decode` (`instr.instr`) |\n| `_ENABLE_VEX_INFO`  | When enabled, VEX information is stored in `instr`; see the previous example for more |\n\n## Find length of a function\n\nAn extension has been added to compute the length of a specified function:\n\n```C\npFunctionInfo getFunctionLength(char *buffer, enum supported_architecture arch);\n```\n\n`pFunctionInfo` is an anonymous struct defined as follows:\n\n| Field Name        | Type       | Description |\n|-------------------|:----------:|-------------|\n| `pVisited`     | `vector *`  | A pointer to a `vector` data structure |\n| `length`   | `int`  | The length of the function in bytes  |\n\nThe `vector` data structure is a dynamic array with three members:\n\n| Field Name        | Type       | Description |\n|-------------------|:----------:|-------------|\n| `vect`     | 
`uint32_t *`  | Contains the detected addresses  |\n| `size`   | `int`  | Allocated memory of the array |\n| `tos`   | `int`  | Index of the last inserted element |\n\nAn example can be found in `main.c`, function `in_memory()`.\n\n\u003e :information_source: **Notes**\n\u003e Jump tables are not handled; be careful when you use switch-case statements and compile with MSVC (GCC/MinGW seem to use other techniques).\n\u003e Handling jump tables requires heuristics (e.g. as IDA and other tools do) and more information about the target.\n\n## Tests\n\nAfter searching for a better solution, I came back to one of the first things I had thought of: assembly.\n\nTests have been written using NASM and must be assembled using the \"bin\" output format:\n\n```sh\nnasm -f bin \u003cfilename.asm\u003e\n```\n\nThe tested instructions are the following:\n\n* x86: `1-byte OP`, `2-byte OP`, `3-byte OP` and `2-byte OP with VEX prefix`\n* x64: `1-byte OP`, `2-byte OP`, `3-byte OP`; some of which have `VEX prefix`\n\nTests have been written by hand using the Intel Developer Manual [\u003csup\u003e[2]\u003c/sup\u003e](#user-content-res2).\nI can't guarantee 100% coverage; however, all the opcodes have been tested.\n\n## Useful resources\n\n- [X86-64 Instruction Encoding](https://wiki.osdev.org/X86-64_Instruction_Encoding)\n- [x86_64 Instruction Table](https://c9x.me/x86/)\n- [Compiler Explorer](https://godbolt.org/)\n\n## Notes\n\n\u003csup id=\"res1\"\u003e[1]\u003c/sup\u003e VM protection refers to a code obfuscator that converts x86/x64 machine code into \"virtual opcodes\" that are understandable by a VM. Two commercial examples are [VMProtect](https://vmpsoft.com/) and [CodeVirtualizer](https://www.oreans.com/codevirtualizer.php).\n\n\u003csup id=\"res2\"\u003e[2]\u003c/sup\u003e [Intel Developer Manual (2nd book)](https://software.intel.com/content/dam/develop/public/us/en/documents/334569-sdm-vol-2d.pdf)\n\n___\n\n_Crafted with ❤ by DispatchCode. 
Documentation created along with [Alexander Cerutti](https://github.com/alexandercerutti)_\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdispatchcode%2Fx64-instruction-decoder","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdispatchcode%2Fx64-instruction-decoder","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdispatchcode%2Fx64-instruction-decoder/lists"}