{"id":24047709,"url":"https://github.com/kernelmaker/ibdninja","last_synced_at":"2025-04-22T13:19:48.449Z","repository":{"id":271370484,"uuid":"912190338","full_name":"KernelMaker/ibdNinja","owner":"KernelMaker","description":"A powerful C++ tool for parsing and analyzing MySQL 8.0 (.ibd) data files","archived":false,"fork":false,"pushed_at":"2025-01-18T20:25:14.000Z","size":745,"stargazers_count":31,"open_issues_count":0,"forks_count":9,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-04-22T13:19:41.109Z","etag":null,"topics":["innodb","innodb-row-format","mysql","mysql-data-dictionary"],"latest_commit_sha":null,"homepage":"","language":"C","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/KernelMaker.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2025-01-04T21:26:45.000Z","updated_at":"2025-04-14T02:03:40.000Z","dependencies_parsed_at":"2025-01-07T10:21:24.745Z","dependency_job_id":"e728461e-59c2-49a6-b8ea-2d0dc091d10f","html_url":"https://github.com/KernelMaker/ibdNinja","commit_stats":null,"previous_names":["kernelmaker/ibdninja"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/KernelMaker%2FibdNinja","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/KernelMaker%2FibdNinja/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/KernelMaker%2FibdNinja/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/KernelMaker%2FibdNinja/manifests","owner
_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/KernelMaker","download_url":"https://codeload.github.com/KernelMaker/ibdNinja/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":250246731,"owners_count":21398919,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["innodb","innodb-row-format","mysql","mysql-data-dictionary"],"created_at":"2025-01-09T00:50:34.942Z","updated_at":"2025-04-22T13:19:48.442Z","avatar_url":"https://github.com/KernelMaker.png","language":"C","funding_links":[],"categories":[],"sub_categories":[],"readme":"# ibdNinja 🥷\nA powerful C++ tool for parsing and analyzing MySQL 8.0 (.ibd) data files\n\n[中文](https://github.com/KernelMaker/ibdNinja/blob/main/README_CN.md)\n\n**Contents**\n\n**1. Key Features of ibdNinja**\n\n**2. Examples of ibdNinja Usage**\n\n**3. Highlight: Parsing Records with Instant Add/Drop Columns**\n\n**4. Limitations**\n\n\n\n# 1. Key Features of ibdNinja\n\n### 1. Parsing SDI Metadata\n\nExtracts and analyzes the dictionary information of all tables and indexes contained in an ibd file from its SDI (Serialized Dictionary Information).\n\n### 2. 
Dynamic Parsing of Records Across Multiple Table Definition Versions *\n\nWith the parsed dictionary information, ibdNinja supports parsing and printing **any record** from **any page** of **any index\\*** in **any table\\*** (supporting all column types).\n**Moreover, it can dynamically adapt to parse records in tables with multiple coexisting schema versions caused by repeated `instant add column` and `instant drop column` operations.**\n\n***Detailed explanations and examples are provided in*** [Section 3](#third-section)\n\n\n### 3. Multi-Dimensional Data Analysis\n\nPowered by its record parsing capabilities, ibdNinja enables comprehensive data analysis across multiple levels, including Record, Page, Index, and Table levels. It computes and presents multi-dimensional statistics:\n\n**Record Level:**\n\n- Total size of the record (header + body), the number of fields, and whether the record contains a deleted mark.\n- Hexadecimal content of the header.\n- Detailed information for each field (including user-defined columns, system columns, and instant added/dropped columns), such as:\n  - Field name\n  - Field size in bytes\n  - Field type\n  - **Hexadecimal content of the field value**\n\n**Page Level:**\n\n- The number of valid records, their total size, and the percentage of page space they occupy.\n- The count of records containing `instant dropped columns` and the size and page space percentage of these dropped but still allocated columns.\n- The count, total size, and page space percentage of records marked as deleted.\n- The space utilized internally by InnoDB (e.g., page header, **record headers**, page directory), along with its percentage of the page.\n- The size and percentage of free space within the page.\n\n**Index Level:**\n\n- For a specific index, analyzes and aggregates statistics for all its pages starting from the root page.\n- Statistics are presented separately for non-leaf levels and leaf levels, similar to the statistics provided at the 
page level.\n\n**Table Level:**\n\n- For a given table, starts from its primary index and analyzes each index to display its statistics.\n\n### 4. Printing Leftmost Pages of Each Index Level\n\nAllows users to print the leftmost page number of each level for a specified index, making it easier to manually traverse and print every record in the index page by page.\n\n### 5. [TODO] Repairing Corrupted ibd Files\n\nWith ibdNinja's capability to parse records, it is possible to address ibd files with corrupted index pages. By removing damaged records from pages or excluding corrupted pages from indexes, the tool can attempt to recover the file to the greatest extent possible.\n\n\n# 2. Examples of ibdNinja Usage\n\n**Compiling is straightforward: just run `make` in the current directory.**\n\n### 1. Display Help Information (`--help`, `-h`)\n\n\u003cimg src=\"https://github.com/KernelMaker/kernelmaker.github.io/blob/master/public/images/ibdNinja-diagram/1.png\" alt=\"image-1\" width=\"80%\" /\u003e\n\n### 2. List Tables and Indexes in the ibd File (`--list-tables`, `-l`)\n\nUsing the data dictionary tablespace file **mysql.ibd** as an example, after specifying the file with the `--file` or `-f` option, the output provides:\n\n\u003cimg src=\"https://github.com/KernelMaker/kernelmaker.github.io/blob/master/public/images/ibdNinja-diagram/2.png\" alt=\"image-2\" width=\"60%\" /\u003e\n\n1. A summary of the ibd file, including the number of tables and indexes successfully parsed and loaded from the data dictionary.\n2. The table IDs and names of all tables in the file.\n3. For each table, all index IDs, root page numbers, and index names.\n\nWith this information, you can explore the ibd file further using other commands.\n\n### 3. 
Parse and Print a Specific Page (`--parse-page`, `-p PAGE_ID`)\n\nContinuing with **mysql.ibd** as an example, let’s parse the root page of the `PRIMARY` index for the `mysql.innodb_index_stats` table (its root page number is 7, as shown in the previous example).\n\nRun the following command:\n\n```\n./ibdNinja -f ../innodb-run/mysqld/data/mysql.ibd -p 7\n```\n\nThe output consists of three parts:\n\n1. **Page Summary:** Information such as sibling page numbers (left and right), the index the page belongs to, the page level, etc.\n\n\u003cimg src=\"https://github.com/KernelMaker/kernelmaker.github.io/blob/master/public/images/ibdNinja-diagram/3.png\" alt=\"image-3\" width=\"60%\" /\u003e\n\n2. **Record Details:** For each record in the page, details like:\n\n\u003cimg src=\"https://github.com/KernelMaker/kernelmaker.github.io/blob/master/public/images/ibdNinja-diagram/4.png\" alt=\"image-4\" width=\"60%\" /\u003e\n\n- Total length of the record (header + body), field count, and whether it has a delete mark.\n- A hexadecimal dump of the record header.\n- Detailed information for each field (e.g., name, length, type, and the hexadecimal value).\n\n3. **Page Analysis Summary:** Includes statistics such as:\n\n\u003cimg src=\"https://github.com/KernelMaker/kernelmaker.github.io/blob/master/public/images/ibdNinja-diagram/5.png\" alt=\"image-5\" width=\"60%\" /\u003e\n\n- Number, total size, and space usage percentage of valid records.\n- Number and size of records with `instant dropped columns`, as well as their space usage percentage.\n- Number and size of delete-marked records, and their space usage percentage.\n- Space used by InnoDB internal components (e.g., page headers), along with their percentages.\n- Free space size and percentage.\n\n### 4. 
Analyze a Specific Index (`--analyze-index`, `-i INDEX_ID`)\n\nUsing **mysql.ibd** again, first obtain the table and index information using the `--list-tables` (`-l`) command.\n\n\u003cimg src=\"https://github.com/KernelMaker/kernelmaker.github.io/blob/master/public/images/ibdNinja-diagram/6.png\" alt=\"image-6\" width=\"60%\" /\u003e\n\nFor example, the `mysql.tables` table has an ID of 29 and contains 10 indexes. To analyze the `PRIMARY` index (ID 78), run:\n\n```\n./ibdNinja -f ../innodb-run/mysqld/data/mysql.ibd -i 78\n```\n\n\u003cimg src=\"https://github.com/KernelMaker/kernelmaker.github.io/blob/master/public/images/ibdNinja-diagram/7.png\" alt=\"image-7\" width=\"60%\" /\u003e\n\nibdNinja traverses the `PRIMARY` index from its root page, analyzing it level by level and page by page, then summarizes the statistics:\n\n1. **Overview:** Includes the index name, number of levels, and number of pages.\n2. **Non-Leaf Levels Statistics:** Provides page count, record count, and various space usage details.\n3. **Leaf Level Statistics:** Similar to the above, but specific to the leaf level.\n\n### 5. Analyze a Specific Table (`--analyze-table`, `-t TABLE_ID`)\n\nUsing **mysql.ibd** again, first run the `--list-tables` (`-l`) command to get table and index information.\nFor the `mysql.tables` table with an ID of 29, execute:\n\n```\n./ibdNinja -f ../innodb-run/mysqld/data/mysql.ibd -t 29\n```\n\nThis command analyzes all 10 indexes of the `mysql.tables` table and outputs their statistics. Each index's structure is similar to the output of `--analyze-index`.\n\n### 6. 
List the Leftmost Page Number for Each Level of an Index (`--list-leafmost-pages`, `-e INDEX_ID`)\n\nContinuing with the **mysql.ibd** example, the `PRIMARY` index of the `mysql.tables` table has an ID of 78.\nRun the following command:\n\n```\n./ibdNinja -f ../innodb-run/mysqld/data/mysql.ibd -e 78\n```\n\n\u003cimg src=\"https://github.com/KernelMaker/kernelmaker.github.io/blob/master/public/images/ibdNinja-diagram/8.png\" alt=\"image-8\" width=\"60%\" /\u003e\n\nThe output shows the leftmost page number for each level of the index. For example:\n\n- Level 1 (non-leaf) has a leftmost page number of 82.\n- Level 0 (leaf level) has a leftmost page number of 161.\n\nYou can then use the `--parse-page` (`-p PAGE_NO`) command to print detailed information for these pages. From the sibling page numbers, you can continue parsing the left and right pages to traverse the entire index.\n\n***Note:*** To skip printing record details for a page (e.g., to avoid excessive output), use the `--no-print-record` (`-n`) option along with `-p`, as in: `-p 161 -n`\n\n\n\u003ca name=\"third-section\"\u003e\u003c/a\u003e\n# 3. 
Highlight: Parsing Records with Instant Add/Drop Columns\n\n### Table Setup\n\nWe start by creating a table:\n\n```\nCREATE TABLE `ninja_tbl` (\n  `col_uint` int unsigned NOT NULL,\n  `col_datetime_0` datetime DEFAULT NULL,\n  `col_varchar` varchar(10) DEFAULT NULL,\n  PRIMARY KEY (`col_uint`)\n) ENGINE=InnoDB;\n```\n\nBased on the current table definition (V1), we insert one record:\n\n```\nINSERT INTO ninja_tbl values (1, NOW(), \"Row_V1\");\n```\n\nNext, we use `ALTER TABLE` to add two columns to the table:\n\n```\nALTER TABLE ninja_tbl ADD COLUMN col_datetime_6 datetime(6);\nALTER TABLE ninja_tbl ADD COLUMN col_char char(10) DEFAULT \"abc\";\n```\n\nBased on the updated table definition (V2), we insert another record:\n\n```\nINSERT INTO ninja_tbl values (2, NOW(), \"Row_V2\", NOW(), \"ibdNinja\");\n```\n\nThen, we drop two columns from the table:\n\n```\nALTER TABLE ninja_tbl DROP COLUMN col_varchar;\nALTER TABLE ninja_tbl DROP COLUMN col_char;\n```\n\nFinally, based on the updated table definition (V3), we insert a third record:\n\n```\nINSERT INTO ninja_tbl values (3, NOW(), NOW());\n```\n\n### Parsing Records with ibdNinja\n\nThrough the operations above, we constructed three different table definitions (V1, V2, V3) and inserted one record for each version. Now, let’s use ibdNinja to parse these three records. Since there are only three records, the primary key index of `ninja_tbl` fits into a single page, its root page (page number 4). We can directly use the `-p` command to parse this page. Here, we skip most of the output and focus on the parsed records:\n\n1. **Record 1:**\n\n   \u003cimg src=\"https://github.com/KernelMaker/kernelmaker.github.io/blob/master/public/images/ibdNinja-diagram/9.png\" alt=\"image-9\" width=\"60%\" /\u003e\n\n   - **FIELD 1 (col_uint):** The value is `1`, inserted under table definition V1.\n   - **FIELD 5 (col_varchar):** This field was defined in V1 and is part of record 1. 
Its value is present but marked as `!hidden!_dropped_v3_p4_col_varchar` because the column was instantly dropped in V3. Although hidden from queries, the data remains in the page.\n   - **FIELD 6 (col_datetime_6):** Added in V2, this field has no value in record 1, as it did not exist when the record was inserted (length is 0).\n   - **FIELD 7 (col_char):** Also added in V2 and dropped in V3, this field has no value in record 1 for the same reason.\n\n2. **Record 2:**\n\n   \u003cimg src=\"https://github.com/KernelMaker/kernelmaker.github.io/blob/master/public/images/ibdNinja-diagram/10.png\" alt=\"image-10\" width=\"60%\" /\u003e\n\n   - **FIELD 1 (col_uint):** The value is `2`, inserted under table definition V2.\n   - **FIELD 5 (col_varchar):** This column was defined in V1 and dropped in V3. Since record 2 was inserted before the drop, it still contains a value.\n   - **FIELD 6 (col_datetime_6):** Added in V2, this field contains a value for record 2.\n   - **FIELD 7 (col_char):** Added in V2 and dropped in V3, this field also contains a value for record 2.\n\n3. **Record 3:**\n\n   \u003cimg src=\"https://github.com/KernelMaker/kernelmaker.github.io/blob/master/public/images/ibdNinja-diagram/11.png\" alt=\"image-11\" width=\"60%\" /\u003e\n\n   - **FIELD 1 (col_uint):** The value is `3`, inserted under table definition V3.\n   - **FIELD 5 and FIELD 7:** Both fields were dropped in V3. 
Since record 3 was inserted after the drop, these fields are empty.\n\n### Page Analysis Results\n\nThe page analysis output highlights the following key details:\n\n\u003cimg src=\"https://github.com/KernelMaker/kernelmaker.github.io/blob/master/public/images/ibdNinja-diagram/12.png\" alt=\"image-12\" width=\"60%\" /\u003e\n\n- As shown in the red box of the analysis, two records in the page still contain old values for columns that were dropped.\n- The analysis shows the total size and percentage of space wasted due to these dropped columns.\n\nThis information helps to quantify the space overhead caused by instant column drops.\n\nSimilarly, if the page contains delete-marked records, their size and percentage are also displayed.\n\nThese statistics are not only available at the page level but can also be aggregated at the index level using the `--analyze-index` (`-i INDEX_ID`) command.\n\n# 4. Limitations\n\nThis is the first version I developed during the Christmas holiday, so there are some functional limitations and potential bugs (feel free to raise issues):\n\n1. **Supported MySQL Versions**:\n\n   Currently supports MySQL 8.0 (8.0.16 - 8.0.40).\n\n   *(Earlier versions of MySQL 8.0, prior to 8.0.16, contain a bug in SDI generation that leads to missing metadata in `dd_object::indexes::elements`.)*\n\n2. **Supported Platforms**:\n\n   Currently supports Linux and macOS.\n\n   *(I don't have Windows.)*\n\n3. 
**Functional Limitations**:\n\n   **Tablespace:**\n   - Encrypted tablespaces are not yet supported.\n\n   **Table:**\n   - Tables using table compression or page compression are not yet supported.\n   - Encrypted tables are not yet supported.\n   - Partitioned tables are not yet supported.\n   - Auxiliary and common index tables of FTS are not yet supported.\n\n   **Index:**\n   - Full-text indexes are not yet supported (only `FTS_DOC_ID_INDEX` is supported).\n   - Spatial indexes are not yet supported.\n   - Indexes using virtual columns as key columns are not yet supported.\n\n   **Page:**\n   - Only `INDEX` pages (those in B+Tree) are currently supported.\n\n   **Record:**\n   - Records in the `redundant` row format are not yet supported.\n\n*Note: ibdNinja currently analyzes only the InnoDB data pages already written to the ibd file. Page changes recorded in the redo log but not yet flushed to the ibd file are not included in the statistics, so the results may be inaccurate when there are many dirty pages.*\n\n\n\nSpecial thanks to [MySQL](https://github.com/mysql/mysql-server) for being an invaluable reference in developing ibdNinja.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkernelmaker%2Fibdninja","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fkernelmaker%2Fibdninja","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkernelmaker%2Fibdninja/lists"}