{"id":14990691,"url":"https://github.com/ziglang/glibc-abi-tool","last_synced_at":"2025-04-04T18:05:00.585Z","repository":{"id":39651099,"uuid":"437161360","full_name":"ziglang/glibc-abi-tool","owner":"ziglang","description":"A repository that collects glibc .abilist files for every version and a tool to combine them into one dataset.","archived":false,"fork":false,"pushed_at":"2025-01-31T13:05:42.000Z","size":1872,"stargazers_count":172,"open_issues_count":0,"forks_count":12,"subscribers_count":7,"default_branch":"main","last_synced_at":"2025-03-28T17:06:28.859Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Zig","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ziglang.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2021-12-11T01:45:30.000Z","updated_at":"2025-03-21T21:25:52.000Z","dependencies_parsed_at":"2025-03-21T16:12:49.515Z","dependency_job_id":null,"html_url":"https://github.com/ziglang/glibc-abi-tool","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ziglang%2Fglibc-abi-tool","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ziglang%2Fglibc-abi-tool/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ziglang%2Fglibc-abi-tool/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ziglang%2Fglibc-abi-tool/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/ziglang","d
ownload_url":"https://codeload.github.com/ziglang/glibc-abi-tool/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247226213,"owners_count":20904465,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-09-24T14:20:36.289Z","updated_at":"2025-04-04T18:04:56.757Z","avatar_url":"https://github.com/ziglang.png","language":"Zig","readme":"# glibc ABI Tool\n\nThis repository contains `.abilist` files from every version of glibc. These\nfiles are consolidated to generate a single 241 KB symbol mapping file that is\nshipped with Zig to target any version of glibc. This repository is for Zig\nmaintainers to use when a new glibc version is tagged upstream; Zig users have\nno need for this repository.\n\n## Adding new glibc version `.abilist` files\n\n1. Clone glibc\n\n```sh\ngit clone git://sourceware.org/git/glibc.git\n```\n\n2. Check out the new glibc version git tag, e.g. `glibc-2.39`.\n\n3. Run the tool to grab the new abilist files:\n\n```sh\nzig run collect_abilist_files.zig -- $GLIBC_GIT_REPO_PATH\n```\n\n4. This mirrors the directory structure into the `glibc` subdirectory,\n   namespaced under the version number, but only copying files with the\n   .abilist extension.\n\n5. Inspect the changes and then commit these new files into git.\n\n## Updating Zig\n\n1. Add the new glibc versions to the `versions` global constant.\n\n2. 
Run `consolidate.zig` at the root of this repo.\n\n```sh\nzig run consolidate.zig\n```\n\nThis will generate the file `abilists` which you can then inspect to make sure\nit is OK. Copy it to `$ZIG_GIT_REPO_PATH/lib/libc/glibc/abilists`.\n\n## Debugging an abilists file\n\n```sh\nzig run list_symbols.zig -- abilists\n```\n\n## Strategy\n\nThe abilist files from the latest glibc are *almost* enough to completely\nencode all the information that we need to generate the symbols db. The only\nproblem is when a function migrates from one library to another. For example,\nin glibc 2.32, the function `pthread_sigmask` migrated from libpthread to libc,\nand the latest abilist files only show it in libc. However, if a user targets\nglibc 2.31, Zig needs to know to put the symbol into libpthread.so and not\nlibc.so.\n\nIn glibc upstream, they simply renamed the abilist files from pthread.abilist to\nlibc.abilist. This resulted in the following line being present in libc.abilist\nin glibc 2.32 and later:\n\n```\nGLIBC_2.0 pthread_sigmask F\n```\n\nThis implies that in glibc 2.0, libc.so has the `pthread_sigmask` symbol, which\nis incorrect, because it was only found in libpthread.so.\n\nThis is why this repository contains abilist files from all past\nversions of glibc as well as the most recent one - it allows us to\ndetect this situation, and generate a corrected symbols database.\n\nThe strategy is to start with the earliest glibc version, consume the abilist\nfiles, and then treat that data as correct. Next we move on to the next\nearliest glibc version, but now we have to detect a contradiction: if the newer\nglibc version claims that e.g. `pthread_sigmask` is available in glibc 2.0,\nwhen our correct data says that it is not, we ignore that incorrect piece of\ndata. 
However we must take in new data if the version it talks about is greater\nthan the version corresponding to the \"correct\" data set.\n\nAfter merging in the newer glibc version, we mark the current dataset as\n\"correct\" and move on to the next, and so on until we have processed all the\nsets of abilist files.\n\nWhen this process completes, we have in memory something that looks like this:\n\n* For each glibc symbol\n  * For each glibc library\n    * For each target\n      * For each glibc version\n        * Whether the symbol is absent, a function, or an object+size\n\nAnd our job is now to *encode* this information into a file that does not waste\ninstallation size and yet remains simple to decode and use in the Zig compiler.\n\n### Inclusions\n\nNext, the script generates the minimal number of \"inclusions\" to encode all the\ninformation. An \"inclusion\" is:\n\n * A symbol name.\n * The set of targets this inclusion applies to.\n * The set of glibc versions this inclusion applies to.\n * The set of libraries this inclusion applies to.\n * Whether it is a function or object, and if an object, its size in bytes.\n\nAs an example, consider `dlopen`. An inclusion is something like this:\n\n * `dlopen`\n * targets: aarch64-linux-gnu powerpc64le-linux-gnu\n * versions: 2.17 2.34\n * libraries: libdl.so\n * type: function\n\nThis does not cover all the places `dlopen` can be found however. There will\nneed to be more inclusions for more targets, for example:\n\n * `dlopen`\n * targets: x86_64-linux-gnu\n * versions: 2.2.5 2.34\n * libraries: libdl.so\n * type: function\n\nNow we have more coverage of all the places `dlopen` can be found, but there are\nyet more that need to be emitted. 
The script emits as many inclusions as\nnecessary so that all the information is represented.\n\nNext we make a few observations which lead to a more compact data encoding.\n\n### Observation: All symbols are consistently either functions or objects\n\nThere is no symbol that is a function on one target, and an object on another\ntarget. Similarly there is no symbol that is a function on one glibc version,\nbut an object in another, and there is no symbol that is a function in one\nshared library, but an object in another.\n\nWe exploit this by encoding function and object symbols in separate lists.\n\n### Observation: Over half of the objects are exactly 4 bytes\n\n51% of all object entries are 4 bytes, and 68% of all object entries are either\n4 or 8 bytes.\n\nTotal object inclusions are 765. If we stored 4 and 8 byte objects in separate\nlists, this would save 2 bytes from 520 inclusions, totaling 1 KB. Not worth it.\n\n### Observation: Average number of different versions per inclusion is 1.02\n\nNearly every inclusion has exactly 1 version attached to it, rarely more.\nThis makes a u64 bitset uneconomical. With 19530 total inclusions, this comes\nout to 153 KB spent on the version bitset. However, if we encoded it as one byte\nper version, using 1 bit of the byte to indicate the terminal item, this would\nbring the 153 KB down to 19 KB. That is almost a 50% reduction from the total\nsize of the encoded abilists file. Definitely worth it.\n\n## Binary encoding format\n\nAll integers are stored little-endian.\n\n- u8 number of glibc libraries (7). For each:\n  - null-terminated name, e.g. \"c\", \"m\", \"dl\", \"ld\", \"pthread\"\n- u8 number of glibc versions (44), sorted ascending. For each:\n  - u8 major\n  - u8 minor\n  - u8 patch\n- u8 number of targets (27). 
For each:\n  - null-terminated target triple\n- u16 number of function inclusions (24536)\n  - null-terminated symbol name (not repeated for subsequent same symbol inclusions)\n  - Set of Unsized Inclusions\n- u16 number of object inclusions (912)\n  - null-terminated symbol name (not repeated for subsequent same symbol inclusions)\n  - Set of Sized Inclusions\n\nSet of Unsized Inclusions:\n  - uleb128 (u64) set of targets this inclusion applies to (1 \u003c\u003c INDEX_IN_TARGET_LIST)\n  - u8 index of glibc library this inclusion applies to\n    - last inclusion is indicated if 1 \u003c\u003c 7 bit is set in library index\n  - [N]u8 set of glibc versions this inclusion applies to. MSB set indicates last.\n\nSet of Sized Inclusions:\n  - uleb128 (u64) set of targets this inclusion applies to (1 \u003c\u003c INDEX_IN_TARGET_LIST)\n  - uleb128 (u16) object size\n  - u8 index of glibc library this inclusion applies to\n    - last inclusion is indicated if 1 \u003c\u003c 7 bit is set in library index\n  - [N]u8 set of glibc versions this inclusion applies to. MSB set indicates last.\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fziglang%2Fglibc-abi-tool","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fziglang%2Fglibc-abi-tool","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fziglang%2Fglibc-abi-tool/lists"}