{"id":23739628,"url":"https://github.com/graysky2/kernel_compiler_patch","last_synced_at":"2025-12-24T14:18:39.312Z","repository":{"id":6178348,"uuid":"7408505","full_name":"graysky2/kernel_compiler_patch","owner":"graysky2","description":"Kernel patch enables compiler optimizations for additional CPUs.","archived":false,"fork":false,"pushed_at":"2025-08-19T08:34:37.000Z","size":268,"stargazers_count":703,"open_issues_count":3,"forks_count":80,"subscribers_count":71,"default_branch":"master","last_synced_at":"2025-09-04T15:46:24.472Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"Shell","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/graysky2.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2013-01-02T15:15:32.000Z","updated_at":"2025-08-19T08:34:41.000Z","dependencies_parsed_at":"2023-12-27T13:35:34.546Z","dependency_job_id":"60a556dc-9e4f-4edf-a1b3-2d4e3276724c","html_url":"https://github.com/graysky2/kernel_compiler_patch","commit_stats":{"total_commits":144,"total_committers":16,"mean_commits":9.0,"dds":"0.17361111111111116","last_synced_commit":"9d14420af9da0dc0715d769232e0bcd8fa16b096"},"previous_names":[],"tags_count":41,"template":false,"template_full_name":null,"purl":"pkg:github/graysky2/kernel_compiler_patch","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/graysky2%2Fkernel_compiler_patch","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gra
ysky2%2Fkernel_compiler_patch/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/graysky2%2Fkernel_compiler_patch/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/graysky2%2Fkernel_compiler_patch/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/graysky2","download_url":"https://codeload.github.com/graysky2/kernel_compiler_patch/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/graysky2%2Fkernel_compiler_patch/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":28003727,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-12-24T02:00:07.193Z","response_time":83,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-12-31T09:37:03.936Z","updated_at":"2025-12-24T14:18:39.296Z","avatar_url":"https://github.com/graysky2.png","language":"Shell","funding_links":[],"categories":["Shell"],"sub_categories":[],"readme":"# kernel_compiler_patch\n\n## Too lazy to update readme\n\nThe option to build a kernel with [-march=native](https://github.com/torvalds/linux/commit/914873bc7df913db988284876c16257e6ab772c6) was merged and included in kernel version 6.16. 
This makes the patch in [section 2](https://github.com/graysky2/kernel_compiler_patch?tab=readme-ov-file#2-new-micro-architectures-levels) necessary only if you want to build an optimized kernel (say, for zen5) on a machine whose CPU does not match that target. I am too lazy to re-write this readme.\n\nThe ISA patch in [section 1](https://github.com/graysky2/kernel_compiler_patch?tab=readme-ov-file#1-new-generic-x86-64-isa-levels) is still relevant if you want to build with `-march=x86-64-v2` or `-march=x86-64-v3` for generic kernels that will run on numerous supported CPUs.\n\n## Why a specific patch?\nThe kernel uses its own set of CFLAGS, KCFLAGS. For example, see:\n* [arch/x86/Makefile](https://github.com/torvalds/linux/blob/master/arch/x86/Makefile)\n* [arch/x86/Makefile_32.cpu](https://github.com/torvalds/linux/blob/master/arch/x86/Makefile_32.cpu)\n* [arch/x86/Kconfig.cpu](https://github.com/torvalds/linux/blob/master/arch/x86/Kconfig.cpu)\n\n\n### Alternative way to define a -march= option without this patch\nAs pointed out by codemac in [this topic](https://bbs.archlinux.org/viewtopic.php?id=281639), one can simply export values for `KCFLAGS` and `KCPPFLAGS` before calling `make` to achieve the same result; see [here](https://github.com/torvalds/linux/blob/88603b6dc419445847923fcb7fe5080067a30f98/Makefile#L1112).\n```\nexport KCFLAGS=' -march=znver3'\nexport KCPPFLAGS=' -march=znver3'\nmake all\n```\n\n## New tunings\nThese patches add tunings via new x86-64 ISA levels and more micro-architecture options to the Linux kernel in three broad classes.\n\n### 1. 
New generic x86-64 ISA levels\n\nWhen compiling the `Generic x86-64` Processor family target, these are selectable under:\n```\n Processor type and features ---\u003e x86-64 compiler ISA level\n```\n\n* x86-64     A value of (1) is the default and builds with the generic x86-64 ISA level.\n* x86-64-v2  A value of (2) brings support for vector instructions up to Streaming SIMD Extensions 4.2 (SSE4.2) and Supplemental Streaming SIMD Extensions 3 (SSSE3), the POPCNT instruction, and CMPXCHG16B.\n* x86-64-v3  A value of (3) adds vector instructions up to AVX2, MOVBE, and additional bit-manipulation instructions.\n\nx86-64-v4 does exist, but it adds vector instructions from some of the AVX-512 variants, which the kernel does not use, so including it does not make much sense.\n\nUsers of glibc 2.33 and above can see which level is supported by running one of the following:\n```\n/lib/ld-linux-x86-64.so.2 --help | grep supported\n/lib64/ld-linux-x86-64.so.2 --help | grep supported\n```\n### 2. New micro-architectures levels\n\nThese are selectable under:\n```\n Processor type and features ---\u003e Processor family\n```\n\u003ctable\u003e\n  \u003ctr\u003e\n    \u003cth\u003eCPU Family\u003c/th\u003e\n    \u003cth\u003e-march=\u003c/th\u003e\n    \u003cth\u003eMin GCC Ver\u003c/th\u003e\n    \u003cth\u003eMin Clang Ver\u003c/th\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd\u003eAMD Improved K8-family\u003c/td\u003e\n    \u003ctd\u003ek8-sse3\u003c/td\u003e\n    \u003ctd\u003e9.3\u003c/td\u003e\n    \u003ctd\u003e9.0\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd\u003eAMD K10-family\u003c/td\u003e\n    \u003ctd\u003eamdfam10\u003c/td\u003e\n    \u003ctd\u003e9.3\u003c/td\u003e\n    \u003ctd\u003e9.0\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd\u003eAMD Family 10h (Barcelona)\u003c/td\u003e\n    \u003ctd\u003ebarcelona\u003c/td\u003e\n    \u003ctd\u003e9.3\u003c/td\u003e\n    \u003ctd\u003e9.0\u003c/td\u003e\n  \u003c/tr\u003e\n 
 \u003ctr\u003e\n    \u003ctd\u003eAMD Family 14h (Bobcat)\u003c/td\u003e\n    \u003ctd\u003ebtver1\u003c/td\u003e\n    \u003ctd\u003e9.3\u003c/td\u003e\n    \u003ctd\u003e9.0\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd\u003eAMD Family 16h (Jaguar)\u003c/td\u003e\n    \u003ctd\u003ebtver2\u003c/td\u003e\n    \u003ctd\u003e9.3\u003c/td\u003e\n    \u003ctd\u003e9.0\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd\u003eAMD Family 15h (Bulldozer)\u003c/td\u003e\n    \u003ctd\u003ebdver1\u003c/td\u003e\n    \u003ctd\u003e9.3\u003c/td\u003e\n    \u003ctd\u003e9.0\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd\u003eAMD Family 15h (Piledriver)\u003c/td\u003e\n    \u003ctd\u003ebdver2\u003c/td\u003e\n    \u003ctd\u003e9.3\u003c/td\u003e\n    \u003ctd\u003e9.0\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd\u003eAMD Family 15h (Steamroller)\u003c/td\u003e\n    \u003ctd\u003ebdver3\u003c/td\u003e\n    \u003ctd\u003e9.3\u003c/td\u003e\n    \u003ctd\u003e9.0\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd\u003eAMD Family 15h (Excavator)\u003c/td\u003e\n    \u003ctd\u003ebdver4\u003c/td\u003e\n    \u003ctd\u003e9.3\u003c/td\u003e\n    \u003ctd\u003e9.0\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd\u003eAMD Family 17h (Zen)\u003c/td\u003e\n    \u003ctd\u003eznver1\u003c/td\u003e\n    \u003ctd\u003e9.3\u003c/td\u003e\n    \u003ctd\u003e9.0\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd\u003eAMD Family 17h (Zen 2)\u003c/td\u003e\n    \u003ctd\u003eznver2\u003c/td\u003e\n    \u003ctd\u003e9.3\u003c/td\u003e\n    \u003ctd\u003e9.0\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd\u003eAMD Family 19h (Zen 3)\u003c/td\u003e\n    \u003ctd\u003eznver3\u003c/td\u003e\n    \u003ctd\u003e10.3\u003c/td\u003e\n    \u003ctd\u003e12.0\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd\u003eAMD Family 19h (Zen 4)\u003c/td\u003e\n    
\u003ctd\u003eznver4\u003c/td\u003e\n    \u003ctd\u003e13.0\u003c/td\u003e\n    \u003ctd\u003e17.0\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd\u003eAMD Family 19h (Zen 5)\u003c/td\u003e\n    \u003ctd\u003eznver5\u003c/td\u003e\n    \u003ctd\u003e14.1\u003c/td\u003e\n    \u003ctd\u003e19.1\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd\u003eIntel Bonnell family Atom\u003c/td\u003e\n    \u003ctd\u003ebonnell\u003c/td\u003e\n    \u003ctd\u003e9.3\u003c/td\u003e\n    \u003ctd\u003e9.0\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd\u003eIntel Silvermont family Atom\u003c/td\u003e\n    \u003ctd\u003esilvermont\u003c/td\u003e\n    \u003ctd\u003e9.3\u003c/td\u003e\n    \u003ctd\u003e9.0\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd\u003eIntel Goldmont family Atom (Apollo Lake and Denverton)\u003c/td\u003e\n    \u003ctd\u003egoldmont\u003c/td\u003e\n    \u003ctd\u003e9.3\u003c/td\u003e\n    \u003ctd\u003e9.0\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd\u003eIntel Goldmont Plus family Atom (Gemini Lake)\u003c/td\u003e\n    \u003ctd\u003egoldmont-plus\u003c/td\u003e\n    \u003ctd\u003e9.3\u003c/td\u003e\n    \u003ctd\u003e9.0\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd\u003eIntel 1st Gen Core i3/i5/i7-family (Nehalem)\u003c/td\u003e\n    \u003ctd\u003enehalem\u003c/td\u003e\n    \u003ctd\u003e9.3\u003c/td\u003e\n    \u003ctd\u003e9.0\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd\u003eIntel 1.5 Gen Core i3/i5/i7-family (Westmere)\u003c/td\u003e\n    \u003ctd\u003ewestmere\u003c/td\u003e\n    \u003ctd\u003e9.3\u003c/td\u003e\n    \u003ctd\u003e9.0\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd\u003eIntel 2nd Gen Core i3/i5/i7-family (Sandybridge)\u003c/td\u003e\n    \u003ctd\u003esandybridge\u003c/td\u003e\n    \u003ctd\u003e9.3\u003c/td\u003e\n    \u003ctd\u003e9.0\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n 
   \u003ctd\u003eIntel 3rd Gen Core i3/i5/i7-family (Ivybridge)\u003c/td\u003e\n    \u003ctd\u003eivybridge\u003c/td\u003e\n    \u003ctd\u003e9.3\u003c/td\u003e\n    \u003ctd\u003e9.0\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd\u003eIntel 4th Gen Core i3/i5/i7-family (Haswell)\u003c/td\u003e\n    \u003ctd\u003ehaswell\u003c/td\u003e\n    \u003ctd\u003e9.3\u003c/td\u003e\n    \u003ctd\u003e9.0\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd\u003eIntel 5th Gen Core i3/i5/i7-family (Broadwell)\u003c/td\u003e\n    \u003ctd\u003ebroadwell\u003c/td\u003e\n    \u003ctd\u003e9.3\u003c/td\u003e\n    \u003ctd\u003e9.0\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd\u003eIntel 6th Gen Core i3/i5/i7-family (Skylake)\u003c/td\u003e\n    \u003ctd\u003eskylake\u003c/td\u003e\n    \u003ctd\u003e9.3\u003c/td\u003e\n    \u003ctd\u003e9.0\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd\u003eIntel 6th Gen Core i7/i9-family (Skylake X)\u003c/td\u003e\n    \u003ctd\u003eskylake-avx512\u003c/td\u003e\n    \u003ctd\u003e9.3\u003c/td\u003e\n    \u003ctd\u003e9.0\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd\u003eIntel 8th Gen Core i3/i5/i7-family (Cannon Lake)\u003c/td\u003e\n    \u003ctd\u003ecannonlake\u003c/td\u003e\n    \u003ctd\u003e9.3\u003c/td\u003e\n    \u003ctd\u003e9.0\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd\u003eIntel 10th Gen Core i7/i9-family (Ice Lake)\u003c/td\u003e\n    \u003ctd\u003eicelake-client\u003c/td\u003e\n    \u003ctd\u003e9.3\u003c/td\u003e\n    \u003ctd\u003e9.0\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd\u003eIntel Xeon (Cascade Lake)\u003c/td\u003e\n    \u003ctd\u003ecascadelake\u003c/td\u003e\n    \u003ctd\u003e10.2\u003c/td\u003e\n    \u003ctd\u003e10.0\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd\u003eIntel Xeon (Cooper Lake)\u003c/td\u003e\n    \u003ctd\u003ecooperlake\u003c/td\u003e\n    
\u003ctd\u003e10.2\u003c/td\u003e\n    \u003ctd\u003e10.0\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd\u003eIntel 3rd Gen 10nm++ i3/i5/i7/i9-family (Tiger Lake)\u003c/td\u003e\n    \u003ctd\u003etigerlake\u003c/td\u003e\n    \u003ctd\u003e10.2\u003c/td\u003e\n    \u003ctd\u003e10.0\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd\u003eIntel 4th Gen 10nm++ Xeon (Sapphire Rapids)\u003c/td\u003e\n    \u003ctd\u003esapphirerapids\u003c/td\u003e\n    \u003ctd\u003e11.1\u003c/td\u003e\n    \u003ctd\u003e12.0\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd\u003eIntel 11th Gen i3/i5/i7/i9-family (Rocket Lake)\u003c/td\u003e\n    \u003ctd\u003erocketlake\u003c/td\u003e\n    \u003ctd\u003e11.1\u003c/td\u003e\n    \u003ctd\u003e12.0\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd\u003eIntel 12th Gen i3/i5/i7/i9-family (Alder Lake)\u003c/td\u003e\n    \u003ctd\u003ealderlake\u003c/td\u003e\n    \u003ctd\u003e11.1\u003c/td\u003e\n    \u003ctd\u003e12.0\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd\u003eIntel 13th Gen i3/i5/i7/i9-family (Raptor Lake)\u003c/td\u003e\n    \u003ctd\u003eraptorlake\u003c/td\u003e\n    \u003ctd\u003e13.0\u003c/td\u003e\n    \u003ctd\u003e15.0.5\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd\u003eIntel 5th Gen 10nm++ Xeon (Emerald Rapids)\u003c/td\u003e\n    \u003ctd\u003eemeraldrapids\u003c/td\u003e\n    \u003ctd\u003e13.0\u003c/td\u003e\n    \u003ctd\u003e???\u003c/td\u003e\n  \u003c/tr\u003e\n\u003c/table\u003e\n\n### 3. 
Auto-detected micro-architecture levels\n\nThese are also selectable under:\n```\n Processor type and features ---\u003e Processor family\n```\nThese options build the kernel with `-march=native`, which, according to the [GCC manual](https://gcc.gnu.org/onlinedocs/gcc/x86-Options.html#index-x86-Options), \"selects the CPU to generate code for at compilation time by determining the processor type of the compiling machine. Using -march=native enables all instruction subsets supported by the local machine and will produce code optimized for the local machine under the constraints of the selected instruction set.\"\n\nUsers of Intel CPUs should select the 'Intel-Native' option and users of AMD CPUs should select the 'AMD-Native' option.\n\n## Benchmarks\n### Setup\n\nThe benchmark measured the time taken to run `make bzImage` on the Linux kernel source (`.config` generated beforehand by `make x86_64_defconfig`).\n\nThree separate test machines were evaluated:\n1. AMD Ryzen 9 5950X\n2. Intel i7-4790K\n3. Intel N100\n\nSeparate kernels were first compiled from source patched with [more-uarches-for-kernel-6.8-rc4+.patch](https://github.com/graysky2/kernel_compiler_patch/blob/master/more-uarches-for-kernel-6.8-rc4%2B.patch).\n* Kernel 1 used the default menu config option for Processor family = `Generic x86-64`\n* Kernel 2 used the menu config option for Processor family = `x86-64-v3`\n* Kernel 3 used the menu config option for Processor family = `AMD Zen 3`, `Intel Haswell`, or `Intel Alder Lake`\n\n#### The make test\nEach machine was booted into its respective kernel and the make test was conducted.  
Then the next kernel was installed, the machine was booted into it, and the make test was conducted again.\n\n#### The stress-ng benchmark\nThe AMD 5950X ran `stress-ng --taskset 0-1  --metrics-brief -t 30s --foo 2` 12 times, where `foo` was one of `af-alg`, `fork`, `mmap`, or `pipe`, under Kernel 1 and then again under Kernel 3.\n\n## Conclusion\nConsistently across all three test machines, the kernels built with the optimized processor family options introduced by the patch hosted in this repo completed the make test faster than the kernel compiled with the default processor family option. The difference is small (\u003c1%) but statistically significant.\n\nThe stress-ng testing showed small improvements (about 1-3% faster) in three of the four tests and no statistically significant difference in the fourth.\n\nWhat does this mean for real-world usage?  Maybe nothing.  The intent was to see if something easily automatable could show some value in applying these micro-arch tunings.  People have historically gravitated to compilation-based benchmarks; that, coupled with ease of use, is why I settled on one.  If someone has a good kernel-centric benchmark, I am interested to see a controlled comparison.\n\n## Discussion\n1. All the assumptions for ANOVA are met:\n\t* Data are normally distributed\n\t* The population variances are fairly equal\n2. The box plots clearly show significance for both pairwise comparisons against generic x86-64\n\t* Pairwise Tukey-Kramer results are shown for all pairs (see tables)\n\nIn other words, x86-64-v3 is significantly different from generic x86-64. The various subtargets are also significantly different from x86-64.\n\n### The make test\n#### Stats for Machine 1. 
AMD Ryzen 9 5950X\n\u003ctable\u003e\n  \u003ctr\u003e\n    \u003cth\u003eProcessor family option\u003c/th\u003e\n    \u003cth\u003eMean compile time\u003c/th\u003e\n    \u003cth\u003eStd dev\u003c/th\u003e\n    \u003cth\u003e# of replicates\u003c/th\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd\u003eGeneric x86-64\u003c/td\u003e\n    \u003ctd\u003e79.800 sec\u003c/td\u003e\n    \u003ctd\u003e0.1076 sec\u003c/td\u003e\n    \u003ctd\u003e12\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd\u003ex86-64-v3\u003c/td\u003e\n    \u003ctd\u003e79.456 sec\u003c/td\u003e\n    \u003ctd\u003e0.0772 sec\u003c/td\u003e\n    \u003ctd\u003e12\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd\u003eAMD Zen 3\u003c/td\u003e\n    \u003ctd\u003e79.440 sec\u003c/td\u003e\n    \u003ctd\u003e0.0912 sec\u003c/td\u003e\n    \u003ctd\u003e12\u003c/td\u003e\n  \u003c/tr\u003e\n\u003c/table\u003e\n\n![5950X](https://github.com/graysky2/kernel_compiler_patch/blob/master/benchmark/boxplot1.svg)\n\n\u003ctable\u003e\n  \u003ctr\u003e\n    \u003cth\u003eTreatment pairs\u003c/th\u003e\n    \u003cth\u003eTukey HSD Q stat\u003c/th\u003e\n    \u003cth\u003eTukey HSD p-value\u003c/th\u003e\n    \u003cth\u003eTukey HSD inference\u003c/th\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd\u003eGeneric x86-64 vs x86-64-v3\u003c/td\u003e\n    \u003ctd\u003e12.8771\u003c/td\u003e\n    \u003ctd\u003e0.0010053\u003c/td\u003e\n    \u003ctd\u003e$${\\color{green} \\verb|**|p\u003c0.01}$$\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd\u003eGeneric x86-64 vs AMD Zen 3\u003c/td\u003e\n    \u003ctd\u003e13.4675\u003c/td\u003e\n    \u003ctd\u003e0.0010053\u003c/td\u003e\n    \u003ctd\u003e$${\\color{green} \\verb|**|p\u003c0.01}$$\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd\u003ex86-64-v3 vs AMD Zen 3\u003c/td\u003e\n    \u003ctd\u003e9.6524\u003c/td\u003e\n    \u003ctd\u003e0.8999947\u003c/td\u003e\n    
\u003ctd\u003e$${\\color{red}insignificant}$$\u003c/td\u003e\n  \u003c/tr\u003e\n\u003c/table\u003e\n\n#### Stats for Machine 2. Intel i7-4790K\n\u003ctable\u003e\n  \u003ctr\u003e\n    \u003cth\u003eProcessor family option\u003c/th\u003e\n    \u003cth\u003eMean compile time\u003c/th\u003e\n    \u003cth\u003eStd dev\u003c/th\u003e\n    \u003cth\u003e# of replicates\u003c/th\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd\u003eGeneric x86-64\u003c/td\u003e\n    \u003ctd\u003e344.280 sec\u003c/td\u003e\n    \u003ctd\u003e0.6455 sec\u003c/td\u003e\n    \u003ctd\u003e12\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd\u003ex86-64-v3\u003c/td\u003e\n    \u003ctd\u003e342.035 sec\u003c/td\u003e\n    \u003ctd\u003e0.4971 sec\u003c/td\u003e\n    \u003ctd\u003e12\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd\u003eIntel Haswell\u003c/td\u003e\n    \u003ctd\u003e342.189 sec\u003c/td\u003e\n    \u003ctd\u003e0.2415 sec\u003c/td\u003e\n    \u003ctd\u003e12\u003c/td\u003e\n  \u003c/tr\u003e\n\u003c/table\u003e\n\n![i7-4790k](https://github.com/graysky2/kernel_compiler_patch/blob/master/benchmark/boxplot2.svg)\n\n\u003ctable\u003e\n  \u003ctr\u003e\n    \u003cth\u003eTreatment pairs\u003c/th\u003e\n    \u003cth\u003eTukey HSD Q stat\u003c/th\u003e\n    \u003cth\u003eTukey HSD p-value\u003c/th\u003e\n    \u003cth\u003eTukey HSD inference\u003c/th\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd\u003eGeneric x86-64 vs x86-64-v3\u003c/td\u003e\n    \u003ctd\u003e28.9652\u003c/td\u003e\n    \u003ctd\u003e0.0010053\u003c/td\u003e\n    \u003ctd\u003e$${\\color{green} \\verb|**|p\u003c0.01}$$\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd\u003eGeneric x86-64 vs Intel Haswell\u003c/td\u003e\n    \u003ctd\u003e24.8335\u003c/td\u003e\n    \u003ctd\u003e0.0010053\u003c/td\u003e\n    \u003ctd\u003e$${\\color{green} \\verb|**|p\u003c0.01}$$\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd\u003ex86-64-v3 
vs Intel Haswell\u003c/td\u003e\n    \u003ctd\u003e4.1317\u003c/td\u003e\n    \u003ctd\u003e0.0167155\u003c/td\u003e\n    \u003ctd\u003e$${\\color{lightgreen} \\verb|*|p\u003c0.05}$$\u003c/td\u003e\n  \u003c/tr\u003e\n\u003c/table\u003e\n\n#### Stats for Machine 3. Intel N100\n\u003ctable\u003e\n  \u003ctr\u003e\n    \u003cth\u003eProcessor family option\u003c/th\u003e\n    \u003cth\u003eMean compile time\u003c/th\u003e\n    \u003cth\u003eStd dev\u003c/th\u003e\n    \u003cth\u003e# of replicates\u003c/th\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd\u003eGeneric x86-64\u003c/td\u003e\n    \u003ctd\u003e589.457 sec\u003c/td\u003e\n    \u003ctd\u003e0.1596 sec\u003c/td\u003e\n    \u003ctd\u003e12\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd\u003ex86-64-v3\u003c/td\u003e\n    \u003ctd\u003e589.217 sec\u003c/td\u003e\n    \u003ctd\u003e0.1382 sec\u003c/td\u003e\n    \u003ctd\u003e12\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd\u003eIntel Alder Lake\u003c/td\u003e\n    \u003ctd\u003e588.797 sec\u003c/td\u003e\n    \u003ctd\u003e0.1532 sec\u003c/td\u003e\n    \u003ctd\u003e12\u003c/td\u003e\n  \u003c/tr\u003e\n\u003c/table\u003e\n\n![N100](https://github.com/graysky2/kernel_compiler_patch/blob/master/benchmark/boxplot3.svg)\n\n\u003ctable\u003e\n  \u003ctr\u003e\n    \u003cth\u003eTreatment pairs\u003c/th\u003e\n    \u003cth\u003eTukey HSD Q stat\u003c/th\u003e\n    \u003cth\u003eTukey HSD p-value\u003c/th\u003e\n    \u003cth\u003eTukey HSD inference\u003c/th\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd\u003eGeneric x86-64 vs x86-64-v3\u003c/td\u003e\n    \u003ctd\u003e5.5076\u003c/td\u003e\n    \u003ctd\u003e0.0012818\u003c/td\u003e\n    \u003ctd\u003e$${\\color{green} \\verb|**|p\u003c0.01}$$\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd\u003eGeneric x86-64 vs Intel Alder Lake\u003c/td\u003e\n    \u003ctd\u003e15.1600\u003c/td\u003e\n    \u003ctd\u003e0.0010053\u003c/td\u003e\n    
\u003ctd\u003e$${\\color{green} \\verb|**|p\u003c0.01}$$\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd\u003ex86-64-v3 vs Intel Alder Lake\u003c/td\u003e\n    \u003ctd\u003e9.6524\u003c/td\u003e\n    \u003ctd\u003e0.0010053\u003c/td\u003e\n    \u003ctd\u003e$${\\color{green} \\verb|**|p\u003c0.01}$$\u003c/td\u003e\n  \u003c/tr\u003e\n\u003c/table\u003e\n\n### Comparing GCC to Clang\nThe Ryzen 9 5950X was used to compare kernels built with GCC and Clang, each with `Generic x86-64` and `x86-64-v3`.  The results are consistent for both compilers.\n\n\u003ctable\u003e\n  \u003ctr\u003e\n    \u003cth\u003eProcessor family option\u003c/th\u003e\n    \u003cth\u003eCompiler\u003c/th\u003e\n    \u003cth\u003eMean compile time\u003c/th\u003e\n    \u003cth\u003eStd dev\u003c/th\u003e\n    \u003cth\u003e# of replicates\u003c/th\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd\u003eGeneric x86-64\u003c/td\u003e\n    \u003ctd\u003eGCC\u003c/td\u003e\n    \u003ctd\u003e79.4569 sec\u003c/td\u003e\n    \u003ctd\u003e0.0664 sec\u003c/td\u003e\n    \u003ctd\u003e5\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd\u003ex86-64-v3\u003c/td\u003e\n    \u003ctd\u003eGCC\u003c/td\u003e\n    \u003ctd\u003e79.1403 sec\u003c/td\u003e\n    \u003ctd\u003e0.0580 sec\u003c/td\u003e\n    \u003ctd\u003e5\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd\u003eGeneric x86-64\u003c/td\u003e\n    \u003ctd\u003eClang\u003c/td\u003e\n    \u003ctd\u003e79.8398 sec\u003c/td\u003e\n    \u003ctd\u003e0.0629 sec\u003c/td\u003e\n    \u003ctd\u003e5\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd\u003ex86-64-v3\u003c/td\u003e\n    \u003ctd\u003eClang\u003c/td\u003e\n    \u003ctd\u003e79.0975 sec\u003c/td\u003e\n    \u003ctd\u003e0.0711 sec\u003c/td\u003e\n    \u003ctd\u003e5\u003c/td\u003e\n  \u003c/tr\u003e\n\u003c/table\u003e\n\n![5950X](https://github.com/graysky2/kernel_compiler_patch/blob/master/benchmark/boxplot4.svg)\n\n### The 
stress-ng benchmarks\nThe stress-ng microbenchmark changes (improvements, regressions, or neutral results) were as follows, averaged over 12 runs of 30 seconds each:\n```\naf-alg: +2.7% (kernel AF_ALG crypto)\nfork:     *   (process fork/exit)\nmmap:   +1.6% (memory mapping)\npipe:   +1.3% (pipe + context switch)\n\n*no statistically significant difference at p\u003c0.05\n```\n| units | benchmark | optimization | mean | std dev |\n|-|-|-|-|-|\n|bogo ops/s (real time)|af-alg|x86-64|104,320.21|168.61|\n|||x86-64-v3|107,154.54|127.73|\n||pipe|x86-64|1,535,225.4|3,624.5|\n|||x86-64-v3|1,555,824.2|4,212.6|\n||fork|x86-64|3,964.14|21.02|\n|||x86-64-v3|3,953.5|17.44|\n||mmap|x86-64|35.72|0.28|\n|||x86-64-v3|36.31|0.26|\n\n![af-alg](https://github.com/graysky2/kernel_compiler_patch/blob/master/benchmark/af-alg.svg)\n\n![fork](https://github.com/graysky2/kernel_compiler_patch/blob/master/benchmark/fork.svg)\n\n![mmap](https://github.com/graysky2/kernel_compiler_patch/blob/master/benchmark/mmap.svg)\n\n![pipe](https://github.com/graysky2/kernel_compiler_patch/blob/master/benchmark/pipe.svg)\n\n\n## Software versions used\n\nAll machines ran Arch Linux with all stock repo packages, with the exception of the kernel (see below).  
At the time of testing, the following toolchain versions were used:\n* binutils 2.43+r4+g7999dae6961-1\n* clang 18.0.1-1\n* gcc 14.2.1+r134+gab884fffe3fc-1\n* gcc-libs 14.2.1+r134+gab884fffe3fc-1\n* glibc 2.40+r16+gaa533d58ff-2\n* linux-api-headers 6.10-1\n* stress-ng 0.18.04-1\n\nThe kernel packages were built from the official Arch Linux PKGBUILD for kernel version 6.10.10-arch1-1, using the distro config and differing only by the modifications introduced by the aforementioned patch from this repo.\n\nThe benchmark compiled the vanilla Linux kernel version 6.10.10 and, as mentioned above, the `.config` used was generated by running `make x86_64_defconfig`.\n\n## References\n* Script to run the benchmark: [make_bench.sh](https://github.com/graysky2/kernel_compiler_patch/blob/master/benchmark/make_bench.sh)\n* Data for three machines: [results.csv](https://github.com/graysky2/kernel_compiler_patch/blob/master/benchmark/results.csv)\n* Data for GCC vs Clang: [results2.csv](https://github.com/graysky2/kernel_compiler_patch/blob/master/benchmark/results2.csv)\n* Data for stress-ng tests: [stress-ng-data.csv](https://github.com/graysky2/kernel_compiler_patch/blob/master/benchmark/stress-ng-data.csv)\n\n\n\n## Credit\n* Original author: jeroen AT linuxforge DOT net\n* Link to original version: http://www.linuxforge.net/docs/linux/linux-gcc.php\n* Box plots generated with [statisty.app](https://statisty.app/anova-calculator)\n* ANOVA stats generated with [astatsa.com](https://astatsa.com/OneWay_Anova_with_TukeyHSD/)\n\n## Legacy support\nFind support for older versions of the Linux kernel and of GCC in the outdated_versions directory.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgraysky2%2Fkernel_compiler_patch","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fgraysky2%2Fkernel_compiler_patch","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgraysky2%2Fkernel_compiler_patch/lists"}