{"id":27540063,"url":"https://github.com/leocelente/compiler-exploring","last_synced_at":"2026-04-27T20:31:49.302Z","repository":{"id":80524188,"uuid":"238126204","full_name":"leocelente/compiler-exploring","owner":"leocelente","description":"Basic usage of the Compiler Explorer tool to understand a quirk of simple undefined behavior in GCC ","archived":false,"fork":false,"pushed_at":"2020-02-04T04:56:23.000Z","size":5,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"master","last_synced_at":"2025-04-19T08:27:24.983Z","etag":null,"topics":["assembly","c","compiler-explorer","stack","undefined-behavior"],"latest_commit_sha":null,"homepage":"","language":"C","has_issues":false,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/leocelente.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2020-02-04T04:54:23.000Z","updated_at":"2021-05-23T23:54:48.000Z","dependencies_parsed_at":"2023-03-02T03:01:01.712Z","dependency_job_id":null,"html_url":"https://github.com/leocelente/compiler-exploring","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/leocelente/compiler-exploring","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/leocelente%2Fcompiler-exploring","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/leocelente%2Fcompiler-exploring/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/leocelente%2Fcompiler-exploring/releases","manifests_url":"https://repos.ecosys
te.ms/api/v1/hosts/GitHub/repositories/leocelente%2Fcompiler-exploring/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/leocelente","download_url":"https://codeload.github.com/leocelente/compiler-exploring/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/leocelente%2Fcompiler-exploring/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":32354566,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-27T20:07:02.737Z","status":"ssl_error","status_checked_at":"2026-04-27T20:07:00.910Z","response_time":128,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.5:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["assembly","c","compiler-explorer","stack","undefined-behavior"],"created_at":"2025-04-18T22:27:30.046Z","updated_at":"2026-04-27T20:31:49.288Z","avatar_url":"https://github.com/leocelente.png","language":"C","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Compiler Exploring\n## Introduction\nThis is probably one of the most basic Compiler Explorer investigations that can be made. 
It started when I was browsing through [Embedded Artistry](https://embeddedartistry.com)'s resources [repo](https://github.com/embeddedartistry/embedded-resources/) and found a file called `bad.c`.\n\n## The Setup\nThat file contained a simple example of the undefined behavior caused by using uninitialized variables.\n```c\n#include <stdio.h>\n\nvoid foo(void)\n{\n\tint a = 5;\n\tint b;\n}\n\nvoid bar(void)\n{\n\tint x;\n\tprintf(\"%d\\n\", x++);\n}\n\nint main(void)\n{\n\tfoo();\n\tbar();\n\tbar();\n\n\treturn 0;\n}\n```\n\nLooking at the file you can see that when we call the `bar` function, the stack slot used by `x` still holds the data stored by the `foo` procedure, so the output should look something like this:\n```shell\n$ gcc bad.original.c; ./a.out\n5\n6\n```\n## The Problem\nAs I was playing around with the file, I added a second variable to both functions (`b = 8` in `foo`, `y` in `bar`) and wrote a copy of the `bar` function, `bar2`, that prints the variables without incrementing them.\n```c\nvoid bar2() {\n  int x;\n  int y;\n  printf(\"%d %d\\n\", x, y);\n}\n```\n\nI was surprised to see that the output changed slightly:\n```shell\n$ gcc bad.c; ./a.out\n9 6\n10 7\n7 10\n```\nWhy do we get the values in reversed order when we print from `bar2`? That made me curious, as did the fact that compiling the same program with clang gives the \"expected\" output. 
So I looked at the gcc assembly output in [Compiler Explorer](https://godbolt.org/z/PpM3Lw):\n```asm\nfoo:\n        push    rbp\n        mov     rbp, rsp\n        mov     DWORD PTR [rbp-4], 5\n        mov     DWORD PTR [rbp-8], 8\n        nop\n        pop     rbp\n        ret\n.LC0:\n        .string \"%d %d\\n\"\nbar:\n        push    rbp\n        mov     rbp, rsp\n        sub     rsp, 16\n        add     DWORD PTR [rbp-4], 1\n        add     DWORD PTR [rbp-8], 1\n        mov     edx, DWORD PTR [rbp-4]\n        mov     eax, DWORD PTR [rbp-8]\n        mov     esi, eax\n        mov     edi, OFFSET FLAT:.LC0\n        mov     eax, 0\n        call    printf\n        nop\n        leave\n        ret\nbar2:\n        push    rbp\n        mov     rbp, rsp\n        sub     rsp, 16\n        mov     edx, DWORD PTR [rbp-8]\n        mov     eax, DWORD PTR [rbp-4]\n        mov     esi, eax\n        mov     edi, OFFSET FLAT:.LC0\n        mov     eax, 0\n        call    printf\n        nop\n        leave\n        ret\nmain:\n        push    rbp\n        mov     rbp, rsp\n        call    foo\n        call    bar\n        call    bar\n        mov     eax, 0\n        call    bar2\n        mov     eax, 0\n        pop     rbp\n        ret\n\n```\nThe first step is to look at the `call printf`. The 64-bit Linux (System V AMD64) calling convention passes integer arguments in registers in the following order: `rdi, rsi, rdx, rcx, r8, r9`. So `edi` holds the format string, `esi` the first value to print and `edx` the second.\n\nThe main difference between `bar` and `bar2` is which stack slot feeds which register: `bar` passes `[rbp-8]` in `esi` and `[rbp-4]` in `edx`, while `bar2` passes `[rbp-4]` in `esi` and `[rbp-8]` in `edx`. The order of the registers is maintained, so in the second function the stack slots are read in reverse order. When we look at the clang output, we see that the order is the same in both functions. From that we conclude that simply incrementing the variables, instead of immediately passing them as arguments, causes gcc to swap the order in which the stack slots are read. 
\n\n## Conclusion\nIt is really interesting that this supposedly inconsequential change can cause such behavior in the code. As a bonus quirk of gcc, we can see that the `eax` register is cleared before the call to `bar2`, but when we just call `bar` three times there is no clearing, while clang does the clearing independently of the order.\n\nClang output in [Compiler Explorer](https://godbolt.org/z/S-8jNP):\n```asm\nfoo:                                    # @foo\n        push    rbp\n        mov     rbp, rsp\n        mov     dword ptr [rbp - 4], 5\n        mov     dword ptr [rbp - 8], 8\n        pop     rbp\n        ret\nbar:                                    # @bar\n        push    rbp\n        mov     rbp, rsp\n        sub     rsp, 16\n        mov     eax, dword ptr [rbp - 4]\n        add     eax, 1\n        mov     dword ptr [rbp - 4], eax\n        mov     ecx, dword ptr [rbp - 8]\n        add     ecx, 1\n        mov     dword ptr [rbp - 8], ecx\n        movabs  rdi, offset .L.str\n        mov     esi, eax\n        mov     edx, ecx\n        mov     al, 0\n        call    printf\n        add     rsp, 16\n        pop     rbp\n        ret\nbar2:                                   # @bar2\n        push    rbp\n        mov     rbp, rsp\n        sub     rsp, 16\n        mov     esi, dword ptr [rbp - 4]\n        mov     edx, dword ptr [rbp - 8]\n        movabs  rdi, offset .L.str\n        mov     al, 0\n        call    printf\n        add     rsp, 16\n        pop     rbp\n        ret\nmain:                                   # @main\n        push    rbp\n        mov     rbp, rsp\n        sub     rsp, 16\n        mov     dword ptr [rbp - 4], 0\n        call    foo\n        call    bar\n        call    bar\n        call    bar2\n        xor     eax, eax\n        add     rsp, 16\n        pop     rbp\n        ret\n.L.str:\n        .asciz  \"%d %d\\n\"\n\n```\n\nLet me end by reminding everyone that the use of uninitialized variables is clearly undefined behavior and the compiler may do 
with it whatever it sees fit. Such a clear violation of basic clean code conventions should easily be reported by a static analysis tool such as cppcheck. Also note that simply turning on optimizations with `-O1` removed all of this behavior, because the `foo` function was optimized away as it has no observable effect.","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fleocelente%2Fcompiler-exploring","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fleocelente%2Fcompiler-exploring","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fleocelente%2Fcompiler-exploring/lists"}