https://github.com/cgsamp/c-disassembly
Simple C program assembled on Debian and Macos, with deep dive into Assembly code
assembly-language c debian decompiling docker gcc macos
Last synced: about 1 month ago
- Host: GitHub
- URL: https://github.com/cgsamp/c-disassembly
- Owner: cgsamp
- Created: 2024-10-16T16:09:25.000Z (3 months ago)
- Default Branch: main
- Last Pushed: 2024-10-16T17:05:38.000Z (3 months ago)
- Last Synced: 2024-10-26T22:10:50.911Z (3 months ago)
- Topics: assembly-language, c, debian, decompiling, docker, gcc, macos
- Language: Assembly
- Homepage:
- Size: 25.4 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
# Diving into C and Assembly on macOS and Debian
I was curious about how C compiles into Assembly and wanted to do some basic investigation.
## The C Program
[This program](factorial.c) was compiled on macOS with **gcc**, and on Debian via a docker container. On macOS, *gcc* is a wrapper around **clang**, so the output differs in places.
```
#include <stdio.h>

int main() {
    printf("Hello from Factorial!\n");
    int number = 10;
    int total = 1;
    for (int i=0; i < number; i++) {
        total = total * (i + 1);
    }
    printf("%d factorial is %d\n",number,total);
    return 0;
}
```
The program just uses a for loop to calculate 10 factorial and print it out.
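
Because `total` is an `int`, the same loop only works up to 12 factorial where `int` is 32 bits wide; 13 factorial no longer fits. As a minimal sketch, assuming a 32-bit `int`, the loop could be wrapped in a function that refuses to overflow. The name `factorial_checked` is hypothetical and is not part of the repository.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper (not in the repository): same loop shape as
 * factorial.c, but it reports failure instead of overflowing. */
bool factorial_checked(int number, int *result) {
    if (number < 0 || result == NULL) {
        return false;
    }
    int total = 1;
    for (int i = 0; i < number; i++) {
        int factor = i + 1;
        /* If total * factor would exceed INT_MAX, stop before multiplying. */
        if (total > INT_MAX / factor) {
            return false;
        }
        total = total * factor;
    }
    *result = total;
    return true;
}
```

With a 32-bit `int`, `factorial_checked(10, &out)` succeeds and stores 3628800, while `factorial_checked(13, &out)` returns `false`.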
## macOS
### Compiling
I can easily use
```
gcc factorial.c -o factorial_macos
```
to create an executable binary.

### Executing
Simple as
```
➜ ./factorial_macos
Hello from Factorial!
10 factorial is 3628800
```

### Assembling
I can also create the assembly with
```
gcc -S factorial.c -o factorial_macos.s
```
[factorial_macos.s](factorial_macos.s)

### Disassembling
I can also create a rough listing file by disassembling the binary with
```
objdump -d -h factorial_macos > factorial_macos.lst
```
[factorial_macos.lst](factorial_macos.lst)

## Debian
**gcc** on Debian is the real GNU compiler rather than a wrapper, and it can produce a more interesting listing file. To use it, I run a docker container that operates on the files in my local project directory.
### Docker
I have Docker Desktop installed and started, so the docker daemon is running.
```
➜ docker --version
Docker version 20.10.10, build b485636
```
I am using the gcc image from Docker Hub: [https://hub.docker.com/_/gcc](https://hub.docker.com/_/gcc). I can enter bash within the container, mounting my project directory at `/code`, with
```
docker run -it -v /Users/csamp/projects/see:/code gcc:latest bash
```
or execute commands and exit with
```
docker run --rm -t -v /Users/csamp/projects/see:/code -w /code gcc:latest [command with arguments]
```
### Compiling
I can easily use
```
docker run --rm -t -v /Users/csamp/projects/see:/code -w /code gcc:latest gcc factorial.c -o factorial_debian
```
to create an executable binary.

### Executing
Debian executables use the ELF format, while macOS executables use Mach-O, so the Debian binary has to be run inside the container.
```
➜ docker run --rm -t -v /Users/csamp/projects/see:/code -w /code gcc:latest ./factorial_debian
Hello from Factorial!
10 factorial is 3628800
```

### Assembling
Here I again use **gcc** to create the assembly, but on Debian it does not wrap another tool.
```
docker run --rm -v /Users/csamp/projects/see:/code -w /code gcc:latest gcc -S factorial.c -o factorial_debian.s
```
[factorial_debian.s](factorial_debian.s)

### Object file
While I could let **gcc** compile and link the .c file into an executable in one step, as on macOS, I can also take the manual step of using **as** to assemble the object file. This .o file contains machine code and can be viewed with a hex editor.
```
docker run --rm -v /Users/csamp/projects/see:/code -w /code gcc:latest as -o factorial_debian.o factorial_debian.s
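```

### Linking
Linking takes the object file, combines it with libraries on the target platform, such as the C library that provides *printf*, and produces an executable binary. Because **ld** is invoked directly, without the startup files gcc normally adds, the program has to provide its own `_start` entry point.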
```
docker run --rm -v /Users/csamp/projects/see:/code -w /code gcc:latest ld -o factorial_debian factorial_debian.o /lib/x86_64-linux-gnu/libc.so.6 -dynamic-linker /lib64/ld-linux-x86-64.so.2
```

### Disassembling / Listing
While I can use **objdump** to disassemble the binary, I can get a better listing out of gcc on Debian, with the C source interleaved with the generated assembly:
```
docker run --rm -t -v /Users/csamp/projects/see:/code -w /code gcc:latest gcc -g -Wa,-adhln -o factorial_debian factorial.c > factorial_debian.lst
```
[factorial_debian.lst](factorial_debian.lst)

## The Assembly
Here I have removed some assembler directives and commented on each assembler instruction. See also [factorial_debian.s](factorial_debian.s) for this content.
```
    .section .rodata # Program section for read-only data
.LC0: # Storing a null-terminated string at memory location .LC0
    .string "Hello from Factorial!" # No trailing \n: gcc swapped the first printf for puts, which appends the newline
.LC1: # Storing a null-terminated string at memory location .LC1
    .string "%d factorial is %d\n"

    .text # Beginning the text of the program instructions
    .globl _start # Declares the symbol _start as externally accessible
_start: # Labels this code address as _start, which is the entrypoint
    pushq %rbp # Push the quadword (64-bit, 8-byte) current stack base pointer onto the stack.
    movq %rsp, %rbp # Move the current stack pointer into the base pointer register
    subq $16, %rsp # Subtract 16 bytes from the current stack pointer to make room for the three 4-byte local variables, plus padding
    movl $.LC0, %edi # Put the address of .LC0 into the edi register
    call puts # Call function puts, put-string, that puts the string pointed at by the edi register to stdout

    movl $10, -12(%rbp) # Put the value 10 into the memory location starting at 12 bytes below the stack base pointer (number)
    movl $1, -4(%rbp) # Put the value 1 into the memory location 4 bytes below the stack base pointer (total)
    movl $0, -8(%rbp) # Put the value 0 into the memory location 8 bytes below the stack base pointer (i)
    jmp .L2 # Move the instruction pointer to the memory location symbolized by .L2

.L3: # Loop contents
    movl -8(%rbp), %eax # Move the 4-byte long integer at the memory location 8 bytes below the base pointer into register eax
    leal 1(%eax), %edx # Uses the address arithmetic instruction LEA, usually used to compute memory addresses, to put eax + 1 into edx. Memory addresses are just integers, so this works.
    movl -4(%rbp), %eax # Move the long value stored 4 bytes below the stack base pointer into the eax register
    imull %edx, %eax # Multiplies the signed long value in edx by the signed long value in eax and stores the result in eax.
    movl %eax, -4(%rbp) # Moves the value in eax to the location four bytes below the base pointer.
    addl $1, -8(%rbp) # Add one to the long value 8 bytes below the base pointer

.L2: # Loop test
    movl -8(%rbp), %eax # Moves the long value 8 bytes below the base pointer into eax
    cmpl -12(%rbp), %eax # Compares the value in eax to the long value 12 bytes below the base pointer
    jl .L3 # Jumps if less than; looks at the sign flag and overflow flag. Jumps to the loop contents.

    movl -4(%rbp), %edx # Puts the long 4 below base pointer into edx
    movl -12(%rbp), %eax # Puts the long value 12 below base pointer into eax
    movl %eax, %esi # Puts the value of eax into esi
    movl $.LC1, %edi # Puts the address symbolized by .LC1 into edi
    movl $0, %eax # Puts 0 into eax; for a variadic call, al holds the number of vector registers used
    call printf # Calls printf, which uses edi, esi and edx to do string substitution and print to stdout

    movl $60, %eax # Syscall number for exit (60 on x86-64 Linux)
    xorq %rdi, %rdi # Exit code 0
    syscall # Invoke system call to exit
```
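
To make the jump structure above easier to follow, here is a rough C sketch of the same loop written with `goto`, so that each statement lines up with one group of instructions. It is an illustration, not compiler output. `factorial_lowered` and the local `edx` are hypothetical names, and the labels `L2` and `L3` borrow the assembler's `.L2` and `.L3`.

```c
/* Hypothetical helper: the for loop from factorial.c rewritten in the
 * shape the assembly uses, with the loop test placed after the body. */
int factorial_lowered(int number) {
    int total = 1;          /* movl $1, -4(%rbp) */
    int i = 0;              /* movl $0, -8(%rbp) */
    int edx;                /* stands in for the edx register */

    goto L2;                /* jmp .L2: run the test before the first pass */
L3:                         /* loop contents */
    edx = i + 1;            /* leal 1(%eax), %edx */
    total = total * edx;    /* imull %edx, %eax; movl %eax, -4(%rbp) */
    i = i + 1;              /* addl $1, -8(%rbp) */
L2:                         /* loop test */
    if (i < number) {       /* cmpl -12(%rbp), %eax */
        goto L3;            /* jl .L3 */
    }
    return total;
}
```

Calling `factorial_lowered(10)` returns 3628800, matching the program's output. Putting the test at the bottom means each iteration needs only one conditional jump.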
## Wrap up
I plan to do this on my M1 Mac and see what the ARM assembly looks like!