https://github.com/meithecatte/mips-quine
MIPS Quine that disassembles its own code and parses its own ELF symbol table
- Host: GitHub
- URL: https://github.com/meithecatte/mips-quine
- Owner: meithecatte
- License: mit
- Created: 2023-06-08T22:17:48.000Z (over 2 years ago)
- Default Branch: master
- Last Pushed: 2023-06-08T22:20:09.000Z (over 2 years ago)
- Last Synced: 2025-01-30T19:49:13.172Z (about 1 year ago)
- Language: Assembly
- Size: 15.6 KB
- Stars: 2
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# MIPS quine
A quine is a program that prints its own source code. This one does so
by disassembling itself.
```
$ mips-linux-gnu-gcc -static -fno-pic -mno-abicalls -nostdlib quine.S -o quine
$ qemu-mips ./quine | head
.global __start
.set noreorder
__start:
la $a1, s0
jal zstr
addu $0, $0, $0
la $t0, __executable_start
lw $t1, 32($t0)
$ diff <(qemu-mips ./quine) quine.S
$
```
## why?
It [seemed more enticing than going to sleep][nerdsnipe].
## how do you handle labels?
It parses its own ELF headers and walks its own symbol table.
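
For the curious, here is roughly what such a symbol-table walk looks like in Python rather than MIPS assembly. It sketches the general ELF32 technique and is not a transcription of `quine.S`: the function name `read_symbols` and its return shape are made up for illustration, and it assumes a big-endian 32-bit image that still has its `.symtab` section. The one detail visible in the output above is `lw $t1, 32($t0)`, which matches the offset of `e_shoff` in the ELF32 header.

```python
import struct

SHT_SYMTAB = 2  # section type of the static symbol table


def read_symbols(image: bytes) -> dict[int, str]:
    """Map symbol addresses to names for a big-endian ELF32 image (sketch)."""
    if image[:4] != b"\x7fELF":
        raise ValueError("not an ELF image")
    # e_shoff sits at byte offset 32 of the ELF32 header;
    # e_shentsize and e_shnum sit at offsets 46 and 48.
    (e_shoff,) = struct.unpack_from(">I", image, 32)
    e_shentsize, e_shnum = struct.unpack_from(">HH", image, 46)

    def section(index):
        # An Elf32_Shdr is ten 32-bit words.
        return struct.unpack_from(">10I", image, e_shoff + index * e_shentsize)

    symbols = {}
    for i in range(e_shnum):
        _, sh_type, _, _, sh_offset, sh_size, sh_link, *_ = section(i)
        if sh_type != SHT_SYMTAB:
            continue
        # sh_link of a symbol table names its string table section.
        str_offset = section(sh_link)[4]
        # An Elf32_Sym is 16 bytes; st_name and st_value come first.
        for pos in range(sh_offset, sh_offset + sh_size, 16):
            st_name, st_value = struct.unpack_from(">II", image, pos)
            start = str_offset + st_name
            end = image.index(b"\0", start)
            name = image[start:end].decode("ascii")
            if name:
                symbols[st_value] = name
    return symbols
```

An address-to-name map like this is the kind of thing that lets a disassembler print `jal zstr` instead of a raw jump target.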
## what's with the cryptic label names?
If the binary goes over 4 KiB, the symbol table lands in a part of the file that
doesn't get mapped. The easiest way to work around this was to golf it down.
It was even a fun challenge to make it fit. I could probably open the executable
and map it into memory, but that seems really finicky.
## Trivia
- The code assumes that a `lui` instruction is always part of the assembler's
expansion of `la` and blindly reads out the literal field of the following
`addiu` without checking whether it's actually an `addiu`. This is fine,
as we only need to handle the situations that occur in our code.
- The disassembler is table driven, with each entry in the table consisting
of a single byte, followed by the opcode name as a null-terminated string.
The single byte holds 6 bits to identify the opcode, and 2 bits to describe
how the operands should get decoded.
This became a problem when the `jr` instruction called for a 5th operand mode
in the R-encoding table. Instead of handling this properly, I just made
the instruction always decode as `jr $ra`, by having `"jr $ra"` be the name
of the opcode. It's not like anything else occurs in the quine's code.
- To avoid writing a separate routine for printing out the tables that drive
the disassembler, I encode the opcode bytes using octal escapes. It's hideous,
but it gets the job done.
- MIPS register names are mere suggestions. "Caller-saved registers"? That sounds
like a global variable to me. "Function arguments are passed through `$a0-$a3`"?
Sometimes you just don't feel like it and use `$t0`. Sometimes the only argument
goes through `$a1`, skipping `$a0` entirely. I plead code golf.
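
Two of the tricks above, sketched in Python rather than MIPS assembly. Treat this as an illustration under assumptions, not as the quine's actual code: `la_target` and `parse_table` are hypothetical names, the bit layout of the table byte (low 6 bits for the opcode, top 2 bits for the operand mode) is a guess, and the sample entries and mode numbers are invented. What is standard is the `la` arithmetic: the `lui` immediate fills the upper half, and the `addiu` immediate is sign-extended before being added.

```python
def la_target(lui_word: int, next_word: int) -> int:
    """Rebuild the address loaded by a lui/addiu pair.

    Like the quine, this does not check that next_word is really an addiu.
    """
    hi = lui_word & 0xFFFF
    lo = next_word & 0xFFFF
    if lo & 0x8000:  # addiu sign-extends its 16-bit immediate
        lo -= 0x10000
    return ((hi << 16) + lo) & 0xFFFFFFFF


def parse_table(blob: bytes) -> dict[int, tuple[int, str]]:
    """Split a table of (one byte, NUL-terminated name) entries.

    Returns {opcode: (mode, name)}.
    """
    table = {}
    pos = 0
    while pos < len(blob):
        tag = blob[pos]
        end = blob.index(b"\0", pos + 1)
        name = blob[pos + 1:end].decode("ascii")
        # ASSUMPTION: low 6 bits = opcode, top 2 bits = operand mode.
        table[tag & 0x3F] = (tag >> 6, name)
        pos = end + 1
    return table


# Hypothetical entries written with octal escapes, as in the source file.
SAMPLE = b"\017lui\0\111addiu\0"
```

With these definitions, `la_target(0x3C050041, 0x24A58000)` gives `0x00408000`: the low half has its top bit set, so it subtracts from the `lui` half instead of adding to it. `parse_table(SAMPLE)` yields opcode `0x0F` for `lui` and `0x09` for `addiu`.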
## Closing remarks
meow!! ^_^ :3
[nerdsnipe]: https://xkcd.com/356/