https://github.com/ariful305/c-to-assembly-code-generator-compiler-project
A tiny C→pseudo-assembly compiler: Flex-based lexer + recursive-descent parser that emits a simple MOV/ADD/SUB/MUL/DIV IR and runs peephole optimizations (identity removal, constant folding, copy-prop, NOP compaction). Includes sample input.c and Windows binary. MIT licensed.
assembly c c-to-assembly compiler-design project
- Host: GitHub
- URL: https://github.com/ariful305/c-to-assembly-code-generator-compiler-project
- Owner: ariful305
- License: mit
- Created: 2025-08-18T16:19:33.000Z (10 months ago)
- Default Branch: main
- Last Pushed: 2025-08-18T16:29:29.000Z (10 months ago)
- Last Synced: 2025-08-18T18:26:49.759Z (10 months ago)
- Topics: assembly, c, c-to-assembly, compiler-design, project
- Language: C
- Homepage:
- Size: 60.5 KB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# C-to-Assembly Code Generator (Toy Compiler)
A tiny educational compiler that reads a **C-like subset** from `input.c`, parses expressions/statements, generates a **simple pseudo-assembly IR** (instructions like `MOV/ADD/SUB/MUL/DIV`), runs a few **peephole optimizations**, and writes two outputs:
- `output_before.c` – raw IR (with `; NOP` lines)
- `output_after.c` – optimized IR (NOPs removed)
> This project is great for learning **lexing**, **recursive-descent parsing**, and **local code optimizations**.
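
To make the parse-then-emit step concrete, here is a minimal sketch of a recursive-descent expression parser that emits `MOV/ADD/SUB/MUL/DIV` lines. It is not taken from `main.c`: the `Gen` struct, the `t0, t1, ...` temporary names, the `OP dst, src` text format, and the hand-written character scanner (used here in place of the Flex lexer) are all assumptions made for illustration.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical mini code generator; none of these names come from main.c. */
typedef struct {
    const char *p;      /* cursor into the source expression */
    int next_temp;      /* counter for fresh temporaries t0, t1, ... */
    char out[512];      /* emitted pseudo-assembly text */
} Gen;

/* Appends one "OP dst, src" line to the output buffer. */
static void emit(Gen *g, const char *op, const char *dst, const char *src) {
    size_t used = strlen(g->out);
    assert(used < sizeof g->out);
    snprintf(g->out + used, sizeof g->out - used, "%s %s, %s\n", op, dst, src);
}

static void skip_ws(Gen *g) { while (isspace((unsigned char)*g->p)) g->p++; }

static void parse_expr(Gen *g, char *res);

/* factor := NUMBER | IDENT | '(' expr ')' ; writes the operand name into res */
static void parse_factor(Gen *g, char *res) {
    skip_ws(g);
    if (*g->p == '(') {
        g->p++;
        parse_expr(g, res);
        skip_ws(g);
        if (*g->p == ')') g->p++;
        return;
    }
    int n = 0;
    while (isalnum((unsigned char)*g->p) && n < 15) res[n++] = *g->p++;
    res[n] = '\0';
}

/* One precedence level of left-associative binary operators. */
static void parse_binary(Gen *g, char *res, const char *ops,
                         void (*next)(Gen *, char *)) {
    char lhs[16], rhs[16], tmp[16];
    next(g, lhs);
    for (skip_ws(g); *g->p && strchr(ops, *g->p); skip_ws(g)) {
        char op = *g->p++;
        next(g, rhs);
        snprintf(tmp, sizeof tmp, "t%d", g->next_temp++);
        emit(g, "MOV", tmp, lhs);   /* copy left operand into a fresh temp */
        emit(g, op == '+' ? "ADD" : op == '-' ? "SUB" :
                op == '*' ? "MUL" : "DIV", tmp, rhs);
        strcpy(lhs, tmp);           /* the temp is the new left operand */
    }
    strcpy(res, lhs);
}

/* term := factor (('*' | '/') factor)* ;  expr := term (('+' | '-') term)* */
static void parse_term(Gen *g, char *res) { parse_binary(g, res, "*/", parse_factor); }
static void parse_expr(Gen *g, char *res) { parse_binary(g, res, "+-", parse_term); }

/* Emits code for expr, then stores the result in dst. g must start zeroed. */
void compile_assignment(Gen *g, const char *dst, const char *expr) {
    char res[16];
    g->p = expr;
    parse_expr(g, res);
    emit(g, "MOV", dst, res);
}

/* compile_assignment(&g, "a", "b + 2 * (c - 1)") with Gen g = {0} fills g.out with:
 *   MOV t0, c
 *   SUB t0, 1
 *   MOV t1, 2
 *   MUL t1, t0
 *   MOV t2, b
 *   ADD t2, t1
 *   MOV a, t2
 */
```

Each grammar rule maps to one function, and operator precedence falls out of the call order (`parse_expr` calls `parse_term`, which calls `parse_factor`). The sketch does no error reporting and copies every left operand into a fresh temporary, which is the kind of redundant output a later cleanup pass can tidy.
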
---
## ✨ Features
- **Lexer (Flex)**: tokenizes keywords, identifiers, numbers, operators.
- **Parser (recursive-descent)**:
  - Expressions with `+ - * /` and parentheses
  - Simple relational forms parsed (placeholders only)
  - Statements: `int` declarations, assignments, `if/else`, `for`, `return`, `{ ... }` blocks
- **IR emission**: instructions `MOV/ADD/SUB/MUL/DIV` + `OP_TEXT` comments
- **Optimizations (peephole)**
  1. Identity cleanup (e.g., `x+0`, `x*1`, `x/1` → NOP; `x*0` → `MOV x,0`)
  2. Local constant folding (`MOV d,C; ADD d,k; …` → `MOV d, C'`)
  3. Adjacent copy-prop (`MOV t,X; MOV A,t` → `MOV A,X`)
  4. NOP compaction (physically removes NOPs)
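
The four peephole passes listed above can be sketched roughly as follows. This is an illustrative reconstruction, not the code in `main.c`: the `Instr` layout, the function names, and the convention that temporaries are named `t0, t1, ...` and are dead after a single use are all assumptions. The copy-propagation pass here also treats two `MOV`s as adjacent when only NOPs sit between them, which may differ from the project's behavior.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical IR layout; the names are illustrative, not taken from main.c. */
typedef enum { OP_NOP, OP_MOV, OP_ADD, OP_SUB, OP_MUL, OP_DIV } Op;
typedef struct { Op op; char dst[16]; char src[16]; } Instr;

/* Returns 1 and stores the value if s is a decimal integer literal. */
static int is_const(const char *s, long *out) {
    char *end;
    if (*s == '\0') return 0;
    *out = strtol(s, &end, 10);
    return *end == '\0';
}

/* Assumed convention: temporaries are named t0, t1, ... and are read only once. */
static int is_temp(const char *s) {
    return s[0] == 't' && isdigit((unsigned char)s[1]);
}

/* 1. Identity cleanup: x+0, x*1, x/1 -> NOP; x*0 -> MOV x,0. */
static void identity_cleanup(Instr *c, int n) {
    for (int i = 0; i < n; i++) {
        long k;
        if (!is_const(c[i].src, &k)) continue;
        if ((c[i].op == OP_ADD && k == 0) ||
            ((c[i].op == OP_MUL || c[i].op == OP_DIV) && k == 1))
            c[i].op = OP_NOP;
        else if (c[i].op == OP_MUL && k == 0)
            c[i].op = OP_MOV;                 /* src already holds the zero */
    }
}

/* 2. Local constant folding: MOV d,C; ADD d,k; ... -> MOV d,C'. */
static void fold_constants(Instr *c, int n) {
    for (int i = 0; i < n; i++) {
        long acc, k;
        if (c[i].op != OP_MOV || !is_const(c[i].src, &acc)) continue;
        for (int j = i + 1; j < n; j++) {
            if (c[j].op == OP_NOP) continue;  /* skip holes left by pass 1 */
            if (c[j].op == OP_MOV || strcmp(c[j].dst, c[i].dst) != 0 ||
                !is_const(c[j].src, &k) || (c[j].op == OP_DIV && k == 0))
                break;                        /* run of foldable ops ended */
            switch (c[j].op) {
                case OP_ADD: acc += k; break;
                case OP_SUB: acc -= k; break;
                case OP_MUL: acc *= k; break;
                default:     acc /= k; break; /* OP_DIV with k != 0 */
            }
            c[j].op = OP_NOP;
        }
        snprintf(c[i].src, sizeof c[i].src, "%ld", acc);
    }
}

/* 3. Adjacent copy propagation: MOV t,X; MOV A,t -> MOV A,X. */
static void copy_propagate(Instr *c, int n) {
    for (int i = 0; i < n; i++) {
        int j = i + 1;
        while (j < n && c[j].op == OP_NOP) j++;   /* neighbour, ignoring NOPs */
        if (j < n && c[i].op == OP_MOV && c[j].op == OP_MOV &&
            is_temp(c[i].dst) && strcmp(c[j].src, c[i].dst) == 0) {
            strcpy(c[j].src, c[i].src);
            c[i].op = OP_NOP;
        }
    }
}

/* 4. NOP compaction: slide live instructions left; returns the new length. */
static int compact_nops(Instr *c, int n) {
    int w = 0;
    for (int r = 0; r < n; r++)
        if (c[r].op != OP_NOP) c[w++] = c[r];
    return w;
}

/* Runs the four passes in the order listed above. */
int optimize(Instr *c, int n) {
    assert(c != NULL && n >= 0);
    identity_cleanup(c, n);
    fold_constants(c, n);
    copy_propagate(c, n);
    return compact_nops(c, n);
}
```

Under these assumptions, the sequence `MOV t0,2; ADD t0,3; MUL t0,1; MOV a,t0; MUL b,0` shrinks to `MOV a,5; MOV b,0`. Rewriting dead instructions to NOP first and compacting only at the end keeps indices stable while the earlier passes scan the array. The folding pass ignores integer overflow and stops at the first instruction it cannot fold, which is conservative but safe for a local pass.
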
---
## 📁 Repository layout
The repo includes these files (highlights):
- `main.c` – parser, IR builder, optimizer, and driver
- `lexer.l` – Flex lexer rules
- `lex.yy.c` – generated lexer C file (committed for convenience)
- `input.c` – sample input program
- `output_before.c`, `output_after.c`, `output.c` – example outputs/artifacts
- `cmini.exe` – prebuilt Windows binary
- `.vscode/`, `.gitattributes`, `LICENSE` (MIT)
_Source: repo file list & license block on GitHub._
---
## 🛠️ Build
### Prereqs
- A C compiler (GCC/Clang or MSVC)
- **Flex** (only needed if you want to regenerate `lex.yy.c`; otherwise the committed `lex.yy.c` is enough)
### Build and run
```bash
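# Optional: regenerate lex.yy.c (a generated copy is already committed)
flex lexer.l
gcc -O2 -o cmini.exe main.c lex.yy.c
# Reads input.c, writes output_before.c and output_after.c (in cmd.exe: .\cmini.exe)
./cmini.exe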