https://github.com/dor-sketch/assembler
Thorough study of compiler architectures and translating assembly code.
- Host: GitHub
- URL: https://github.com/dor-sketch/assembler
- Owner: Dor-sketch
- Created: 2023-09-27T23:50:44.000Z (over 2 years ago)
- Default Branch: main
- Last Pushed: 2024-02-01T21:32:25.000Z (about 2 years ago)
- Last Synced: 2025-01-22T10:23:02.865Z (about 1 year ago)
- Topics: code-generation, file-output, openuniversity, preprocessing, symbol-table-generation, syntax-analysis
- Language: C
- Homepage:
- Size: 2.45 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
# Assembler for Custom Assembly Language
This project involved the development of an assembler for a specialized assembly language. Its primary aim was to convert human-readable assembly instructions into binary machine code, bridging the gap between high-level programming concepts and low-level execution on computers.
Implemented in `ANSI C`, this project demonstrates a strong understanding of foundational programming principles. It was written for the `20465 System Programming Laboratory` course at _The Open University of Israel_ during the 2021-2022 academic year and received a grade of `98`.
---
**Table of Contents**
- [Features](#features)
- [Usage](#usage)
- [Examples](#examples)
- [Modules](#modules)
- [Contributing](#contributing)
- [License](#license)
---
## Features
- **Preprocessing**: The assembler supports preprocessing tasks, including macro expansion and line numbering.
- **Syntax Checking**: The assembler ensures syntax accuracy, checking for valid opcodes and operands.
- **Symbol Table**: The assembler generates a symbol table, computing label memory addresses.
- **Machine Code Generation**: The assembler produces the machine code and data images.
- **Output Files**: The assembler prints output files such as the machine code file, external data words file, and entry type symbols file.
- **Error Handling**: The assembler handles various syntax and semantic errors, providing descriptive error messages, including line numbers and error types with clickable links to the relevant code.
- **Dynamic Memory Allocation**: The assembler uses dynamic memory allocation to manage memory efficiently.
- **Modular Design**: The assembler is designed with a modular architecture, with each module responsible for a specific task.
- **Coding Standards**: The assembler adheres to the project's coding standards, including naming conventions, indentation, and documentation.
- **Testing**: The assembler is thoroughly tested, with a test suite that covers all possible scenarios, including `valgrind` memory leak checks with no errors.
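
As a rough illustration of the syntax-checking feature listed above, the sketch below validates a mnemonic against a lookup table. The table holds only the mnemonics that appear in the sample program in this README, and the name `is_known_opcode` is hypothetical; the project's actual instruction set and function names may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Mnemonics taken from the sample program in this README; the real
 * instruction set may contain more entries. */
static const char *const KNOWN_OPCODES[] = {
    "add", "prn", "lea", "inc", "mov", "sub", "bne", "cmp", "dec", "stop"
};

/* Returns 1 when `mnemonic` is in the table above, 0 otherwise.
 * The comparison is case-sensitive. */
int is_known_opcode(const char *mnemonic)
{
    size_t i;
    size_t count = sizeof(KNOWN_OPCODES) / sizeof(KNOWN_OPCODES[0]);

    assert(mnemonic != NULL);
    for (i = 0; i < count; i++) {
        if (strcmp(mnemonic, KNOWN_OPCODES[i]) == 0) {
            return 1;
        }
    }
    return 0;
}
```

A linear scan is enough for a table this small; a larger instruction set could keep the table sorted and use `bsearch` instead.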
---
## Usage
### New GUI
|  |  |
| :-----------------------------: | :-------------------------------: |
The assembler now includes a GUI that lets users assemble code with a few clicks. The GUI is built with `Gtk+` and written in `C++`, and it integrates with the assembler's `C` codebase through `extern "C"`. This lets the assembler's `main` function obtain services from the GUI, such as the input file path and output directory path, without changes to the assembler's codebase.
**Note:** The GUI has only been tested on `Ubuntu 22.04` and `MacOS Sonoma` and is considered a work in progress. For stable usage, please use the command line interface or a previous version of the assembler.
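
The `extern "C"` integration described above is commonly set up with a shared header that both the C and the C++ side include. The following is a minimal sketch of that pattern, not the project's actual header: the names `gui_input_path`, `gui_output_dir`, and `resolve_input_path` are hypothetical, and the two `gui_*` bodies are stand-ins for code that would live in the C++ GUI translation unit.

```c
#include <assert.h>
#include <stddef.h>

/* Shared declarations: a C++ compiler sees them inside extern "C",
 * a C compiler skips the guard entirely. */
#ifdef __cplusplus
extern "C" {
#endif

const char *gui_input_path(void);  /* hypothetical GUI service */
const char *gui_output_dir(void);  /* hypothetical GUI service */

#ifdef __cplusplus
}
#endif

/* Stand-in implementations so the sketch links on its own. */
const char *gui_input_path(void) { return "input_example"; }
const char *gui_output_dir(void) { return "."; }

/* C-side helper: prefer a command-line argument, otherwise ask the GUI. */
const char *resolve_input_path(int argc, char **argv)
{
    assert(argc >= 0);
    if (argc > 1 && argv != NULL && argv[1] != NULL) {
        return argv[1];
    }
    return gui_input_path();
}
```

Because the declarations have C linkage, the C code calls them like any other C function and does not need to know that a C++ object file provides them.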
---
Before running the assembler, make sure `gcc` and `Gtk+` are installed on your machine.
You can install `Gtk+` on `MacOS` using `brew`:
```bash
brew install gtk+3
```
---
### Command Line
Use the assembler by providing an input file with assembly code. The output includes several files: a machine code file, an external data words file, and an entry type symbols file.
```bash
make
# Pass the input file name without the .as extension, e.g. input_example
./assembler input_example
```
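
Since the command takes a base name without an extension, each input and output file name can be derived by appending a suffix. The sketch below shows one way to do that in ANSI C. The helper name `build_file_name` is hypothetical, and the suffixes are the ones that appear in this README (`.as`, `.am`, `.ob`, `.ent`, `.ext`).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Writes "<base><ext>" into `out`.
 * Returns 1 on success, 0 when the result would not fit in `out_size` bytes. */
int build_file_name(const char *base, const char *ext,
                    char *out, size_t out_size)
{
    assert(base != NULL && ext != NULL && out != NULL);
    if (strlen(base) + strlen(ext) + 1 > out_size) {
        return 0; /* not enough room for both parts and the terminator */
    }
    strcpy(out, base);
    strcat(out, ext);
    return 1;
}
```

For example, calling it with `"input_example"` and `".ob"` yields `input_example.ob`. The length check runs before any copying, so the destination buffer is never overrun.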
---
## Examples
### Successful Assembly Output
The listings below show the output files generated by the assembler from the [input_example.as](./images/input_example.as) file:
- Assembly Code Snippet (`ps.am`):
```assembly
; Assembly code that defines data, strings, and contains various instructions
; including 'add', 'prn', 'lea', 'inc', 'mov', 'sub', 'bne', 'cmp', 'dec', and 'stop'.
.entry LIST
.extern W
MAIN: add r3,LIST
LOOP: prn #48
macro m1 ; macro definition
inc r6
mov r3, W
endm
lea STR, r6
m1 ; macro call
sub r1, r4
bne END
cmp vall, #-6
bne END[r15]
dec K
.entry MAIN
sub LOOP[r10],r14
END: stop
STR: .string "abcd"
LIST: .data 6,-9
.data -100
.entry K
K: .data 31
.extern vall
```
_Note: the preprocessor stage expands the macro, producing:_
```assembly
.entry LIST
.extern W
MAIN: add r3,LIST
LOOP: prn #48
lea STR, r6
inc r6
mov r3, W
sub r1, r4
bne END
cmp vall, #-6
bne END[r15]
dec K
.entry MAIN
sub LOOP[r10],r14
END: stop
STR: .string "abcd"
LIST: .data 6,-9
.data -100
.entry K
K: .data 31
.extern vall
```
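
The expansion shown above can be sketched as a single pass over the source lines: collect the lines between `macro <name>` and `endm`, then substitute them wherever `<name>` appears as the first token. This is an illustrative sketch only. The names, the fixed-size limits, and the assumption that a macro call is the first token on its line are not taken from the project's `pre.c`.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_MACROS 8   /* illustrative limits, not the project's */
#define MAX_BODY   8
#define MAX_NAME   32

typedef struct {
    char name[MAX_NAME];
    const char *body[MAX_BODY]; /* borrowed pointers to input lines */
    int body_len;
} Macro;

/* Copies `in` to `out`, dropping macro definitions and replacing each
 * macro call with the stored body. Returns the number of output lines. */
int expand_macros(const char *in[], int n_in, const char *out[], int max_out)
{
    Macro macros[MAX_MACROS];
    Macro *open = NULL; /* macro whose body is being collected */
    int n_macros = 0, n_out = 0, i, j, k;

    assert(in != NULL && out != NULL);
    for (i = 0; i < n_in; i++) {
        char first[MAX_NAME] = "", second[MAX_NAME] = "";
        int fields = sscanf(in[i], "%31s %31s", first, second);

        if (open != NULL) { /* inside a definition */
            if (strcmp(first, "endm") == 0) {
                open = NULL;
            } else if (open->body_len < MAX_BODY) {
                open->body[open->body_len++] = in[i];
            }
            continue;
        }
        if (fields == 2 && strcmp(first, "macro") == 0 && n_macros < MAX_MACROS) {
            open = &macros[n_macros++]; /* start a new definition */
            strcpy(open->name, second);
            open->body_len = 0;
            continue;
        }
        for (j = 0; j < n_macros; j++) { /* is the first token a macro name? */
            if (strcmp(first, macros[j].name) == 0) break;
        }
        if (j < n_macros) {
            for (k = 0; k < macros[j].body_len && n_out < max_out; k++)
                out[n_out++] = macros[j].body[k];
        } else if (n_out < max_out) {
            out[n_out++] = in[i];
        }
    }
    return n_out;
}
```

Feeding it the `macro m1` ... `endm` definition followed by `lea STR, r6`, `m1`, and `sub r1, r4` produces the four lines `lea STR, r6`, `inc r6`, `mov r3, W`, `sub r1, r4`, which matches the order in the expanded listing.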
- Entry Symbol Table (`input_example.ent`):
```plaintext
; List of entry symbols and their addresses
K,0144,0005
LIST,0144,0002
MAIN,0096,0004
```
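
The three sample entries are consistent with each address being split into a base that is a multiple of 16 and an offset that is the remainder (for example, 149 = 144 + 5). That reading is inferred from the numbers above rather than stated by the project, and `format_entry_line` is a hypothetical helper that reproduces the line format under that assumption.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define BLOCK_SIZE 16 /* assumption inferred from the sample values */

/* Formats "<name>,<base>,<offset>" with 4-digit zero-padded numbers.
 * `out` must be large enough for the name plus 11 more bytes. */
void format_entry_line(const char *name, int address, char *out)
{
    int offset = address % BLOCK_SIZE;
    int base = address - offset;

    assert(name != NULL && out != NULL && address >= 0);
    assert(strlen(name) > 0);
    sprintf(out, "%s,%04d,%04d", name, base, offset);
}
```

With this split, addresses 149, 146, and 100 give the `K`, `LIST`, and `MAIN` lines shown in the listing.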
- External Symbol References (`input_example.ext`):
```plaintext
; External symbols and their references in the code
vall BASE 0125
vall OFFSET 0126
W BASE 0115
W OFFSET 0116
```
- Machine Code Output (`input_example.ob`):
```plaintext
; Binary representation of the assembly code
41 9
0100 A4-B0-C0-D0-E4
... (additional lines of machine code)
0149 A4-B0-C0-D1-Ef
```
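
Each line of the object file appears to hold a decimal address followed by a 20-bit word split into five 4-bit groups labeled `A` through `E` and printed as lowercase hexadecimal digits. That layout is inferred from the two sample lines, and the function name below is hypothetical. A minimal sketch of such a formatter:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* One label per 4-bit group, most significant group first. */
static const char GROUP_LABELS[] = "ABCDE";

/* Formats "<address> A<h>-B<h>-C<h>-D<h>-E<h>" into `out`,
 * which must hold at least 32 bytes. */
void format_object_line(int address, unsigned long word, char *out)
{
    size_t groups = strlen(GROUP_LABELS);
    size_t i;
    int pos;

    assert(out != NULL && address >= 0 && address <= 9999);
    pos = sprintf(out, "%04d ", address);
    for (i = 0; i < groups; i++) {
        unsigned shift = (unsigned)(4 * (groups - 1 - i));
        pos += sprintf(out + pos, "%c%lx", GROUP_LABELS[i],
                       (word >> shift) & 0xFUL);
        if (i + 1 < groups) {
            out[pos++] = '-'; /* separator between groups */
        }
    }
    out[pos] = '\0';
}
```

Passing address 100 with the word `0x40004` and address 149 with the word `0x4001F` reproduces the first and last sample lines.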
---
### Error Handling
Below is a screenshot showing how the assembler handles various syntax and semantic errors from [errors_example.as](errors_example.as). Each error message is designed to be descriptive, guiding the user to identify and rectify the issues within the assembly code.

The error messages include issues like undefined operations, missing operands, invalid target registers, and failures to find symbols for direct addressing mode, showcasing the assembler's comprehensive error-checking capabilities.
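
One common way to make an error message clickable is to start it with a `file:line:` prefix, which many editors and terminals turn into a link. Whether the project uses this exact format is an assumption, and `format_error` is a hypothetical helper used only to illustrate the idea.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

static const char ERROR_TAG[] = ": error: ";

/* Builds "<file>:<line>: error: <message>" in `out`.
 * Returns 1 on success, 0 when the text would not fit in `out_size` bytes. */
int format_error(char *out, size_t out_size,
                 const char *file, int line, const char *message)
{
    char number[16];
    size_t needed;

    assert(out != NULL && file != NULL && message != NULL && line > 0);
    sprintf(number, "%d", line);
    /* file + ':' + line number + ": error: " + message + terminator */
    needed = strlen(file) + 1 + strlen(number) + strlen(ERROR_TAG)
           + strlen(message) + 1;
    if (needed > out_size) {
        return 0;
    }
    sprintf(out, "%s:%s%s%s", file, number, ERROR_TAG, message);
    return 1;
}
```

Building the message in a buffer first keeps the formatting separate from the choice of output stream.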
---
## Modules
The assembler includes several modules:
- `pre.c`: Manages preprocessing tasks, including macro expansion and line numbering.
- `syntax.c`: Ensures syntax accuracy, checking for valid opcodes and operands.
- `first_pass.c`: Conducts the first assembly pass, generating a symbol table and computing label memory addresses.
- `second_pass.c`: Performs the second pass, producing the machine code and data images.
- `print_output.c`: Prints output files such as the machine code file, external data words file, and entry type symbols file.
- `main.c`: Coordinates the other modules to produce the final output.
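
The two passes listed above share one data structure, the symbol table: the first pass records each label with its address, and the second pass looks labels up when encoding operands. The sketch below uses a dynamically allocated linked list. The type and function names are hypothetical and are not taken from the project's source.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MAX_LABEL 32 /* illustrative limit on label length */

typedef struct Symbol {
    char name[MAX_LABEL];
    int address;
    struct Symbol *next;
} Symbol;

/* Second-pass lookup: returns the matching symbol or NULL. */
Symbol *symbol_find(Symbol *head, const char *name)
{
    assert(name != NULL);
    for (; head != NULL; head = head->next) {
        if (strcmp(head->name, name) == 0) {
            return head;
        }
    }
    return NULL;
}

/* First-pass insert: returns 1 on success, 0 on a duplicate label,
 * an over-long name, or an allocation failure. */
int symbol_add(Symbol **head, const char *name, int address)
{
    Symbol *node;

    assert(head != NULL && name != NULL);
    if (strlen(name) >= MAX_LABEL || symbol_find(*head, name) != NULL) {
        return 0;
    }
    node = malloc(sizeof *node);
    if (node == NULL) {
        return 0;
    }
    strcpy(node->name, name);
    node->address = address;
    node->next = *head; /* push at the front of the list */
    *head = node;
    return 1;
}

/* Releases every node so leak checkers report a clean exit. */
void symbol_free_all(Symbol *head)
{
    while (head != NULL) {
        Symbol *next = head->next;
        free(head);
        head = next;
    }
}
```

Rejecting duplicates at insert time lets the first pass report a redefined label on the line where it occurs.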
## Contributing
Contributors are welcome! Fork the repository and submit a pull request with your changes. Please ensure your contributions are well-tested and adhere to the project's coding standards.
## License
This project is licensed under the MIT License.