https://github.com/dor-sketch/assembler

Thorough study of compiler architectures and translating assembly code.
code-generation file-output openuniversity preprocessing symbol-table-generation syntax-analysis

# 🛠 Assembler for Custom Assembly Language

This project involved the development of an assembler for a specialized assembly language. Its primary aim was to convert human-readable assembly instructions into binary machine code, bridging the gap between high-level programming concepts and low-level execution on computers.

Implemented in `ANSI C`, this project demonstrates a strong understanding of foundational programming principles. It was submitted for the `20465 System Programming Laboratory` course at _The Open University of Israel_, taken during the 2021-2022 academic year, and achieved a grade of `98`.



---

**Table of Contents**

- [🚀 Features](#-features)
- [🤖 Usage](#-usage)
- [✅ Examples](#-examples)
- [🧩 Modules](#-modules)
- [👥 Contributing](#-contributing)
- [📜 License](#-license)

---

## 🚀 Features

- **Preprocessing** 🧹: The assembler supports preprocessing tasks, including macro expansion and line numbering.

- **Syntax Checking** ✅: The assembler ensures syntax accuracy, checking for valid opcodes and operands.

- **Symbol Table** 📚: The assembler generates a symbol table, computing label memory addresses.

- **Machine Code Generation** 💻: The assembler produces the machine code and data images.

- **Output Files**: The assembler prints output files such as the machine code file, external data words file, and entry type symbols file.

- **Error Handling** 🚨: The assembler handles various syntax and semantic errors, providing descriptive error messages, including line numbers and error types with clickable links to the relevant code.

- **Dynamic Memory Allocation** 🧠: The assembler uses dynamic memory allocation to manage memory efficiently.

- **Modular Design** 🧩: The assembler is designed with a modular architecture, with each module responsible for a specific task.

- **Coding Standards**: The assembler adheres to the project's coding standards, including naming conventions, indentation, and documentation.

- **Testing** 🧪: The assembler is thoroughly tested with a broad test suite, and `valgrind` memory leak checks report no errors.

---

## 🤖 Usage

### New GUI

| ![Alt text](./images/image.png) | ![Alt text](./images/image-2.png) |
| :-----------------------------: | :-------------------------------: |

The assembler now includes a GUI that lets users assemble code with a few clicks. The GUI is built with `Gtk+` and written in `C++`, and it integrates with the assembler's `C` codebase through `extern "C"` linkage. This lets the assembler's `main` function obtain services from the GUI, such as the input file path and the output directory path, without changes to the existing assembler code.
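As a rough illustration of the `extern "C"` bridge described above, a shared header might look like the sketch below. The function names `gui_get_input_path` and `gui_get_output_dir` are hypothetical and are not taken from the repository; the stand-in definitions at the bottom exist only so the sketch compiles on its own.

```c
/* Hypothetical bridge header shared by the C assembler and the C++ GUI. */
#ifdef __cplusplus
extern "C" { /* compiled as C++: keep C linkage so the C assembler can call these */
#endif

/* Hypothetical services the GUI exposes to the assembler's main(). */
const char *gui_get_input_path(void);
const char *gui_get_output_dir(void);

#ifdef __cplusplus
}
#endif

/* Stand-in definitions so the sketch links on its own. In a real build the
 * C++ GUI would define these inside its own extern "C" block. */
const char *gui_get_input_path(void) { return "input_example"; }
const char *gui_get_output_dir(void) { return "."; }
```

Because the declarations carry C linkage, the C side can include the header and call the functions like any other C function, with no knowledge of the C++ implementation behind them.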

**Note:** The GUI has only been tested on `Ubuntu 22.04` and `macOS Sonoma` and is considered a work in progress. For stable usage, please use the command line interface or a previous version of the assembler.

---

Before running the assembler, make sure you have `gcc` and `Gtk+` installed on your machine.

You can install `Gtk+` on `macOS` using `brew`:

```bash
brew install gtk+3
```

---

### Command Line

Use the assembler by providing an input file with assembly code. The output includes several files: a machine code file, an external data words file, and an entry type symbols file.

```bash
make
# Pass the input file name without the .as extension, e.g. input_example for input_example.as
./assembler input_example
```
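Since the command takes a base name and the outputs shown in this README use the `.ob`, `.ent`, and `.ext` extensions, the file names can be derived by appending an extension to the base name. The helper below is a minimal sketch of that idea; `build_file_name` is a hypothetical name, not a function from the repository.

```c
#include <string.h>

/*
 * Build "<base><ext>" into out, which has room for out_size bytes.
 * Returns 0 on success, or -1 if the result would not fit.
 */
int build_file_name(const char *base, const char *ext, char *out, size_t out_size)
{
    size_t needed = strlen(base) + strlen(ext) + 1; /* +1 for the terminating NUL */

    if (needed > out_size)
        return -1;
    strcpy(out, base);
    strcat(out, ext);
    return 0;
}
```

Calling `build_file_name("input_example", ".as", buf, sizeof buf)` fills `buf` with `input_example.as`; the same call with `.ob`, `.ent`, or `.ext` names the output files.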

---

## ✅ Examples

### Successful Assembly Output

The listings below show the input and the output files generated by the assembler from the [input_example.as](./images/input_example.as) file:

- Assembly Code Snippet (`input_example.as`):

```assembly
; Assembly code that defines data, strings, and contains various instructions
; including 'add', 'prn', 'lea', 'inc', 'mov', 'sub', 'bne', 'cmp', 'dec', and 'stop'.
.entry LIST
.extern W
MAIN: add r3,LIST
LOOP: prn #48
macro m1 ; macro definition
inc r6
mov r3, W
endm
lea STR, r6
m1 ; macro call
sub r1, r4
bne END
cmp vall, #-6
bne END[r15]
dec K
.entry MAIN
sub LOOP[r10],r14
END: stop
STR: .string "abcd"
LIST: .data 6,-9
.data -100
.entry K
K: .data 31
.extern vall
```

_Note: the preprocessor stage expands the macro, producing:_

```assembly
.entry LIST
.extern W
MAIN: add r3,LIST
LOOP: prn #48
lea STR, r6
inc r6
mov r3, W
sub r1, r4
bne END
cmp vall, #-6
bne END[r15]
dec K
.entry MAIN
sub LOOP[r10],r14
END: stop
STR: .string "abcd"
LIST: .data 6,-9
.data -100
.entry K
K: .data 31
.extern vall
```
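The expansion above can be approximated with a single pass that records macro bodies and replays them at each call. The sketch below is a simplified assumption about how such a preprocessor could work, not the code in `pre.c`: it uses fixed-size tables, requires a macro to be defined before it is used, and drops comment lines that start with `;` (as the expanded listing does). All names in it are hypothetical.

```c
#include <stdio.h>
#include <string.h>

#define MAX_MACROS 8
#define MAX_BODY   16
#define MAX_LINE   81 /* assumed limit: 80 characters plus the NUL */

typedef struct {
    char name[MAX_LINE];
    const char *body[MAX_BODY]; /* pointers into the caller's input lines */
    int body_len;
} Macro;

/*
 * Copies the lines of in[] to out[], replacing each macro call with the
 * macro's body and dropping the definitions themselves.
 * Returns the number of output lines, or -1 if a table overflows.
 */
int expand_macros(const char *in[], int n_in, const char *out[], int max_out)
{
    Macro macros[MAX_MACROS];
    char first[MAX_LINE], second[MAX_LINE];
    int n_macros = 0, n_out = 0, in_def = 0;
    int i, m, b;

    for (i = 0; i < n_in; i++) {
        int words = sscanf(in[i], "%80s %80s", first, second);

        if (words < 1)
            continue; /* blank line */

        if (in_def) { /* inside "macro ... endm": record the body */
            if (strcmp(first, "endm") == 0) {
                in_def = 0;
                n_macros++;
            } else {
                Macro *cur = &macros[n_macros];
                if (cur->body_len == MAX_BODY)
                    return -1;
                cur->body[cur->body_len++] = in[i];
            }
            continue;
        }
        if (first[0] == ';')
            continue; /* comment line */

        if (words == 2 && strcmp(first, "macro") == 0) {
            if (n_macros == MAX_MACROS)
                return -1;
            strcpy(macros[n_macros].name, second);
            macros[n_macros].body_len = 0;
            in_def = 1;
            continue;
        }
        for (m = 0; m < n_macros; m++)
            if (strcmp(first, macros[m].name) == 0)
                break;

        if (m < n_macros) { /* macro call: replay the recorded body */
            for (b = 0; b < macros[m].body_len; b++) {
                if (n_out == max_out)
                    return -1;
                out[n_out++] = macros[m].body[b];
            }
        } else { /* ordinary line: copy through unchanged */
            if (n_out == max_out)
                return -1;
            out[n_out++] = in[i];
        }
    }
    return n_out;
}
```

Storing pointers to the original lines keeps the sketch short; a fuller implementation would copy the text and validate macro names against opcodes and directives.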

- Entry Symbol Table (`input_example.ent`):

```plaintext
; Entry symbols, each with its base address and offset
K,0144,0005
LIST,0144,0002
MAIN,0096,0004
```
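The values in this listing are consistent with each symbol address being split into a base that is a multiple of 16 and an offset from that base: `MAIN` has base 96 and offset 4, which gives 100, the first address in the machine code listing. This is an inference from the sample output rather than a documented rule, and the helper names below are hypothetical.

```c
#define BASE_ALIGNMENT 16 /* assumed from the sample: bases are multiples of 16 */

/* Largest multiple of BASE_ALIGNMENT that does not exceed address. */
unsigned int symbol_base(unsigned int address)
{
    return address - (address % BASE_ALIGNMENT);
}

/* Distance from the base to the address, in the range 0..15. */
unsigned int symbol_offset(unsigned int address)
{
    return address % BASE_ALIGNMENT;
}
```

Under this assumption, address 149 splits into base 144 and offset 5, which matches the `K` entry.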

- External Symbol References (`input_example.ext`):

```plaintext
; External symbols and their references in the code
vall BASE 0125
vall OFFSET 0126
W BASE 0115
W OFFSET 0116
```

- Machine Code Output (`input_example.ob`):

```plaintext
; First line: word counts (41 code, 9 data); each following line: address and word, written as five hex groups A-E
41 9
0100 A4-B0-C0-D0-E4
... (additional lines of machine code)
0149 A4-B0-C0-D1-Ef
```
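Judging by the sample lines, each machine word appears to be 20 bits wide and printed as five hexadecimal digits, most significant first, tagged `A` through `E`. The sketch below reproduces that layout under this assumption; `format_word` is a hypothetical name and the word width is inferred from the listing, not stated in the README.

```c
#include <stdio.h>

/*
 * Writes a 20-bit word as five hex digits tagged A..E and separated by '-'.
 * out must have room for at least 15 bytes (14 characters plus the NUL).
 */
void format_word(unsigned long word, char *out)
{
    static const char groups[] = "ABCDE";
    int i;

    for (i = 0; i < 5; i++) {
        /* Take 4 bits at a time, starting from bits 19..16. */
        unsigned int nibble = (unsigned int)((word >> (16 - 4 * i)) & 0xFUL);
        out += sprintf(out, i < 4 ? "%c%x-" : "%c%x", groups[i], nibble);
    }
}
```

With this layout, the word `0x4001F` is written as `A4-B0-C0-D1-Ef`, the same text as the last line of the listing.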

---

### Error Handling

Below is a screenshot showing how the assembler handles various syntax and semantic errors from [errors_example.as](errors_example.as). Each error message is designed to be descriptive, guiding the user to identify and rectify the issues within the assembly code.

![image](https://github.com/Dor-sketch/openu_course20465_project/assets/138825033/d3178433-7907-40ff-b8d5-f2f500b07d0c)

The error messages include issues like undefined operations, missing operands, invalid target registers, and failures to find symbols for direct addressing mode, showcasing the assembler's comprehensive error-checking capabilities.
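One common way to make an error location clickable in terminals and editors is to prefix the message with `file:line:`. The sketch below assumes that convention; the exact format the assembler prints is visible only in the screenshot, so both the format string and the name `format_error` are hypothetical.

```c
#include <stdio.h>
#include <string.h>

#define LINE_DIGITS_MAX 11 /* enough for a 32-bit int, including a sign */

/*
 * Formats "<file>:<line>: error: <message>" into out.
 * Returns the number of characters written, or -1 if out_size is too small
 * for the worst-case length.
 */
int format_error(char *out, size_t out_size,
                 const char *file, int line, const char *message)
{
    size_t needed = strlen(file) + 1 + LINE_DIGITS_MAX
                  + strlen(": error: ") + strlen(message) + 1;

    if (needed > out_size)
        return -1;
    return sprintf(out, "%s:%d: error: %s", file, line, message);
}
```

The size check runs before `sprintf` because ANSI C (C89) has no `snprintf`; the bound is deliberately conservative.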

---

## 🧩 Modules

The assembler includes several modules:

- `pre.c`: Manages preprocessing tasks, including macro expansion and line numbering.
- 🔎 `syntax.c`: Ensures syntax accuracy, checking for valid opcodes and operands.
- 🚦 `first_pass.c`: Conducts the first assembly pass, generating a symbol table and computing label memory addresses.
- 🚀 `second_pass.c`: Performs the second pass, producing the machine code and data images.
- 🖨️ `print_output.c`: Prints output files such as the machine code file, external data words file, and entry type symbols file.
- `main.c`: Coordinates the other modules to produce the final output.
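The split between the two passes rests on a symbol table: the first pass records each label with its address, and the second pass looks labels up when encoding operands. The sketch below shows a minimal table with those two operations. It is an illustration only: it uses a fixed-size array for brevity, whereas the project is described as using dynamic allocation, and all names and limits are assumptions.

```c
#include <string.h>

#define MAX_SYMBOLS 64
#define MAX_LABEL   32 /* assumed label length limit, including the NUL */

typedef struct {
    char name[MAX_LABEL];
    unsigned int address;
} Symbol;

typedef struct {
    Symbol items[MAX_SYMBOLS];
    int count; /* must be set to 0 before first use */
} SymbolTable;

/* Second-pass lookup: returns the symbol's index, or -1 if it is undefined. */
int symbol_find(const SymbolTable *table, const char *name)
{
    int i;

    for (i = 0; i < table->count; i++)
        if (strcmp(table->items[i].name, name) == 0)
            return i;
    return -1;
}

/*
 * First-pass insertion: returns 0 on success, or -1 if the table is full,
 * the name is too long, or the label was already defined.
 */
int symbol_add(SymbolTable *table, const char *name, unsigned int address)
{
    if (table->count == MAX_SYMBOLS || strlen(name) >= MAX_LABEL)
        return -1;
    if (symbol_find(table, name) != -1)
        return -1;
    strcpy(table->items[table->count].name, name);
    table->items[table->count].address = address;
    table->count++;
    return 0;
}
```

Rejecting a duplicate in `symbol_add` is where a "label already defined" error would surface, while a failed `symbol_find` in the second pass corresponds to an undefined symbol.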

## 👥 Contributing

Contributors are welcome! Fork the repository and submit a pull request with your changes. Please ensure your contributions are well-tested and adhere to the project's coding standards.

## 📜 License

This project is licensed under the MIT License.