Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/aftermath22/oxygen_a-compiled-mini-language
A custom mini language with its own custom compiler built from scratch in modern C++20.
https://github.com/aftermath22/oxygen_a-compiled-mini-language
arena-allocator ast cmake cpp20 custom-language error-handling generation mini-compiler nasm nasm-assembly oops-in-cpp parse-tree parsing tokenization wsl
Last synced: about 1 month ago
JSON representation
A custom mini language with its own custom compiler built from scratch in modern C++20.
- Host: GitHub
- URL: https://github.com/aftermath22/oxygen_a-compiled-mini-language
- Owner: aftermath22
- License: mit
- Created: 2024-12-24T10:36:44.000Z (about 1 month ago)
- Default Branch: master
- Last Pushed: 2024-12-30T08:16:37.000Z (about 1 month ago)
- Last Synced: 2024-12-30T09:31:12.164Z (about 1 month ago)
- Topics: arena-allocator, ast, cmake, cpp20, custom-language, error-handling, generation, mini-compiler, nasm, nasm-assembly, oops-in-cpp, parse-tree, parsing, tokenization, wsl
- Language: C++
- Homepage:
- Size: 1.29 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
README
# Oxygen Programming Language
Oxygen is a small, compiled programming language that targets x86_64 assembly. It features basic arithmetic operations, variables, control flow, and scoping rules. \
I have learned most of the things from a youtube tutorial by 'Pixeled' and this is the [link](https://www.youtube.com/watch?v=vcSijrRsrY0&list=PLUDlas_Zy_qC7c5tCgTMYq2idyyT241qs) to his playlist.## Table of Contents
- [Overview](#overview)
- [Features](#features)
- [Project Structure](#project-structure)
- [Building](#building)
- [Language Syntax](#language-syntax)
- [Implementation Details](#implementation-details)
- [Examples](#examples)## Overview
OXY is implemented in C++ and follows a traditional compiler pipeline:
1. Lexical Analysis (Tokenization)
2. Parsing
3. Code Generation (to x86_64 assembly)The compiler outputs assembly code which is then assembled and linked using NASM and LD respectively.
## Features
- Integer arithmetic (addition, subtraction, multiplication, division)
- Variable declarations and assignments
- Scoping rules with block-level scope
- Control flow statements (if, elif, else)
- Comments (both single-line and multi-line)
- Exit statements## Project Structure
The project consists of several key components:
- `main.cpp`: Entry point that orchestrates the compilation process
- `tokenization.hpp`: Handles lexical analysis
- `parser.hpp`: Implements recursive descent parsing
- `generation.hpp`: Generates x86_64 assembly code
- `arena.hpp`: Implements a simple memory arena for AST node allocation### Component Details
#### Arena Allocator
- Custom memory management system
- Provides efficient allocation for AST nodes
- Uses a simple bump allocator approach
- Supports move semantics but not copying#### Tokenizer
- Breaks source code into tokens
- Handles keywords, identifiers, numbers, and operators
- Supports both single-line (`//`) and multi-line (`/* */`) comments
- Tracks line numbers for error reporting#### Parser
- Implements recursive descent parsing
- Builds an Abstract Syntax Tree (AST)
- Handles operator precedence
- Provides detailed error messages
- Uses the arena allocator for memory management#### Code Generator
- Generates x86_64 assembly code
- Implements stack-based variable storage
- Handles proper scope management
- Generates efficient arithmetic operations
- Supports control flow structures## Building
Prerequisites:
- C++ compiler with C++20 support
- NASM assembler
- LD linkerTo build a program:
```bash
# Compile the program
$ ./oxy input.oxy# This generates:
# 1. out.asm (assembly output)
# 2. out.o (object file)
# 3. out (executable)
```## Language Syntax
### Variables
```
let x = 42;
x = x + 1;
```### Control Flow
```
if (condition) {
// code
} elif (other_condition) {
// code
} else {
// code
}
```### Arithmetic
```
let result = (5 + 3) * 2;
let quotient = 10 / 2;
```### Scoping
```
{
let x = 1;
{
let y = 2;
// x and y visible here
}
// only x visible here
}
```### Comments
```
// Single line comment/* Multi-line
comment */
```### Exit Statement
```
exit(code); // Exits with specified code
```## Implementation Details
### Memory Management
The project uses a custom arena allocator for efficient AST node allocation. The arena allocator:
- Pre-allocates a large block of memory (4MB by default)
- Provides fast allocation with minimal overhead
- Does not support individual deallocation
- Frees all memory at once when destroyed### AST Structure
The AST is built using variant types for nodes, allowing for:
- Type-safe node representation
- Easy visitor pattern implementation
- Efficient memory layout### Code Generation
Assembly generation follows these principles:
- Stack-based variable storage
- Proper alignment handling
- Efficient register usage
- Direct syscall usage for exit operations### Error Handling
The compiler provides:
- Line number information in error messages
- Specific error messages for missing tokens
- Clear identification of undefined variables
- Validation of redeclaration attempts## Examples
Basic arithmetic:
```
let x = 5;
let y = 10;
let z = x + y * 2;
exit(z);
```Control flow:
```
let x = 5;
if (x > 3) {
exit(1);
} elif (x < 0) {
exit(2);
} else {
exit(3);
}
```Nested scopes:
```
let x = 1;
{
let y = x + 1;
{
let z = y + 1;
exit(z);
}
}
```