https://github.com/byacherx/emlang
EMLang is a modern programming language designed for ease of use. It uses the LLVM backend and is written in C++. EMLang is an experimental language and should never be used in production.
- Host: GitHub
- URL: https://github.com/byacherx/emlang
- Owner: ByaCherX
- License: mit
- Created: 2025-05-29T20:47:54.000Z (about 1 year ago)
- Default Branch: main
- Last Pushed: 2025-06-17T20:58:06.000Z (about 1 year ago)
- Last Synced: 2025-06-17T21:30:57.826Z (about 1 year ago)
- Topics: cpp, language, llvm, llvm-ir
- Language: C++
- Homepage:
- Size: 236 KB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 7
Metadata Files:
- Readme: README.md
- License: LICENSE

# EMLang [⚠️ Active development canceled]
**A Modern Systems Programming Language with LLVM Backend**
EMLang is a statically-typed systems programming language designed for high-performance applications with ahead-of-time (AOT) compilation using LLVM. It combines modern language features with C-like performance and memory control.
> [!Warning]
> **This language is not designed for production use; it was created for testing and learning purposes only.**
> - ⚠️ It is not recommended for use in production systems.
> - 🔬 It is a work in progress and may not be fully functional or stable.
---
## Project Overview
EMLang provides a complete compiler pipeline with:
- **Advanced Type System**: Comprehensive primitive types with Unicode support
- **Memory Safety**: C-style pointers with semantic analysis
- **LLVM Integration**: Native code generation with optimization
- **Rich Standard Library**: I/O, string manipulation, math operations, and utilities
### Project Structure
```
emlang/
├── compiler/          # Complete compiler implementation
│   ├── lexer/         # Lexical analyzer with Unicode support
│   ├── parser/        # Recursive descent parser
│   ├── ast/           # Abstract Syntax Tree with visitor pattern
│   ├── semantic/      # Semantic analyzer with type checking
│   └── codegen/       # LLVM IR code generator
├── include/           # Public header files
├── library/           # Comprehensive standard library
│   ├── include/       # Library headers
│   └── src/           # Library implementations
├── src/               # Main compiler and analysis tools
├── tests/             # Comprehensive test suite
└── build/             # Build artifacts and executables
```
## ✨ Language Features
### **Primitive Types**
| Type | Size | Description | Min Value | Max Value |
|----------|--------|------------------|-----------|-----------|
| `int8` | 8-bit | Signed integer | -128 | 127 |
| `int16` | 16-bit | Signed integer | -32,768 | 32,767 |
| `int32` | 32-bit | Signed integer | -2,147,483,648 | 2,147,483,647 |
| `int64` | 64-bit | Signed integer | -9,223,372,036,854,775,808 | 9,223,372,036,854,775,807 |
| `uint8` | 8-bit | Unsigned integer | 0 | 255 |
| `uint16` | 16-bit | Unsigned integer | 0 | 65,535 |
| `uint32` | 32-bit | Unsigned integer | 0 | 4,294,967,295 |
| `uint64` | 64-bit | Unsigned integer | 0 | 18,446,744,073,709,551,615 |
| `float` | 32-bit | Floating point | ~-3.4e38 | ~3.4e38 |
| `double` | 64-bit | Double precision | ~-1.8e308 | ~1.8e308 |
| `bool` | 1-bit | Boolean | `false` | `true` |
| `char` | 8-bit | Character | 0 | 255 |
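
The integer ranges in the table match the limits of C++'s fixed-width types, so they can be checked mechanically. The following is a minimal sketch of such a range check. `fitsIn` is a hypothetical helper written for this README; it is not taken from the EMLang sources.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper: returns true when `value` is representable in the
// fixed-width integer type T (for example std::int8_t for EMLang's `int8`).
// Values above the `long long` range (the upper half of `uint64`) cannot be
// passed in and are outside the scope of this sketch.
template <typename T>
constexpr bool fitsIn(long long value) {
    using Limits = std::numeric_limits<T>;
    return Limits::is_signed
        ? (value >= static_cast<long long>(Limits::min()) &&
           value <= static_cast<long long>(Limits::max()))
        : (value >= 0 &&
           static_cast<unsigned long long>(value) <=
               static_cast<unsigned long long>(Limits::max()));
}
```

For example, `fitsIn<std::int8_t>(127)` is true while `fitsIn<std::int8_t>(128)` is false, matching the `int8` row above.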
### **Unicode and String Support**
| Feature | Description | Examples |
|------------------------|-------------|----------|
| **Unicode Characters** | Full UTF-8 character support | `'🌍'`, `'π'`, `'€'` |
| **Escape Sequences** | Standard C-style escapes | `'\n'`, `'\t'`, `'\\'`, `'\"'` |
| **Unicode Escapes** | Unicode code point notation | `'\u{03C0}'` (π), `'\u{20AC}'` (€) |
| **String Literals** | UTF-8 string support | `"Hello, World!"` |
| **Path Strings** | Windows/Unix path support | `"C:\\Users\\Name"` |
| **Mixed Content** | Unicode in strings | `"Càf Manü ★ ♥"` |
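
To illustrate what a `\u{...}` escape has to become in a UTF-8 string, here is a minimal C++ sketch that parses the escape and encodes the code point. The names `encodeUtf8` and `decodeUnicodeEscape` are assumptions made for this example; the README does not show how the EMLang lexer implements this step.

```cpp
#include <cstddef>
#include <string>

// Encodes one Unicode code point as UTF-8.
// Returns an empty string for surrogates and for values above U+10FFFF.
std::string encodeUtf8(char32_t cp) {
    std::string out;
    if (cp <= 0x7F) {
        out += static_cast<char>(cp);
    } else if (cp <= 0x7FF) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp <= 0xFFFF) {
        if (cp >= 0xD800 && cp <= 0xDFFF) return out;  // surrogates are not scalar values
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp <= 0x10FFFF) {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}

// Parses text of the form \u{HEX} and returns the UTF-8 encoding of the
// code point, or an empty string when the escape is malformed.
std::string decodeUnicodeEscape(const std::string& text) {
    if (text.size() < 5 || text.compare(0, 3, "\\u{") != 0 || text.back() != '}') {
        return {};
    }
    char32_t cp = 0;
    for (std::size_t i = 3; i + 1 < text.size(); ++i) {
        const char c = text[i];
        int digit = 0;
        if (c >= '0' && c <= '9') digit = c - '0';
        else if (c >= 'a' && c <= 'f') digit = c - 'a' + 10;
        else if (c >= 'A' && c <= 'F') digit = c - 'A' + 10;
        else return {};
        cp = cp * 16 + static_cast<char32_t>(digit);
        if (cp > 0x10FFFF) return {};  // beyond the Unicode range
    }
    return encodeUtf8(cp);
}
```

With this sketch, `decodeUnicodeEscape("\\u{03C0}")` yields the two bytes `CF 80` (π) and `decodeUnicodeEscape("\\u{20AC}")` yields `E2 82 AC` (€).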
### **C-Style Pointer System**
| Feature | Operator | Description | Usage |
|---------|----------|-------------|-------|
| **Address-of** | `&` | Gets memory address | `&variable` |
| **Dereference** | `*` | Accesses value at address | `*pointer` |
| **Pointer Declaration** | `*` | Declares pointer type | `int32*` |
| **Multi-level Pointers** | `**` | Pointer to pointer | `int32**` |
| **Null Pointer** | `null` | Null pointer value | `ptr = null` |
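
The typing rules implied by the table can be sketched as plain string operations on type names: `&` adds one pointer level and `*` removes one. This is an illustration only. The helper names below are hypothetical, and the actual semantic analyzer may represent pointer types differently.

```cpp
#include <string>

// Type of `&expr` when `expr` has type `t`: one more pointer level.
std::string addressOfType(const std::string& t) {
    return t + "*";
}

// Type of `*expr` when `expr` has type `t`.
// Returns an empty string when `t` is not a pointer type,
// which a checker would report as an error.
std::string dereferenceType(const std::string& t) {
    if (t.empty() || t.back() != '*') return "";
    return t.substr(0, t.size() - 1);
}

// Number of pointer levels in a type name: 0 for `int32`, 2 for `int32**`.
int pointerDepth(const std::string& t) {
    int depth = 0;
    for (auto it = t.rbegin(); it != t.rend() && *it == '*'; ++it) ++depth;
    return depth;
}
```

Under these rules, dereferencing an `int32**` gives an `int32*`, and dereferencing a plain `int32` is rejected.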
### **Function Declarations**
| Feature | Syntax | Description | Example |
|---------|--------|-------------|---------|
| **Function Definition** | `function name(): type` | Regular function | `function add(a: int32, b: int32): int32` |
| **External Functions** | `extern function` | External C functions | `extern function printf(format: str): int32` |
| **Void Functions** | `: void` | No return value | `function print(): void` |
| **Parameters** | `name: type` | Typed parameters | `(x: int32, y: float)` |
| **Return Statement** | `return value` | Function return | `return x + y` |
### **Control Flow**
| Structure | Syntax | Description | Features |
|-----------|--------|-------------|----------|
| **If Statement** | `if (condition) { }` | Conditional execution | With optional `else` |
| **While Loop** | `while (condition) { }` | Pre-condition loop | Condition checked first |
| **For Loop** | `for (init; condition; update)` | Counting loop | C-style syntax |
| **Block Scope** | `{ ... }` | Code blocks | Local variable scope |
| **Nested Structures** | - | All structures nestable | Unlimited nesting depth |
## Architecture & Implementation
### ✅ **Lexical Analysis**
- **Unicode-aware tokenization** with full UTF-8 support
- **Escape sequence processing** for strings and characters
- **Multi-line and single-line comments**
- **Comprehensive operator recognition**
### ✅ **Syntax Analysis**
- **Recursive descent parser** with operator precedence
- **Complete AST node hierarchy** with visitor pattern
- **Error recovery** and synchronization
- **Lookahead disambiguation**
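
As a rough illustration of recursive descent with operator precedence, the C++ sketch below parses integer arithmetic with one function per precedence level. It evaluates the expression directly instead of building AST nodes, and `ExprParser` is a name invented for this example, not a class from the EMLang compiler.

```cpp
#include <cctype>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <utility>

// Grammar (lowest to highest precedence):
//   expr   := term   (('+' | '-') term)*
//   term   := factor (('*' | '/') factor)*
//   factor := NUMBER | '(' expr ')' | '-' factor
class ExprParser {
public:
    explicit ExprParser(std::string text) : text_(std::move(text)) {}

    // Parses the whole input and returns its value; throws on syntax errors.
    long long parse() {
        const long long value = parseExpr();
        skipSpaces();
        if (pos_ != text_.size()) throw std::runtime_error("unexpected trailing input");
        return value;
    }

private:
    long long parseExpr() {
        long long value = parseTerm();
        for (;;) {
            skipSpaces();
            if (match('+')) value += parseTerm();
            else if (match('-')) value -= parseTerm();
            else return value;
        }
    }

    long long parseTerm() {
        long long value = parseFactor();
        for (;;) {
            skipSpaces();
            if (match('*')) {
                value *= parseFactor();
            } else if (match('/')) {
                const long long rhs = parseFactor();
                if (rhs == 0) throw std::runtime_error("division by zero");
                value /= rhs;
            } else {
                return value;
            }
        }
    }

    long long parseFactor() {
        skipSpaces();
        if (match('-')) return -parseFactor();
        if (match('(')) {
            const long long value = parseExpr();
            skipSpaces();
            if (!match(')')) throw std::runtime_error("expected ')'");
            return value;
        }
        if (pos_ >= text_.size() || !isDigitAt(pos_)) {
            throw std::runtime_error("expected number");
        }
        long long value = 0;
        while (pos_ < text_.size() && isDigitAt(pos_)) {
            value = value * 10 + (text_[pos_++] - '0');
        }
        return value;
    }

    bool isDigitAt(std::size_t i) const {
        return std::isdigit(static_cast<unsigned char>(text_[i])) != 0;
    }

    // Consumes `c` if it is the next character.
    bool match(char c) {
        if (pos_ < text_.size() && text_[pos_] == c) { ++pos_; return true; }
        return false;
    }

    void skipSpaces() {
        while (pos_ < text_.size() &&
               std::isspace(static_cast<unsigned char>(text_[pos_]))) ++pos_;
    }

    std::string text_;
    std::size_t pos_ = 0;
};
```

Because `parseTerm` is called from `parseExpr`, multiplication binds tighter than addition: `ExprParser("1 + 2 * 3").parse()` returns 7, while `ExprParser("(1 + 2) * 3").parse()` returns 9.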
### ✅ **Semantic Analysis**
- **Hierarchical symbol tables** with scope management
- **Static type checking** with type inference
- **Pointer type validation** and safety analysis
- **Function signature verification**
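
A hierarchical symbol table is commonly built as a stack of scopes: declarations go into the innermost scope and lookups walk outward. The sketch below shows that idea in C++. `SymbolTable` and its members are assumptions made for illustration and may not match the real analyzer's interface.

```cpp
#include <string>
#include <unordered_map>
#include <vector>

// A stack of scopes; the last element is the innermost scope.
class SymbolTable {
public:
    SymbolTable() { enterScope(); }  // start with the global scope

    void enterScope() { scopes_.emplace_back(); }

    // Leaves the innermost scope; the global scope is never removed.
    void exitScope() {
        if (scopes_.size() > 1) scopes_.pop_back();
    }

    // Declares `name` in the innermost scope.
    // Returns false if the name is already declared in that same scope.
    bool declare(const std::string& name, const std::string& type) {
        return scopes_.back().emplace(name, type).second;
    }

    // Searches from the innermost scope outward.
    // Returns the declared type, or nullptr when the name is not visible.
    const std::string* lookup(const std::string& name) const {
        for (auto scope = scopes_.rbegin(); scope != scopes_.rend(); ++scope) {
            const auto found = scope->find(name);
            if (found != scope->end()) return &found->second;
        }
        return nullptr;
    }

private:
    std::vector<std::unordered_map<std::string, std::string>> scopes_;
};
```

An inner declaration shadows an outer one until `exitScope()` is called, and a failed `lookup` is the situation a checker would report as a variable not found in the current scope.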
### ✅ **Code Generation**
- **LLVM IR generation** for all language constructs
- **Type mapping** from EMLang to LLVM types
- **Memory management** with stack allocation
- **Optimization passes** integration
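
The type-mapping step can be pictured as a lookup from EMLang type names to LLVM IR type names. The sketch below is an assumption about what such a mapping might look like, not the project's code. It returns textual IR type names rather than calling the LLVM API, and it assumes opaque pointers (`ptr`), which are the default from LLVM 15; LLVM 14 would print typed pointers such as `i32*` instead. LLVM integer types carry no signedness, so `int32` and `uint32` map to the same `i32`.

```cpp
#include <string>

// Hypothetical mapping from an EMLang type name to a textual LLVM IR type.
// Returns an empty string for names this sketch does not know.
std::string toLlvmType(const std::string& emType) {
    if (emType.empty()) return "";
    // Assumption: every pointer type, and `str`, lowers to an opaque pointer.
    if (emType.back() == '*' || emType == "str") return "ptr";
    if (emType == "bool") return "i1";
    if (emType == "char" || emType == "int8" || emType == "uint8") return "i8";
    if (emType == "int16" || emType == "uint16") return "i16";
    if (emType == "int32" || emType == "uint32") return "i32";
    if (emType == "int64" || emType == "uint64") return "i64";
    // These three names are spelled the same way in LLVM IR.
    if (emType == "float" || emType == "double" || emType == "void") return emType;
    return "";
}
```

Since the integer type itself is signless, a code generator has to choose signed or unsigned instructions (for example `sdiv` versus `udiv`) based on the EMLang type.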
### **Standard Library**
> [!Warning]
> The EMLang standard library is not available at the moment. It will be available in beta.
The library provides essential functionality across multiple domains:
- **I/O Operations**: `emlang_print_*`, `emlang_read_*`, console control
- **String Manipulation**: `emlang_strlen`, `emlang_strcmp`, case conversion
- **Mathematical Functions**: `emlang_pow`, `emlang_sqrt`, trigonometry
- **Memory Management**: `emlang_malloc`, `emlang_free`, `emlang_memset`
- **Utility Functions**: Array operations, bit manipulation, hashing
## 🔧 Building & Installation
### Prerequisites
- **CMake** 3.10 or higher
- **C++17** compatible compiler (GCC 9+, Clang 8+, MSVC 2017+)
- **LLVM** 14+ (automatically detected, enables code generation)
- **Git** for cloning and version control
### Build Process
#### Windows (Visual Studio)
```powershell
# Clone the repository
git clone https://github.com/ByaCherX/emlang.git
cd emlang
# Create build directory
mkdir build
cd build
# Configure with CMake
cmake .. -G "Visual Studio 17 2022" -A x64
# Build the project
cmake --build . --config Release
# Run tests
./Release/emlang_check.exe ../tests/simple_types_test.em
```
#### Linux/macOS
```bash
# Clone and build
git clone https://github.com/ByaCherX/emlang.git
cd emlang
# Create and enter build directory
mkdir build && cd build
# Configure and build
cmake .. -DCMAKE_BUILD_TYPE=Release
make -j"$(nproc 2>/dev/null || sysctl -n hw.ncpu)"  # nproc on Linux, sysctl on macOS
# Install (optional)
sudo make install
```
### 📦 Build Targets
- **`emlang`** - Main compiler executable
- **`emlang_check`** - AST and token analysis tool
- **`emlang_compiler`** - Compiler library (DLL/shared object)
- **`emlang_lib`** - Standard library (optional, requires LLVM)
## Usage & Examples
### Basic Compilation
```bash
# Compile EMLang source to object file
./emlang source.em
# Generate LLVM IR
./emlang --emit-llvm source.em -o output.ll
# Analyze source code structure
./emlang_check --ast --tokens source.em
```
### Language Examples
> [!Note]
> Pointers are experimental; define a CMake flag to enable them.
#### Complete Program
```emlang
extern function printf(format: str): int32;

function factorial(n: int32): int32 {
    if (n <= 1) {
        return 1;
    }
    return n * factorial(n - 1);
}

function main(): int32 {
    let number: int32 = 5;
    let result: int32 = factorial(number);
    // Note: printf is declared above with a single `format` parameter,
    // so this call supplies no values for the two %d placeholders.
    printf("Factorial of %d is %d\n");
    return 0;
}
```
#### Pointer Operations
```emlang
function swapIntegers(a: int32*, b: int32*): void {
    let temp: int32 = *a;
    *a = *b;
    *b = temp;
}

function main(): int32 {
    let x: int32 = 10;
    let y: int32 = 20;
    swapIntegers(&x, &y);
    // x is now 20, y is now 10
    return 0;
}
```
#### String and Character Processing
```emlang
function stringDemo(): void {
    let greeting: str = "Hello, 世界! 🌍";
    let firstChar: char = 'H';
    let unicode: char = '\u{1F30D}'; // 🌍
    // String escape sequences
    let formatted: str = "Line 1\nLine 2\t\"Quoted\"";
    let path: str = "C:\\Program Files\\EMLang";
}
```
### Analysis Tools
#### Token Analysis
```bash
$ ./emlang_check --tokens simple_test.em
=== TOKENS ===
FUNCTION function 1:1
IDENTIFIER main 1:10
LEFT_PAREN ( 1:14
RIGHT_PAREN ) 1:15
COLON : 1:16
INT32 int32 1:18
LEFT_BRACE { 1:24
...
```
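
To show how `line:column` positions like those above can be produced, here is a small C++ tokenizer sketch. It only covers the token kinds visible in the sample output. `Token`, `tokenize`, and the `NUMBER`, `RIGHT_BRACE`, and `UNKNOWN` kinds are assumptions for this example and are not taken from `emlang_check`.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

struct Token {
    std::string kind;    // e.g. "FUNCTION", "IDENTIFIER", "LEFT_PAREN"
    std::string lexeme;  // exact source text of the token
    int line;            // 1-based line number
    int column;          // 1-based column of the token's first character
};

// Splits `src` into tokens and records where each one starts.
// Columns are counted in bytes, so this sketch is ASCII-only.
std::vector<Token> tokenize(const std::string& src) {
    std::vector<Token> tokens;
    int line = 1;
    int column = 1;
    std::size_t i = 0;
    while (i < src.size()) {
        const char c = src[i];
        if (c == '\n') { ++line; column = 1; ++i; continue; }
        if (std::isspace(static_cast<unsigned char>(c))) { ++column; ++i; continue; }

        const std::size_t start = i;
        std::string kind;
        if (std::isalpha(static_cast<unsigned char>(c)) || c == '_') {
            // Identifier or keyword: letters, digits, and underscores.
            while (i < src.size() &&
                   (std::isalnum(static_cast<unsigned char>(src[i])) || src[i] == '_')) ++i;
            const std::string word = src.substr(start, i - start);
            kind = word == "function" ? "FUNCTION"
                 : word == "int32"    ? "INT32"
                                      : "IDENTIFIER";
        } else if (std::isdigit(static_cast<unsigned char>(c))) {
            while (i < src.size() && std::isdigit(static_cast<unsigned char>(src[i]))) ++i;
            kind = "NUMBER";
        } else {
            ++i;  // single-character punctuation
            switch (c) {
                case '(': kind = "LEFT_PAREN"; break;
                case ')': kind = "RIGHT_PAREN"; break;
                case '{': kind = "LEFT_BRACE"; break;
                case '}': kind = "RIGHT_BRACE"; break;
                case ':': kind = "COLON"; break;
                default:  kind = "UNKNOWN"; break;
            }
        }
        tokens.push_back({kind, src.substr(start, i - start), line, column});
        column += static_cast<int>(i - start);
    }
    return tokens;
}
```

Running `tokenize("function main(): int32 {")` gives seven tokens whose columns (1, 10, 14, 15, 16, 18, 24) agree with the sample output above.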
#### AST Visualization
```md
$ ./emlang_check --ast simple_test.em
=== AST ===
Program
  \-stmt0: ExternFunctionDecl name='emlang_print_int' return='void'
    \-param0: value: int32
  \-stmt1: ExternFunctionDecl name='emlang_print_str' return='void'
    \-param0: str_param: char*
  \-stmt2: ExternFunctionDecl name='emlang_println' return='void'
  \-stmt3: FunctionDecl name='main' return='int32'
    \-body: BlockStmt
      \-stmt0: VarDecl let x type='int32'
        \-init: LiteralExpr value='10'
      \-stmt1: ExprStmt
        \-expr: FunctionCallExpr name='emlang_print_str'
          \-arg0: LiteralExpr value='Initial x value: '
      \-stmt2: ExprStmt
        \-expr: FunctionCallExpr name='emlang_print_int'
          \-arg0: IdentifierExpr name='x'
      \-stmt3: ExprStmt
        \-expr: FunctionCallExpr name='emlang_println'
      \-stmt4: ExprStmt
        \-expr: AssignmentExpr
          \-target: IdentifierExpr name='x'
          \-value: LiteralExpr value='20'
      \-stmt5: ExprStmt
        \-expr: FunctionCallExpr name='emlang_print_str'
          \-arg0: LiteralExpr value='After assignment x = 20: '
      \-stmt6: ExprStmt
        \-expr: FunctionCallExpr name='emlang_print_int'
          \-arg0: IdentifierExpr name='x'
      \-stmt7: ExprStmt
        \-expr: FunctionCallExpr name='emlang_println'
      \-stmt8: ReturnStmt
        \-value: LiteralExpr value='0'
```
### 🚨 Error Reporting
EMLang provides comprehensive error diagnostics:
```
Semantic Error at 7:15 - Type mismatch: cannot assign 'str' to 'int32'
Semantic Error at 12:8 - Variable 'undeclared' not found in current scope
Semantic Error at 18:23 - Function 'unknown' is not declared
Parse Error at 25:1 - Expected ';' after expression statement
```
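
The messages above share one layout: error kind, `line:column`, then the description. A minimal C++ sketch of a formatter for that layout follows. `Diagnostic`, `DiagnosticKind`, and `formatDiagnostic` are hypothetical names, and the compiler's own diagnostic types may differ.

```cpp
#include <string>

enum class DiagnosticKind { Parse, Semantic };

struct Diagnostic {
    DiagnosticKind kind;
    int line;            // 1-based line of the error
    int column;          // 1-based column of the error
    std::string message; // human-readable description
};

// Renders a diagnostic as "<Kind> Error at <line>:<column> - <message>".
std::string formatDiagnostic(const Diagnostic& d) {
    const char* label =
        d.kind == DiagnosticKind::Parse ? "Parse Error" : "Semantic Error";
    return std::string(label) + " at " + std::to_string(d.line) + ":" +
           std::to_string(d.column) + " - " + d.message;
}
```

Keeping the position and the kind as structured fields means other output formats can be added later without touching the code that detects errors.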
## 🧪 Testing & Validation
### Test Suite Coverage
- **✅ Basic Types**: `simple_types_test.em`
- **✅ Unicode Characters**: `char_string_test.em`, `advanced_char_test.em`
- **✅ Pointer Operations**: `simple_pointer_test.em`, `multilevel_pointer_test.em`
- **✅ External Functions**: `extern_function_test.em`, `simple_extern_test.em`
- **✅ Integration**: `phase2_integration_test.em`, `phase3_pointer_test.em`
- **✅ Error Detection**: `error_test.em`
### Running Tests
```bash
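# Run all tests (adjust the path if you built a different configuration, e.g. Release)
cd tests
for test in *.em; do
    echo "Testing $test"
    ../build/Debug/emlang_check "$test"
done
cd ..
# Individual test analysis (run from the repository root)
./build/Debug/emlang_check --all tests/phase3_pointer_test.em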
```
## 🤝 Contributing
We welcome contributions to EMLang! Here's how you can help:
### **Bug Reports**
- Use GitHub Issues to report bugs
- Include minimal reproduction code
- Specify your operating system and compiler version
### ✨ **Feature Requests**
- Propose new language features through GitHub Issues
- Provide use cases and examples
- Consider implementation complexity
### 💻 **Code Contributions**
1. **Fork** the repository
2. **Create** a feature branch (`git checkout -b feature/amazing-feature`)
3. **Commit** your changes (`git commit -m 'Add amazing feature'`)
4. **Push** to the branch (`git push origin feature/amazing-feature`)
5. **Open** a Pull Request
### **Development Guidelines**
- Follow the existing code style and conventions
- Add tests for new features
- Update documentation for API changes
- Ensure all tests pass before submitting
### **Areas Needing Help**
- **Standard Library**: Expand mathematical and utility functions
- **Error Messages**: Improve error reporting and suggestions
- **Optimization**: LLVM optimization pass integration
- **Documentation**: API documentation and tutorials
- **Testing**: Additional test cases and edge cases
## License
EMLang is released under the **MIT License**. See the [LICENSE](LICENSE) file for details.
---
## Links & Resources
- **Issue Tracker**: [GitHub Issues](https://github.com/ByaCherX/emlang/issues)
- **💬 Discussions**: [GitHub Discussions](https://github.com/ByaCherX/emlang/discussions)
---
**Built with ❤️ using LLVM**
*EMLang - Where Performance Meets Productivity*
[⭐ Star us on GitHub](https://github.com/ByaCherX/emlang) | [🍴 Fork the Project](https://github.com/ByaCherX/emlang/fork) | [Report Issues](https://github.com/ByaCherX/emlang/issues)