{"id":22264331,"url":"https://github.com/0xpantera/halcyon","last_synced_at":"2025-07-03T22:05:11.346Z","repository":{"id":261624166,"uuid":"884201528","full_name":"0xpantera/halcyon","owner":"0xpantera","description":"Compiler for a subset of C written in Haskell","archived":false,"fork":false,"pushed_at":"2024-12-10T16:59:30.000Z","size":122,"stargazers_count":3,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-07-03T22:04:22.558Z","etag":null,"topics":["c","compilers","haskell","programming-languages"],"latest_commit_sha":null,"homepage":"","language":"Haskell","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"bsd-3-clause","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/0xpantera.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2024-11-06T10:32:37.000Z","updated_at":"2024-12-03T11:54:21.000Z","dependencies_parsed_at":"2025-06-26T02:05:37.838Z","dependency_job_id":"60df4c6c-d260-4a6f-814b-643bc848fb4f","html_url":"https://github.com/0xpantera/halcyon","commit_stats":null,"previous_names":["0xpantera/halcyon"],"tags_count":5,"template":false,"template_full_name":null,"purl":"pkg:github/0xpantera/halcyon","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/0xpantera%2Fhalcyon","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/0xpantera%2Fhalcyon/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/0xpantera%2Fhalcyon/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/0xpantera%2Fhalcyon/manifests","
owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/0xpantera","download_url":"https://codeload.github.com/0xpantera/halcyon/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/0xpantera%2Fhalcyon/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":263410760,"owners_count":23462296,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["c","compilers","haskell","programming-languages"],"created_at":"2024-12-03T10:08:32.858Z","updated_at":"2025-07-03T22:05:11.271Z","avatar_url":"https://github.com/0xpantera.png","language":"Haskell","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Halcyon: A C Compiler in Haskell\n\nHalcyon is a work-in-progress compiler for a large subset of C, written in Haskell. It targets the x86_64 instruction set architecture. This project focuses on implementing the core compiler functionality while leveraging existing system tools for preprocessing, assembly, and linking.\n\n## Current Status\n\nThe compiler currently handles C programs built from integer constants, unary operators, and binary arithmetic operators. 
For example:\n\n```c\nint main(void) {\n    return -((42 * 10) / (2 + 3));  // Returns -84\n}\n```\n\n### Compilation Pipeline\n\nThe compiler is organized into several major subsystems:\n\n- **Frontend** - Parsing and analysis\n  - Lexical analysis (breaks source into tokens)\n  - Parsing (converts tokens to AST)\n  - Token definitions\n\n- **Core** - Core data types and compiler infrastructure\n  - AST, Assembly, and TACKY intermediate representations\n  - Compiler monad and error handling\n\n- **Backend** - Code generation\n  - TACKY to assembly conversion\n  - Register allocation\n  - Assembly output\n\n- **Driver** - Pipeline coordination\n  - Command line interface\n  - Compilation pipeline stages\n  - External tool integration (GCC for preprocessing and linking)\n\nEach subsystem is organized as a hierarchical module that provides a clean interface to its functionality while hiding implementation details.\n\n### Internal Representations\n\nPrograms are represented internally using a series of increasingly lower-level data structures:\n\n1. **Abstract Syntax Tree (AST)**:\n  ```haskell\n  data Program = Program Function\n  data Function = Function Text Statement\n  data Statement = Return Expr\n  data Expr\n    = Constant Int\n    | Unary UnaryOp Expr\n    | Binary BinaryOp Expr Expr\n  data UnaryOp = Complement | Negate\n  data BinaryOp = Add | Subtract | Multiply | Divide | Remainder\n  ```\n\n2. **TACKY IR**:\n  ```haskell\n  data Program = Program Function\n  data Function = Function Text [Instruction]\n  data Instruction\n    = Return Val\n    | Unary UnaryOp Val Val\n    | Binary BinaryOp Val Val Val\n  data Val = Constant Int | Var Text\n  data UnaryOp = Complement | Negate\n  data BinaryOp = Add | Subtract | Multiply | Divide | Remainder\n  ```\n\n3. 
**Assembly AST**:\n  ```haskell\n  data Program = Program Function\n  data Function = Function Text [Instruction]\n  data Instruction\n    = Mov Operand Operand\n    | Unary UnaryOp Operand\n    | Binary BinaryOp Operand Operand\n    | Idiv Operand\n    | Cdq\n    | AllocateStack Int\n    | Ret\n  data Operand \n    = Imm Int \n    | Register Reg\n    | Pseudo Text\n    | Stack Int\n  data UnaryOp = Neg | Not\n  data BinaryOp = Add | Sub | Mult\n  data Reg = Ax | DX | R10 | R11\n  ```\n\n\n## Project Structure\n\n  ```\n  .\n  ├── app/                           # Application entry point\n  │   └── Main.hs\n  ├── bin/                           # Binary outputs\n  ├── lib/                           # Main library code\n  │   ├── Halcyon.hs                 # Library entry point\n  │   └── Halcyon/                   # Core modules\n  │       ├── Backend.hs             # Backend subsystem interface\n  │       ├── Backend/               # Code generation and emission\n  │       │   ├── Codegen.hs         # TACKY to Assembly conversion\n  │       │   ├── Emit.hs            # Assembly to text output\n  │       │   └── ReplacePseudos.hs  # Register/stack allocation\n  │       ├── Core.hs                # Core subsystem interface\n  │       ├── Core/                  # Core data types and utilities\n  │       │   ├── Assembly.hs        # Assembly representation\n  │       │   ├── Ast.hs             # C language AST\n  │       │   ├── Monad.hs           # Compiler monad stack\n  │       │   ├── Settings.hs        # Compiler settings and types\n  │       │   ├── Tacky.hs           # TACKY IR definition\n  │       │   └── TackyGen.hs        # AST to TACKY transformation\n  │       ├── Driver.hs              # Driver subsystem interface\n  │       ├── Driver/                # Compiler driver\n  │       │   ├── Cli.hs             # Command line interface\n  │       │   └── Pipeline.hs        # Compilation pipeline\n  │       ├── Frontend.hs            # Frontend subsystem 
interface\n  │       └── Frontend/              # Parsing and analysis\n  │           ├── Lexer.hs           # Lexical analysis\n  │           ├── Parse.hs           # Parsing\n  │           └── Tokens.hs          # Token definitions\n  ├── test/                          # Test suite\n  │   ├── Main.hs\n  │   └── Test/\n  │       ├── Lexer.hs\n  │       ├── Parser.hs\n  │       ├── Tacky.hs\n  │       ├── Assembly.hs\n  │       ├── Pipeline.hs\n  │       └── Common.hs\n  ├── CHANGELOG.md                   # Version history\n  ├── LICENSE                        # Project license\n  ├── README.md                      # Project documentation\n  ├── flake.nix                      # Nix build configuration\n  └── halcyon.cabal                  # Cabal build configuration\n  ```\n\n### Architecture\n\nThe compiler uses a monad transformer stack to handle IO operations and error management:\n\n```haskell\nnewtype CompilerT m a = CompilerT \n  { unCompilerT :: ExceptT CompilerError m a }\n\ntype Compiler = CompilerT IO\n```\n\nThis provides:\n- Error handling through `ExceptT`\n- IO capabilities through the underlying monad\n- Clean separation of pure and effectful code\n- Structured error reporting and recovery\n\n## Command Line Interface\n\n```bash\nhalcyon [OPTIONS] FILE\n\nOptions:\n  --lex                 Run lexical analysis only\n  --parse               Run parsing only\n  --codegen             Run through code generation\n  --tacky               Run through TACKY generation\n  -S                    Stop after assembly generation\n  -h,--help             Show help text\n```\n\n### Build and Run\n\n```bash\n# Build the project\ncabal build\n\n# Run the compiler\ncabal run halcyon -- [OPTIONS] input.c\n\n# Example: Compile a file\ncabal run halcyon -- input.c\n\n# Example: Run only the lexer\ncabal run halcyon -- --lex input.c\n```\n\n## Testing\n\nHalcyon uses Hspec and Tasty for its test suite. 
The tests cover all stages of compilation:\n\n```bash\n# Run all tests\ncabal test\n\n# Run tests with output\ncabal test --test-show-details=direct\n\n# Run tests matching a pattern (passed through to the Tasty runner)\ncabal test --test-options=\"-p Lexer\"\n```\n\nThe test suite includes:\n\n- Unit tests for each compiler stage\n- Integration tests for the full pipeline\n- Helper utilities for building test cases\n\nTests are organized by compiler stage in `test/Test/`:\n\n- `Lexer.hs`: Token generation\n- `Parser.hs`: AST construction\n- `Tacky.hs`: TACKY IR generation\n- `Assembly.hs`: Assembly generation\n- `Pipeline.hs`: Full compilation pipeline\n- `Common.hs`: Shared test utilities\n\n\n## External Dependencies\n\nHalcyon relies on the following system tools:\n- **GCC**: For preprocessing C source files (`gcc -E`)\n- **Assembler**: For converting assembly to object files\n- **Linker**: For producing final executables\n\nMake sure these tools are installed and available in your system path.\n\n## Error Handling\n\nThe compiler provides detailed error reporting for:\n- Lexical errors (invalid characters, malformed numbers)\n- Syntax errors (invalid program structure)\n- Semantic errors (coming soon)\n- System errors (file I/O, external tool failures)\n\n## Future Plans\n\n### The Basics\n- [x] A minimal compiler\n- [x] Unary operators\n- [x] Binary operators\n- [ ] Logical and relational operators\n- [ ] Local variables\n- [ ] if statements and conditional expressions\n- [ ] Compound statements\n- [ ] Loops\n- [ ] Functions\n- [ ] File scope variable declarations and storage-class specifiers\n\n### Types Beyond Int\n- [ ] Long integers\n- [ ] Unsigned integers\n- [ ] Floating-point numbers\n- [ ] Pointers\n- [ ] Arrays and pointer arithmetic\n- [ ] Characters and strings\n- [ ] Supporting dynamic memory\n- [ ] Structures\n\n### Optimizations\n- [ ] Optimizing TACKY programs\n- [ ] Register allocation\n\n## Contributing\n\nThis is a personal learning project following the book \"Writing a C 
Compiler\" by Nora Sandler. While it's not currently open for contributions, feel free to use it as a reference for your own compiler projects.","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2F0xpantera%2Fhalcyon","html_url":"https://awesome.ecosyste.ms/projects/github.com%2F0xpantera%2Fhalcyon","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2F0xpantera%2Fhalcyon/lists"}