{"id":42985009,"url":"https://github.com/bkomuves/nanohs","last_synced_at":"2026-01-31T02:16:46.077Z","repository":{"id":117072424,"uuid":"333239183","full_name":"bkomuves/nanohs","owner":"bkomuves","description":"a self-hosting lambda calculus compiler","archived":false,"fork":false,"pushed_at":"2025-03-31T19:01:20.000Z","size":877,"stargazers_count":35,"open_issues_count":0,"forks_count":3,"subscribers_count":2,"default_branch":"master","last_synced_at":"2025-03-31T20:21:41.635Z","etag":null,"topics":["compiler","haskell","programming-language"],"latest_commit_sha":null,"homepage":"","language":"Haskell","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"bsd-3-clause","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/bkomuves.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2021-01-26T22:53:01.000Z","updated_at":"2025-03-31T19:01:24.000Z","dependencies_parsed_at":null,"dependency_job_id":"1b39723f-69b3-44c7-836b-7f8057447b93","html_url":"https://github.com/bkomuves/nanohs","commit_stats":null,"previous_names":[],"tags_count":1,"template":false,"template_full_name":null,"purl":"pkg:github/bkomuves/nanohs","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bkomuves%2Fnanohs","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bkomuves%2Fnanohs/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bkomuves%2Fnanohs/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bkomuves%2Fnanohs/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/bkomuves","download_url":"h
ttps://codeload.github.com/bkomuves/nanohs/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bkomuves%2Fnanohs/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":28926651,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-01-30T22:32:35.345Z","status":"online","status_checked_at":"2026-01-31T02:00:09.179Z","response_time":128,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["compiler","haskell","programming-language"],"created_at":"2026-01-31T02:16:45.956Z","updated_at":"2026-01-31T02:16:46.065Z","avatar_url":"https://github.com/bkomuves.png","language":"Haskell","funding_links":[],"categories":[],"sub_categories":[],"readme":"\nNanoHaskell: a self-hosting lambda calculus compiler\n====================================================\n\nThe goal of this experiment is to create a self-hosting lambda calculus\ncompiler (and interpreter) in a minimal amount of Haskell-style code.\n\nThe language is (strict) lambda calculus + data constructors + simple\npattern matching + recursive lets + IO effects. 
The syntax is chosen so that \na program can also be a valid Haskell program at the same time (this makes \ndevelopment much easier).\n\nHaskell features like type signatures, data type declarations and imports\nare parsed (well, recognized...), but then ignored.\n\n\nCurrent status\n--------------\n\n* it compiles via GHC, both with and without optimizations\n* it self-hosts, both with and without optimizations\n* it needs a large C stack (32+ MB) + GCC optimizations (because of the lack of tail call elimination)\n* source code: about 2000 \"essential\" lines + 560 lines of type annotations; the C runtime is \\~650 lines\n  (including some debugging features)\n* the interpreter is not working 100% correctly at the moment\n\n\nUsage\n-----\n\n    $ nanohs -c input.hs output.c            # compile with optimizations disabled\n    $ nanohs -o input.hs output.c            # compile with optimizations enabled\n    $ nanohs -i input.hs [arg1 [arg2 ...]]   # interpret\n\nOr you can just use `runghc`:\n\n    $ runghc Nano.hs -c examples/church.nano tmp.c ; gcc tmp.c ; ./a.out\n\n\n### Imports\n\nHaskell imports are ignored, but you can use C-style includes with the pragma:\n\n    {-% include \"othermodule.hs\" %-}\n\n\nThe surface language\n--------------------\n\nThe idea is to use a subset of Haskell syntax, so that the same\nprogram can also be compiled / interpreted by GHC.\n\n* no static type system (untyped lambda calculus) - but maybe there should be a type checker after all?\n* no data type declarations (constructors are arbitrary capitalized names)\n* no module system - instead, C-style includes\n* strict language (if-then-else must be lazy though; `and` / `or` short-circuit too)\n* ML-style side effects (but only used for IO, which is then wrapped into a monad)\n* only simple pattern matching + default branch (TODO: nested patterns)\n* no infix operators\n* list construction syntax `[a,b,c]` is supported\n* no indentation syntax (only curly braces), except for top-level 
blocks\n* only line comments, starting at the beginning of the line\n* built-in data types: `Int`, `Char`, `Bool`, `List`, `Maybe`, etc. - those required by the primops\n* universal polymorphic equality comparison primop (?)\n* no escaping in character / string constants (TODO: maybe escaping is worth supporting?)\n* basic IO: standard input / output, basic file handling, early exit, command line arguments \n\nWe can have the same source files accepted by both GHC and\nnanohs itself by recognizing and ignoring GHC-specific lines (pragmas, imports,\ntype signatures, datatype declarations, type synonyms). We just put\nthe primop implementations into a PrimGHC module (as imports are ignored).\n\nWe could in theory have several backends:\n\n* C99 on 64 bit architectures\n* TODO: x86-64 assembly\n* TODO: some simple bytecode virtual machine\n\nFrom a bootstrapping point of view, it seems useful to have a very simple virtual \nmachine, for which an interpreter can be very easily written in C or any other \nlanguage.\n\n\nCompilation pipeline\n--------------------\n\n1. lexer\n2. parser\n3. partition recursive lets using dependency analysis\n4. recognize primops\n5. TODO: eliminate pattern matching into simple branching on constructors\n6. collect data constructors\n7. scope checking \u0026 conversion to core language\n8. inline small functions + beta reduce + eliminate unused lets\n9. closure conversion\n10. TODO: compile to some low-level intermediate language\n11. final code generation (TODO: different backends)\n\n\nRuntime system\n--------------\n\nThere is an \"environment\" stack separate from the CPU stack. This makes it\nvery easy to find GC roots: just walk through the stack. 
The stack contains\npointers to the heap (with the optimization that small heap objects, fitting\ninto 61 bits, are not actually allocated).\n\nOn the heap there are only two kinds of objects, closures and data constructors:\n \n* data constructor (needs: tag + arity)\n* closure / partial application (needs: static function pointer / index, \n  number of applied arguments, number of remaining arguments)\n\nHeap pointers are also tagged:\n\n* normal heap pointer (closure or data constructor)\n* 61 bit literals (int / char)\n* nullary constructors\n* static function pointers\n* foreign pointers (used for file handles in the C runtime)\n\nThere are no thunks on the heap because the language is strict.\n\nThe garbage collector is a very simple copying (compacting) GC.\n\n\nImplementation details\n----------------------\n\nThere are some minor tricks you should be aware of if you try to read the code.\n\n### Argument order\n\nThe order of function arguments on the stack, the captured variables in closures \nand also the order of constructor arguments on the heap are all reversed compared to \nthe \"logical\" (source code) order. This makes the implementation of application\nmuch simpler.\n\n    [ Cons_tag    argN ... arg2 arg1 ]                   # data constructor heap object\n    [ Closure_tag envK ... env2 env1 ]                   # closure heap object \n    [ ... | argN ... arg1 envK ... 
env1 | undefined ]    # stack when calling a static function\n          ^ BP                          ^ SP\n\nNote: our stack grows \"upwards\" (unlike the CPU stack which grows \"downwards\").\n\n### IO monad\n\nThere is an IO monad, which in the GHC runtime and the interpreted runtime is\nthe host's IO monad, while in the compiled code it is encoded with functions\nhaving side effects:\n\n    type IO a = ActionToken -\u003e a\n\nYou need to begin your `main` function with an explicit `runIO` call (this is\nuseful while debugging, as main can be just a simple expression instead).\n\n\nOrganization of source code\n---------------------------\n\n    Base.hs          - base library / prelude\n    Closure.hs       - closure conversion\n    CodeGen.hs       - code generation\n    Containers.hs    - container data structures \n    Core.hs          - core language\n    DataCon.hs       - data constructors\n    Dependency.hs    - reordering lets using the dependency graph\n    Eval.hs          - interpreter\n    Inliner.hs       - inliner + basic optimizations\n    Nano.hs          - main executable\n    Parser.hs        - parser\n    PrimGHC.hs       - the primops implemented in Haskell (so that GHC can host it) \n    PrimOps.hs       - primops\n    ScopeCheck.hs    - scope checking + conversion to core\n    Syntax.hs        - surface syntax\n    Types.hs         - common types\n    rts.c            - the runtime system implemented in C\n    bootstrap.sh     - shell script to bootstrap the compiler\n    sloc_count.sh    - shell script to measure source code size\n \n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbkomuves%2Fnanohs","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fbkomuves%2Fnanohs","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbkomuves%2Fnanohs/lists"}