{"id":21339051,"url":"https://github.com/capsadmin/nattlua","last_synced_at":"2025-04-07T13:06:56.875Z","repository":{"id":38885378,"uuid":"192292560","full_name":"CapsAdmin/NattLua","owner":"CapsAdmin","description":"luajit with a typesystem","archived":false,"fork":false,"pushed_at":"2025-03-26T03:52:25.000Z","size":9423,"stargazers_count":96,"open_issues_count":4,"forks_count":3,"subscribers_count":6,"default_branch":"master","last_synced_at":"2025-03-31T11:06:29.662Z","etag":null,"topics":["analyzer","lua","luajit","typesystem"],"latest_commit_sha":null,"homepage":"","language":"Lua","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/CapsAdmin.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2019-06-17T07:02:44.000Z","updated_at":"2025-03-26T03:52:29.000Z","dependencies_parsed_at":"2024-05-18T01:21:47.308Z","dependency_job_id":"7c274e9f-b1ce-4cae-89d2-f11f9199905b","html_url":"https://github.com/CapsAdmin/NattLua","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/CapsAdmin%2FNattLua","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/CapsAdmin%2FNattLua/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/CapsAdmin%2FNattLua/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/CapsAdmin%2FNattLua/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/CapsAdmin","download_url":"https://codeload.github.com/CapsAdmin
/NattLua/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247657281,"owners_count":20974345,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["analyzer","lua","luajit","typesystem"],"created_at":"2024-11-22T00:41:55.674Z","updated_at":"2025-04-07T13:06:56.857Z","avatar_url":"https://github.com/CapsAdmin.png","language":"Lua","readme":"# About\n\nNattLua is a superset of LuaJIT that introduces a typesystem. The typesystem aims to provide, by default, precise code analysis, while allowing you to optionally constrain variables with types.\n\nThe typesystem itself follows the same philosophy and feel as Lua; built on simple primitives that can be extended with type functions.\n\nThere is a [playground](https://capsadmin.github.io/NattLua/) you can try. 
It supports hover type information and other diagnostics.\n\nComplex type structures, such as array-like tables, map-like tables, metatables, and more are supported:\n\n```lua\nlocal list: {[number] = string | nil} = {} -- -1 index is allowed\nlocal list: {[number] = string} | {} = {} -- same as the above, but expressed differently\nlocal list: {[1..inf] = string | nil} = {} -- only 1..inf index is allowed\n\nlocal map: {[string] = string | nil} = {} -- any string index is allowed\nlocal map: {foo = string, bar = string} = {foo = \"hello\", bar = \"world\"} -- only foo and bar are allowed as keys, but the values can be of any string type\n\nlocal a = \"fo\" -- a is literally \"fo\", and not string, because we don't specify a contract\nlocal b = string.char(string.byte(\"o\")) -- these are type functions that take in literal and non-literal types\nlocal map = {}\nmap[a..b] = \"hello\"\n-- this print call is a typesystem call, so it will be omitted when transpiling back to LuaJIT\nprint\u003c|map|\u003e -- \u003e\u003e {foo = \"hello\"}\n```\n\n```lua\nlocal Vec3 = {}\nVec3.__index = Vec3\n\n-- give the type a friendly name for diagnostics\ntype Vec3.@Name = \"Vector\"\n\n-- define the type of the first argument in setmetatable\ntype Vec3.@Self = {\n    x = number,\n    y = number,\n    z = number,\n}\n\nfunction Vec3.__add(a: Vec3, b: Vec3)\n    return Vec3(a.x + b.x, a.y + b.y, a.z + b.z)\nend\n\nsetmetatable(Vec3, {\n    __call = function(_, x: number, y: number, z: number)\n        return setmetatable({x=x,y=y,z=z}, Vec3)\n    end\n})\n\nlocal new_vector = Vec3(1,2,3) + Vec3(100,100,100) -- OK\n```\n\nIt aims to be compatible with LuaJIT as a first-class citizen, but also with Lua 5.1, 5.2, 5.3, 5.4 and Garry's Mod Lua (a variant of Lua 5.1).\n\nThe `build_output.lua` file is a bundle of this project that can be required in your project. It should also work in Garry's Mod.\n\n# Code analysis and typesystem\n\nThe analyzer works by evaluating the syntax tree. 
It runs similarly to how Lua runs, but on a more general level, and can take multiple branches if it's not sure about `if` conditions, loops and so on. If everything is known about a program and you didn't add any types, you may get the actual output at type-check time.\n\n```lua\nlocal cfg = [[\n    name=Lua\n    cycle=123\n    debug=yes\n]]\n\nlocal function parse(str: ref string)\n    local tbl = {}\n    for key, val in str:gmatch(\"(%S-)=(.-)\\n\") do\n        tbl[key] = val\n    end\n    return tbl\nend\n\nlocal tbl = parse(cfg)\nprint\u003c|tbl|\u003e\n\u003e\u003e\n--[[\n{\n    \"name\" = \"Lua\",\n    \"cycle\" = \"123\",\n    \"debug\" = \"yes\"\n}\n]]\n```\n\nThe `ref` keyword means that the `cfg` variable should be passed in as a type reference. This is similar to how type arguments in a generic function are passed to the function itself. If we removed the `ref` keyword, the output of the function would be inferred to be `{ string = string }` because `str` would become a non-literal string.\n\nWe can also add a return type to `parse` by writing `parse(str: ref string): {[string] = string}`, but if you don't, it will be inferred.\n\nWhen the analyzer detects an error, it will try to recover from the error and continue. For example:\n\n```lua\nlocal obj: nil | (function(): number)\nlocal x = obj()\nlocal y = x + 1\n```\n\nThis code will report an error about potentially calling a nil value. Internally the analyzer would duplicate the current state, remove nil from the union `nil | (function(): number)` and continue.\n\n# Current status and goals\n\nMy long-term goal is to develop a capable language to use for my other projects (such as [goluwa](https://github.com/CapsAdmin/goluwa)).\n\nAt the moment I focus strongly on type inference correctness, adding tests and keeping the codebase maintainable.\n\nI'm also in the middle of bootstrapping the project with comment types. 
So far the lexer part of the project and some other parts are typed and are part of the test suite.\n\n# Types\n\nFundamentally the typesystem consists of number, string, table, function, symbol, union, tuple and any. Tuples and unions exist only in the typesystem. Symbols are things like true, false, nil, etc.\n\nThese types can also be literals, so as a showcase example we can describe the fundamental types like this:\n\n```lua\nlocal type Boolean = true | false\nlocal type Number = -inf .. inf | nan\nlocal type String = $\".*\"\nlocal type Any = Number | Boolean | String | nil\n\n-- nil cannot be a key in tables\nlocal type Table = { [exclude\u003c|Any, nil|\u003e | self] = Any | self }\n\n-- extend the Any type to also include Table\ntype Any = Any | Table\n\n-- CurrentType is a type function that lets us get the reference to the current type we're constructing\nlocal type Function = function=(...Any | CurrentType\u003c|\"function\"|\u003e)\u003e(...Any | CurrentType\u003c|\"function\"|\u003e)\n\n-- extend the Any type to also include Function\ntype Any = Any | Function\n```\n\nSo here all the PascalCase types should have semantically the same meaning as their lowercase counterparts.\n\n# Numbers\n\nFrom narrow to wide:\n\n```lua\ntype N = 1\n\nlocal foo: N = 1\nlocal foo: N = 2\n      ^^^: 2 is not a subset of 1\n```\n\n```lua\ntype N = 1 .. 10\n\nlocal foo: N = 1\nlocal foo: N = 4\nlocal foo: N = 11\n      ^^^: 11 is not a subset of 1 .. 10\n```\n\n```lua\ntype N = 1 .. inf\n\nlocal foo: N = 1\nlocal bar: N = 2\nlocal faz: N = -1\n      ^^^: -1 is not a subset of 1 .. inf\n```\n\n```lua\ntype N = -inf .. inf\n\nlocal foo: N = 0\nlocal bar: N = 200\nlocal faz: N = -10\nlocal qux: N = 0/0\n      ^^^: nan is not a subset of -inf .. inf\n```\n\nThe logical progression is to define N as `-inf .. 
inf | nan` but that has semantically the same meaning as `number`.\n\n# Strings\n\nStrings can be defined as Lua string patterns to constrain them:\n\n```lua\nlocal type MyString = $\"FOO_.-\"\n\nlocal a: MyString = \"FOO_BAR\"\nlocal b: MyString = \"lol\"\n                    ^^^^^ : the pattern failed to match\n```\n\nA narrow value:\n\n```lua\ntype foo = \"foo\"\n```\n\nOr wide:\n\n```lua\ntype foo = string\n```\n\n`$\".-\"` is semantically the same as `string`.\n\n# Tables\n\nTables are similar to Lua tables, where keys and values can be of any type.\n\nThe only special syntax is `self`, which is used for self-referencing types.\n\nHere are some natural ways to define a table:\n\n```lua\nlocal type MyTable = {\n    foo = boolean,\n    bar = string,\n}\n\nlocal type MyTable = {\n    [\"foo\"] = boolean,\n    [number] = string,\n}\n\nlocal type MyTable = {\n    [\"foo\"] = boolean,\n    [number] = string,\n    faz = {\n        [any] = any\n    }\n}\n```\n\n# Unions\n\nA union is a set of types separated by `|`. I feel these tend to show up in uncertain conditions.\n\nFor example this case:\n\n```lua\nlocal x = 0\n-- x is 0 here\n\nif math.random() \u003e 0.5 then\n    -- x is 0 here\n    x = 1\n    -- x is 1 here\nelse\n    -- x is 0 here\n    x = 2\n    -- x is 2 here\nend\n\n-- x is 1 | 2 here\n```\n\nThis happens because `math.random()` returns `number` and `number \u003e 0.5` is `true | false`.\n\nOne of these if blocks must execute, so that's why we end up with `1 | 2` instead of `0 | 1 | 2`.\n\n```lua\nlocal x = 0\n-- x is 0 here\nif true then\n    x = 1\n    -- x is 1 here\nend\n-- x is still 1 here because the mutation = 1 occurred in a certain branch\n```\n\nThis happens because `true` is true as opposed to `true | false` and so there's no uncertainty in executing the if block.\n\n# Analyzer functions\n\nAnalyzer functions help us bind advanced type functions to the analyzer. 
We can, for example, define `math.floor` and a `print` function like this:\n\n```lua\nanalyzer function print(...)\n    print(...)\nend\n\nanalyzer function math.floor(T: number)\n    if T:IsLiteral() then\n        return types.Number(math.floor(T:GetData()))\n    end\n\n    return types.Number()\nend\n\nlocal x = math.floor(5.5)\nprint\u003c|x|\u003e\n--\u003e\u003e 5\n```\n\nWhen transpiled to Lua, the result is just:\n\n```lua\nlocal x = math.floor(5.5)\n```\n\nSo analyzer functions only exist when analyzing. The bodies of these functions are not analyzed like the rest of the code. For example, if this project were written in Python, the contents of the analyzer functions would be written in Python as well.\n\nThey exist to provide a way to define advanced custom types and functions that cannot easily be made into a normal type function.\n\n# Type functions\n\nType functions are the recommended way to write functions that operate on types. We can define an assertion function like this:\n\n```lua\nlocal function assert_whole_number\u003c|T: number|\u003e\n    assert(math.floor(T) == T, \"Expected whole number\")\nend\n\nlocal x = assert_whole_number\u003c|5.5|\u003e\n          ^^^^^^^^^^^^^^^^^^^\nExpected whole number\n```\n\n`\u003c|` `|\u003e` here means that we are writing a type function that only exists in the type system. Unlike `analyzer` functions, its content is actually analyzed.\n\nWhen the code above is transpiled to Lua, the result is still just:\n\n```lua\nlocal x = 5.5\n```\n\n`\u003c|a,b,c|\u003e` is the way to call type functions. In other languages it tends to be `\u003ca,b,c\u003e` but I chose this syntax to avoid conflicts with the `\u003c` and `\u003e` comparison operators. 
This syntax may change in the future.\n\n```lua\nlocal function Array\u003c|T: any, L: number|\u003e\n    return {[1..L] = T}\nend\n\nlocal list: Array\u003c|number, 3|\u003e = {1, 2, 3, 4}\n                                 ^^^^^^^^^^^^: 4 is not a subset of 1..3\n```\n\nIn type functions, the type is by default passed by reference. So `T: any` does not mean that T will be `any`. It just means that T is allowed to be anything.\n\nIn TypeScript it would be something like:\n\n```ts\ntype Array\u003cT extends any, length extends number\u003e = {[key: 1..length]: T} // assuming TypeScript supports number ranges\n```\n\nType function arguments always need to be explicitly typed.\n\n# More examples\n\n## List type\n\n```lua\nfunction List\u003c|T: any|\u003e\n\treturn {[1..inf] = T | nil}\nend\n\nlocal names: List\u003c|string|\u003e = {} -- the | nil above is required to allow nil values, or an empty table in this case\nnames[1] = \"foo\"\nnames[2] = \"bar\"\nnames[-1] = \"faz\"\n^^^^^^^^^: -1 is not a subset of 1 .. inf\n```\n\n## ffi.cdef parse errors to type errors\n\nffi functions including cdef are already typed, but to showcase how we might throw parsing errors to the type system we can do the following:\n\n```lua\nanalyzer function ffi.cdef(c_declaration: string)\n    -- this requires using analyzer functions\n\n    if c_declaration:IsLiteral() then\n        local ffi = require(\"ffi\")\n        ffi.cdef(c_declaration:GetData()) -- if this function throws it's propagated up to the compiler as an error\n    end\nend\n\nffi.cdef(\"bad c declaration\")\n```\n\n```lua\n4 | d\n5 | end\n6 |\n8 | ffi.cdef(\"bad c declaration\")\n    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n-\u003e | test.lua:8:0 : declaration specifier expected near 'bad'\n```\n\n## `load` evaluation\n\n```lua\nlocal function build_summary_function(tbl)\n    local lua = {}\n    table.insert(lua, \"local sum = 0\")\n    table.insert(lua, \"for i = \" .. tbl.init .. \", \" .. tbl.max .. 
\" do\")\n    table.insert(lua, tbl.body)\n    table.insert(lua, \"end\")\n    table.insert(lua, \"return sum\")\n    return load(table.concat(lua, \"\\n\"), tbl.name)\nend\n\nlocal func = build_summary_function({\n    name = \"myfunc\",\n    init = 1,\n    max = 10,\n    body = \"sum = sum + i !!ManuallyInsertedSyntaxError!!\"\n})\n```\n\n```lua\n----------------------------------------------------------------------------------------------------\n    4 | )\n    5 |  table.insert(lua, \"end\")\n    6 |  table.insert(lua, \"return sum\")\n    8 |  return load(table.concat(lua, \"\\n\"))\n                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n    9 | end\n10 |\n----------------------------------------------------------------------------------------------------\n-\u003e | test.lua:8:8\n    ----------------------------------------------------------------------------------------------------\n    1 | local sum = 0\n    2 | for i = 1, 10 do\n    3 | sum = sum + i !!ManuallyInsertedSyntaxError!!\n                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n    4 | end\n    5 | return sum\n    ----------------------------------------------------------------------------------------------------\n    -\u003e | myfunc:3:14 : expected assignment or call expression got ❲symbol❳ (❲!❳)\n```\n\nThis works because there is no uncertainty about the code generated passed to the load function. If we did `body = \"sum = sum + 1\" .. 
(unknown_global as string)`, that would make the table itself become uncertain so that table.concat would return `string` and not the actual results of the concatenation.\n\n## anagram proof\n\n```lua\nlocal bytes = {}\nfor i,v in ipairs({\n    \"P\", \"S\", \"E\", \"L\", \"E\",\n}) do\n    bytes[i] = string.byte(v)\nend\nlocal all_letters = _ as bytes[number] ~ nil -- remove nil from the union\nlocal anagram = string.char(all_letters, all_letters, all_letters, all_letters, all_letters)\n\nprint\u003c|anagram|\u003e -- \u003e\u003e \"EEEEE\" | \"EEEEL\" | \"EEEEP\" | \"EEEES\" | \"EEELE\" | \"EEELL\" | ...\nassert(anagram == \"SLEEP\")\nprint\u003c|anagram|\u003e -- \u003e\u003e \"SLEEP\"\n```\n\nThis is true because `anagram` becomes a union of all possible letter combinations which contains the string \"SLEEP\".\n\nIt's also false as it contains all the other combinations, but since we use assert to check the result at runtime, it will silently \"error\" and mutate anagram to become \"SLEEP\" after the assertion.\n\nIf we did `assert\u003c|anagram == \"SLEEP\"|\u003e` (a type call) it would error, because the typesystem operates more literally.\n\n# Parsing and transpiling\n\nAs a learning experience I wrote the lexer and parser trying not to look at existing Lua parsers, but this makes it different in some ways. The syntax errors it can report are not standard and are a bit more detailed. It's also written in a way to be easily extendable for new syntax.\n\n- Syntax errors can be nicer than those of standard Lua parsers. 
Errors are reported with character ranges.\n- The lexer and parser can continue after encountering an error, which is useful for editor integration.\n- Whitespace can be preserved if needed.\n- Both single-line C comments (from GLua) and the Lua 5.4 division operator can be used in the same source file.\n- Transpiles bitwise operators, integer division, \\_ENV, etc. down to valid LuaJIT code.\n- Supports inline importing via require, loadfile, and dofile.\n- Supports Teal syntax, but does not currently support its scoping rules.\n\nI have not fully decided on the syntax for the language or the runtime semantics for Lua 5.3/5.4 features. But I feel this is more of a detail that can easily be changed later.\n\n# Development\n\nTo run tests run `luajit nattlua.lua test`\n\nTo build run `luajit nattlua.lua build`\n\nTo format the codebase with NattLua run `luajit nattlua.lua fmt`\n\nTo build the VS Code extension run `luajit nattlua.lua build-vscode`\n\nTo install run `luajit nattlua.lua install`\n\nIf you install, you get the binary `nattlua`, which behaves the same as `luajit nattlua.lua ...`\n\nI've set up VS Code to run the task `onsave` when a file is saved with the plugin `gruntfuggly.triggertaskonsave`. This runs `on_editor_save.lua`, which has some logic to choose which files to run when modifying the project.\n\nI also locally have a file called `test_focus.nlua` in the root which will override the test suite when the file is not empty. This makes it easier to debug specific tests and code.\n\nSome debug language features are:\n\n`§` followed by Lua code. This invokes the analyzer so you can inspect or modify its state.\n\n```lua\nlocal x = 1337\n§print(env.runtime.x:GetUpvalue())\n§print(analyzer:GetScope())\n```\n\n`£` followed by Lua code. 
This invokes the parser so you can inspect or modify its state.\n\n```lua\nlocal x = 1337\n£print(parser.current_statement)\n```\n\n# Similar projects\n\n[Teal](https://github.com/teal-language/tl) is a similar language that takes a more pragmatic approach. I'm thinking a nice goal is that I can contribute what I've learned here, be it through tests or other things.\n\n[Luau](https://github.com/Roblox/luau) is another project similar to this, but I have not looked into it much yet.\n\n[sumneko lua](https://github.com/sumneko/lua-language-server) is a language server for Lua that supports analyzing Lua code. It has a typesystem that can be controlled by using comments.\n\n[EmmyLua](https://github.com/EmmyLua/VSCode-EmmyLua) is similar to sumneko lua.\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcapsadmin%2Fnattlua","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fcapsadmin%2Fnattlua","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcapsadmin%2Fnattlua/lists"}